Bioequivalence and Bioavailability Forum • sampleN.TOST vs. sampleN.scABEL

BEQool
★

2024-01-29 12:53
(526 d 08:45 ago)

Posting: # 23845
Views: 6,187

sampleN.TOST vs. sampleN.scABEL [Power / Sample Size]

Hello!

I have searched the forum but couldn't find the answer to the following question: why does the sample size estimated with the R package PowerTOST differ between sampleN.TOST and sampleN.scABEL when CV=30%?

Let's take a look at the following example:

a) Sample size estimation with sampleN.TOST

library(PowerTOST)
sampleN.TOST(CV=0.3, theta0=0.95, design="2x3x3")

+++++++++++ Equivalence test - TOST +++++++++++
            Sample size estimation
-----------------------------------------------
Study design: 2x3x3 (partial replicate)
log-transformed data (multiplicative model)

alpha = 0.05, target power = 0.8
BE margins = 0.8 ... 1.25
True ratio = 0.95,  CV = 0.3

Sample size (total)
 n     power
30   0.820400

b) Sample size estimation with sampleN.scABEL

sampleN.scABEL(CV=0.3, theta0=0.95, design="2x3x3")

+++++++++++ scaled (widened) ABEL +++++++++++
            Sample size estimation
   (simulation based on ANOVA evaluation)
---------------------------------------------
Study design: 2x3x3 (partial replicate)
log-transformed data (multiplicative model)
1e+05 studies for each step simulated.

alpha  = 0.05, target power = 0.8
CVw(T) = 0.3; CVw(R) = 0.3
True ratio = 0.95
ABE limits / PE constraint = 0.8 ... 1.25
EMA regulatory settings
- CVswitch            = 0.3
- cap on scABEL if CVw(R) > 0.5
- regulatory constant = 0.76
- pe constraint applied

Sample size search
 n     power
24   0.7814
27   0.8257

So why do the sample size estimations differ? They have the same arguments (design="2x3x3", theta0=0.95 ...). CV is 30% so there should be no scaling (conventional BE limits, i.e., 80.00-125.00). Does it have anything to do with the simulations? But even when I increase the number of simulations, the differences aren't that big.

Or if I reformulate the question, why don't the following powers match:

a) power.TOST(CV=0.3, theta0=0.95, design="2x3x3", n=30)
[1] 0.8204004

b) power.scABEL(CV=0.3, theta0=0.95, design="2x3x3", n=30)
[1] 0.85977

Regards
BEQool

Edit: Category changed. [Helmut]

Helmut
★★★

Vienna, Austria,
2024-01-29 14:52
(526 d 06:46 ago)

@ BEQool
Posting: # 23846
Views: 5,132

ABEL is a framework (decision scheme)

Hi BEQool,

❝ […]: why does the sample size estimated with the R package PowerTOST differ between sampleN.TOST and sampleN.scABEL when CV=30%?

Contrary to ABE, where power (and thus, the sample size) can be obtained by closed formulas, ABEL is a framework, which depends on the realized (observed) CV_wR, the upper constraint uc = 50%, and the point estimate.
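
To make the "closed formulas" remark concrete, here is a minimal base-R sketch of ABE power in the 2x3x3 design via the noncentral t approximation. The function name power_tost_nct is made up for illustration. The design constants (df = 2n - 3, variance multiplier 1.5) and balanced sequences are assumptions of this sketch, and it is only an approximation: the exact method in PowerTOST is based on Owen's Q and is not reproduced here.

```r
# Approximate TOST power via the noncentral t-distribution (base R only).
# Assumptions: balanced sequences; 2x3x3 partial replicate with
# df = 2n - 3 and a variance multiplier of 1.5 for the T - R difference.
power_tost_nct <- function(CV, theta0, n, alpha = 0.05,
                           theta1 = 0.80, theta2 = 1.25) {
  s2  <- log(CV^2 + 1)             # within-subject variance (log scale)
  se  <- sqrt(s2 * 1.5 / n)        # standard error of the T - R difference
  df  <- 2 * n - 3                 # residual degrees of freedom
  tc  <- qt(1 - alpha, df)         # critical value of each one-sided test
  nc1 <- (log(theta0) - log(theta1)) / se  # noncentrality, lower limit
  nc2 <- (log(theta0) - log(theta2)) / se  # noncentrality, upper limit
  pow <- pt(-tc, df, ncp = nc2) - pt(tc, df, ncp = nc1)
  max(pow, 0)                      # the approximation can dip below zero
}

p30 <- power_tost_nct(CV = 0.30, theta0 = 0.95, n = 30)
p30
```

For CV = 0.3, theta0 = 0.95 and n = 30 the result should land close to the exact 0.8204 reported by power.TOST() in the first post. Nothing comparable exists for ABEL because the limits themselves depend on the outcome of the study.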

[image]

Therefore, simulations are needed. At a true CV_wR = 30% we have – roughly* – a 50% chance that in the actual study CV_wR > 30%. Then we could expand the limits, gain power for a given sample size, or need fewer subjects for a certain power than we would need in ABE.
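
The decision scheme can be sketched in a few lines of base R. The function name abel_limits is hypothetical. The switching CV of 30%, the cap at 50% and the regulatory constant 0.76 are the EMA settings shown in the sampleN.scABEL() output above. The point estimate constraint (PE within 80.00–125.00%) is part of the framework as well but is left out of this sketch.

```r
# Sketch of the EMA ABEL limits as a function of the observed CVwR.
# abel_limits() is an illustrative helper, not a PowerTOST function.
abel_limits <- function(CVwR, CVswitch = 0.30, CVcap = 0.50, r_const = 0.76) {
  # at or below the switching CV: conventional ABE limits
  if (CVwR <= CVswitch) return(c(lower = 0.80, upper = 1.25))
  # above the cap the limits are frozen at the value for CVwR = 50%
  s_wR <- sqrt(log(min(CVwR, CVcap)^2 + 1))
  c(lower = exp(-r_const * s_wR), upper = exp(r_const * s_wR))
}

round(100 * abel_limits(0.25), 2)  # 80.00 125.00 (no expansion)
round(100 * abel_limits(0.35), 2)  # 77.23 129.48 (expanded)
round(100 * abel_limits(0.60), 2)  # 69.84 143.19 (capped)
```

Since the limits applied in a study depend on the observed CV_wR and not on the assumed one, a study planned with a true CV_wR of exactly 30% will use expanded limits in a sizeable fraction of cases.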

❝ So why do the sample size estimations differ? They have the same arguments (design="2x3x3", theta0=0.95 ...). CV is 30% so there should be no scaling (conventional BE limits, i.e., 80.00-125.00).

Nope. See above.

You could also remove all scaling conditions. Then you get the same sample size as with sampleN.TOST():

reg <- reg_const(regulator = "USER", r_const = 1, CVswitch = Inf, CVcap = Inf, pe_constr = FALSE)
sampleN.scABEL(CV = 0.3, theta0 = 0.95, design = "2x3x3", regulator = reg, details = FALSE)

+++++++++++ scaled (widened) ABEL +++++++++++
            Sample size estimation
   (simulation based on ANOVA evaluation)
---------------------------------------------
Study design: 2x3x3 (partial replicate)
log-transformed data (multiplicative model)
1e+05 studies for each step simulated.

alpha  = 0.05, target power = 0.8
CVw(T) = 0.3; CVw(R) = 0.3
True ratio = 0.95
ABE limits / PE constraint = 0.8 ... 1.25
USER defined regulatory settings
- CVswitch            = Inf
- no cap on scABEL
- regulatory constant = 1
- no pe constraint

Sample size
 n     power
30   0.8196

❝ Does it have anything to do with the simulations? But even when I increase the number of simulations, the differences aren't that big.

No again.

❝ Or if I reformulate the question, why don't the following powers match:

❝ a) power.TOST(CV=0.3, theta0=0.95,design="2x3x3", n=30)

❝ [1] 0.8204004

❝ b) power.scABEL(CV=0.3, theta0=0.95, design="2x3x3", n=30)

❝ [1] 0.85977

You asked the wrong question in b) because

power.scABEL(CV=0.3, theta0=0.95, design="2x3x3", n=27)

[1] 0.82566

Let’s consider an example where expanding the limits is less likely.

ABE (exact)
sampleN.TOST(CV = 0.25, theta0 = 0.95, design = "2x3x3")

+++++++++++ Equivalence test - TOST +++++++++++
            Sample size estimation
-----------------------------------------------
Study design: 2x3x3 (partial replicate)
log-transformed data (multiplicative model)

alpha = 0.05, target power = 0.8
BE margins = 0.8 ... 1.25
True ratio = 0.95,  CV = 0.25

Sample size (total)
 n     power
21   0.814342

ABEL (simulations)
Note that for the partial replicate design you should use subject simulations by the function sampleN.scABEL.sdsims() instead of simulating the associated statistics by the function sampleN.scABEL().
sampleN.scABEL.sdsims(CV = 0.25, theta0 = 0.95, design = "2x3x3", details = FALSE)

+++++++++++ scaled (widened) ABEL +++++++++++
            Sample size estimation
   (simulation based on ANOVA evaluation)
---------------------------------------------
Study design: 2x3x3 (partial replicate)
log-transformed data (multiplicative model)
1e+05 studies for each step simulated.

alpha  = 0.05, target power = 0.8
CVw(T) = 0.25; CVw(R) = 0.25
True ratio = 0.95
ABE limits / PE constraint = 0.8 ... 1.25
Regulatory settings: EMA

Sample size
 n     power
21   0.8223

Now the sample sizes are identical. Power for ABEL is slightly larger than for ABE because there is a certain – while small – chance to expand the limits.

BTW, I would never ever assume theta = 0.95 for a HVD(P). That’s why in the reference-scaling functions of PowerTOST theta = 0.90 is the default.

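* The simulated variance $\small{s_\text{wR}^2}$ depends on its associated $\small{\chi^2}$-distribution with $\small{n-2}$ degrees of freedom. The $\small{\chi^2}$-distribution is skewed to the right, so its median lies below its mean. Hence, when simulating any $\small{CV_\text{wR}=x}$, the values $\small{>x}$ and $\small{<x}$ are not split exactly 50:50: you will obtain slightly fewer values $\small{>x}$ than $\small{<x}$, though the values $\small{>x}$ can lie further away from $\small{x}$.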
Try this to show the asymmetric distribution:
n  <- 1e7
df <- 27 - 2
set.seed(1234567)
x  <- rchisq(n = n, df = df)
y  <- hist(x, breaks = "FD", plot = FALSE)
plot(y, freq = FALSE, col = "lightblue", border = NA,
     xlim = c(0, max(y$mids)), cex.main = 1, las = 1,
     xlab = bquote("Number of random samples of "*chi^2 == .(n)),
     main = bquote(italic(df) == .(df)))
abline(v = y$mids[y$density == max(y$density)], col = "blue")
box()
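
The split around the true value can also be computed directly instead of being read off a histogram. The sketch below assumes, as in the footnote, that the scaled sample variance follows a chi-squared distribution with n - 2 degrees of freedom (here n = 27), so that s²_wR exceeds its true value exactly when the chi-squared variate exceeds df. The variable names are illustrative only.

```r
# Probability that the estimated variance exceeds the true one,
# assuming df * s2 / sigma2 ~ chi-squared(df) with df = n - 2.
df <- 27 - 2

# exact: s2 > sigma2 is equivalent to chisq > df (the mean of the distribution)
p_above <- pchisq(df, df = df, lower.tail = FALSE)

# Monte Carlo check of the same quantity
set.seed(1234567)
p_sim <- mean(rchisq(1e6, df = df) > df)

# the median of a right-skewed distribution lies below its mean
c(exact = p_above, simulated = p_sim,
  median = qchisq(0.5, df = df), mean = df)
```

The probability comes out slightly below one half, which is why "roughly" 50% is the right wording for the chance of observing CV_wR > 30% when the true value is exactly 30%.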

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

BEQool
★

2024-01-30 11:52
(525 d 09:46 ago)

@ Helmut
Posting: # 23851
Views: 5,116

ABEL is a framework (decision scheme)

Thank you for the explanation. It seems I hadn't taken the distribution of the observed CV into account (it can fall above or below the switching CV of 30%).

❝ Therefore, simulations are needed. At a true CV_wR = 30% we have – roughly* – a 50% chance that in the actual study CV_wR > 30%. Then we could expand the limits, gain power for a given sample size, or need fewer subjects for a certain power than we would need in ABE.

❝ […] Now the sample sizes are identical. Power for ABEL is slightly larger than for ABE because there is a certain – while small – chance to expand the limits.

So the sample size with ABEL is always smaller than or equal to the one with ABE, right (with the same arguments)? Similarly, power for the same number of subjects is always higher (or equal) with ABEL than with ABE (as shown in my reformulated question at the end of the first post or in your example with n=21 for ABE vs. ABEL)?

Two additional questions popped into my head while thinking about this:

1. So in case regulators (EMA region) ask us to calculate post hoc power (regardless of how irrelevant it is) of a study with 2x3x3 design with CVwR=25%, we should calculate it with power.TOST for AUC and for Cmax when we didn't mention anything about widened limits (ABEL) in the protocol (hypothetical scenario)? And on the other hand, for Cmax we should calculate it with power.scABEL when we mentioned widened limits (ABEL) in the protocol?
2. Another hypothetical scenario: If we get information from the literature about a drug's CVw (Cmax) of around 30% (let's say a range of 25-35%) and if we get a drug's CVw of 22% from our pilot study, can we then do a regular study with design 2x3x3 (in case we get CVw for Cmax 35% so then we could widen the limits and use ABEL)? And if then our drug's CVw from this regular study is, let's say, 21%, can agencies ask us to justify the replicate 2x3x3 design, i.e., why we didn't use a conventional 2x2x2 design, given that we got CVw=22% in our pilot study?

BEQool

Helmut
★★★

Vienna, Austria,
2024-01-30 14:08
(525 d 07:30 ago)

@ BEQool
Posting: # 23852
Views: 4,988

ABEL vs. ABE

Hi BEQool,

❝ So the sample size with ABEL is always smaller than or equal to the one with ABE, right (with the same arguments)? Similarly, power for the same number of subjects is always higher (or equal) with ABEL than with ABE (as shown in my reformulated question at the end of the first post or in your example with n=21 for ABE vs. ABEL)?

Exactly. For details see also this article for a comparison of ABE, ABEL, and RSABE.

❝ 1. So in case regulators (EMA region) ask us to calculate post hoc power (regardless of how irrelevant it is) …

Oh dear! EMA region, really‽ Outright bizarre.

❝ … of a study with 2x3x3 design with CVwR=25%, we should calculate it with power.TOST for AUC and for Cmax when we didn't mention anything about widened limits (ABEL) in the protocol (hypothetical scenario)?

Yes because the study was intended to be assessed by ABE.

❝ And on the other hand, for Cmax we should calculate it with power.scABEL when we mentioned widened limits (ABEL) in the protocol?

Correct again.

❝ 2. Another hypothetical scenario: If we get information from the literature about a drug's CVw (Cmax) of around 30% (let's say a range of 25-35%) and if we get a drug's CVw of 22% from our pilot study, can we then do a regular study with design 2x3x3 (in case we get CVw for Cmax 35% so then we could widen the limits and use ABEL)?

Yes, if you state it in the protocol. In general I would trust a – reasonably large! – pilot study more than the literature. The reference may have changed in the meantime, or the sampling schedule, the bioanalytical method, etc. may have been different.
BTW, why do you want to use a partial replicate design and not one of the 2-sequence 3-period full replicate designs (TRT|RTR or TRR|RTT)? Acceptable for the EMA (see the Q&A document and this post for examples). Same degrees of freedom and similar sample sizes. More informative because you can also estimate CV_wT, which is useful in designing other studies (quite often CV_wT < CV_wR and you need a smaller sample size). In the partial replicate you have to assume CV_wT = CV_wR, which is often wrong.

❝ And if then our drug's CVw from this regular study is, let's say, 21%, can agencies ask us to justify the replicate 2x3x3 design, i.e., why we didn't use a conventional 2x2x2 design, given that we got CVw=22% in our pilot study?

By ‘regular study’ do you mean a simple crossover? Even if you observed CV_w ≥30% I would be cautious. That’s only a hint of a highly variable reference. See this article for details.
There is nothing to justify. Any study in a replicate design can also be assessed for ABE. My – former – best enemy once said »From a purely statistical perspective, all studies should be performed in a replicate design.« One of the rare occasions I agree with him.

BEQool
★

2024-02-04 20:24
(520 d 01:14 ago)

@ Helmut
Posting: # 23853
Views: 4,959

ABEL vs. ABE

Sorry for my late response.

❝ Oh dear! EMA region, really‽ Outright bizarre.

No no, EMA has not asked this question (yet :-D). I just wanted to make a hypothetical example and mentioned EMA in order to use ABEL and not RSABE (it is probably not even relevant but anyway).

❝ BTW, why do you want to use a partial replicate design and not one of the 2-sequence 3-period full replicate designs (TRT|RTR or TRR|RTT)? Acceptable for the EMA (see the Q&A document and this post for examples). Same degrees of freedom and similar sample sizes. More informative because you can also estimate CV_wT, which is useful in designing other studies (quite often CV_wT < CV_wR and you need a smaller sample size). In the partial replicate you have to assume CV_wT = CV_wR, which is often wrong.

Yes, it could also be any replicate design (a 2-sequence 3-period or a 2-sequence 4-period full replicate design); the 3-sequence 3-period partial replicate design here was just an example.
Nevertheless, a deficiency letter regarding 2-sequence 3-period full replicate design from a European agency was already received (it was answered successfully), so the 3-sequence 3-period partial replicate design is now used sometimes instead.

❝ By ‘regular study’ do you mean a simple crossover? Even if you observed CV_w ≥30% I would be cautious. That’s only a hint of a highly variable reference. See this article for details.

My bad, I meant a pivotal study. I wanted to ask if agencies can ask you to justify using any replicate design when the drug's CVw in the study is for example around 20% - but you already answered that below.

❝ There is nothing to justify. Any study in a replicate design can also be assessed for ABE. My – former – best enemy once said »From a purely statistical perspective, all studies should be performed in a replicate design.« One of the rare occasions I agree with him.

Thank you for the answers, everything is clear now.

BEQool

Helmut
★★★

Vienna, Austria,
2024-02-04 22:12
(519 d 23:26 ago)

@ BEQool
Posting: # 23854
Views: 4,939

Being able to read does not hurt…

Hi BEQool,

❝ Sorry for my late response.

No worries.

❝ […] a deficiency letter regarding 2-sequence 3-period full replicate design from a European agency was already received (it was answered successfully), so the 3-sequence 3-period partial replicate design is now used sometimes instead.

Fantastic! The European agency (which one?) should read – and accept – what is stated in the EMA’s Q&A-document.

BEQool
★

2024-02-05 14:20
(519 d 07:18 ago)

@ Helmut
Posting: # 23855
Views: 4,868

Being able to read does not hurt…

❝ Fantastic! The European agency (which one?) should read – and accept – what is stated in the EMA’s Q&A-document.

France and Ireland - however, it was for a veterinary product

Helmut
★★★

Vienna, Austria,
2024-02-05 17:01
(519 d 04:37 ago)

@ BEQool
Posting: # 23856
Views: 4,827

Being able to read does not hurt…

Hi BEQool,

❝ France and Ireland - however, it was for a veterinary product

This might explain it. The last time I read the vet-GL it was crap (politely speaking).

mittyri
★★

Russia,
2024-02-05 17:49
(519 d 03:49 ago)

@ Helmut
Posting: # 23857
Views: 4,903

Vet BE Guideline

Hi Helmut and BEQool!

❝ ❝ France and Ireland - however, it was for a veterinary product

❝ This might explain it. The last time I read the vet-GL it was crap (politely speaking).

It's interesting how the CHMP's EMA guidelines find their way into the CVMP guidelines. I am trying to follow the logic:

For substances with highly variable disposition where it is difficult to show bioequivalence due to high intra-individual variability, different alternative designs have been suggested in the literature (e.g. replicate study design). A replicate cross-over study design using 3 periods (partial replication where only the reference product is replicated in all animals) or 4 periods (full replication, where each subject receives the test and reference products twice) can be carried out.
So CVMP people have some concerns about other designs we don't know yet. :-D

—
Kind regards,
Mittyri