Bioequivalence and Bioavailability Forum • Probability to pass multiple studies

Helmut
★★★

Vienna, Austria,
2021-02-19 13:02
(1599 d 07:24 ago)

Posting: # 22214
Views: 5,763

Probability to pass multiple studies [Power / Sample Size]

Dear all (and esp. our expert in probability zizou),

say, we have $\small{n}$ studies, each powered at 90%. What is the probability (i.e., power) that all of them pass BE?

Let’s keep it simple: T/R-ratios and CVs are identical in studies $\small{1\ldots n}$. Hence, $\small{p_{\,1}=\ldots=p_{\,n}}$. If the outcomes of studies are independent, is $\small{p_{\,\text{pass all}}=\prod_{i=1}^{i=n}p_{\,i}}$, e.g., for $\small{p_{\,1}=\ldots=p_{\,6}=0.90\rightarrow 0.90^6\approx0.53}$?
Or does each study stand on its own and we don’t have to care? :confused:
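A minimal base-R sketch of the product rule asked about above, assuming independent outcomes and identical power in every study. The variable names (`p`, `n`, `p.all`, `p.sim`) are only illustrative; the simulation is a rough cross-check, not part of the original post.

```r
# Probability that all n independent studies pass, each with power p.
# Assumption: outcomes are independent and every study has the same power.
p     <- 0.90
n     <- 1:6
p.all <- p^n
print(data.frame(studies = n, p.pass.all = round(p.all, 4)), row.names = FALSE)

# Cross-check by simulation: each row is one dossier of six studies,
# each entry is 1 (study passed) or 0 (study failed).
set.seed(123456)
nsims <- 1e5
pass  <- matrix(rbinom(nsims * 6, size = 1, prob = p), ncol = 6)
p.sim <- mean(rowSums(pass) == 6) # share of dossiers where all six passed
print(p.sim)
```

With six studies the exact product is 0.9^6 = 0.531441; the simulated share should land close to that.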

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

ElMaestro
★★★

Denmark,
2021-02-19 13:57
(1599 d 06:30 ago)

@ Helmut
Posting: # 22215
Views: 4,782

Probability to pass multiple studies

Hi Helmut,

❝ say, we have $\small{n}$ studies, each powered at 90%. What is the probability (i.e., power) that all of them pass BE?

❝ Let’s keep it simple: T/R-ratios and CVs are identical in studies $\small{1\ldots n}$. Hence, $\small{p_{\,1}=\ldots=p_{\,n}}$. If the outcomes of studies are independent, is $\small{p_{\,\text{pass all}}=\prod_{i=1}^{i=n}p_{\,i}}$, e.g., for $\small{p_{\,1}=\ldots=p_{\,6}=0.90\rightarrow 0.90^6\approx0.53}$?

❝ Or does each study stand on its own and we don’t have to care? :confused:

Yes to 0.53.
The risk is up to you or your client. I think there is no general awareness, but my real worry is the type I error, as I have indicated elsewhere.
"Have to care" really involves the fine print. I think in the absence of further info it is difficult to tell if you should care and/or from which perspective care is necessary.

Related issue, the one that worries me more:
You test one formulation, it fails on the 90% CI, you develop a new formulation, it passes on the 90% CI. What is the type I error? Well, strictly speaking that would be inflated. But no one seems to give a damn. :-D

—
Pass or fail!
ElMaestro

Helmut
★★★

Vienna, Austria,
2021-02-19 14:37
(1599 d 05:49 ago)

@ ElMaestro
Posting: # 22216
Views: 4,783

Power limbo

Hi ElMaestro,

❝ Yes to 0.53.

Shit, expected that.

❝ The risk is up to you or your client.

Such a case is not uncommon. Say, you have single dose studies (highest strength, fasting/fed), multiple dose studies with two strengths (fasting/fed). Then you get the n = 6 of my example.
If you want a certain overall power, then each study has to be powered to $p_i=\sqrt[n]{p_\textrm{overall}}$.
With n = 6, for 80% → 96.35% and for 90% → 98.26%. Will an IEC accept that?*
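The n-th root adjustment can be tabulated in a few lines of base R. This is only a sketch under the same assumptions (independent studies, identical power); `adj.power` is a hypothetical helper name, not a PowerTOST function.

```r
# Per-study power needed so that the product over n studies
# reaches the desired overall power (independence assumed).
adj.power <- function(p.overall, n) p.overall^(1 / n)

n   <- 1:6
adj <- data.frame(studies   = n,
                  target.80 = adj.power(0.80, n),
                  target.90 = adj.power(0.90, n))
print(round(adj, 5), row.names = FALSE)
```

For n = 6 this reproduces the 96.35% and 98.26% quoted above.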

❝ I think there is no general awareness, but my real worry is the type I error, as I have indicated elsewhere.

I know.

❝ Related issue, the one that worries me more:

❝ You test one formulation, it fails on the 90% CI, you develop a new formulation, it passes on the 90% CI. What is the type I error?

I’m not concerned about a new formulation. The first one went to the waste bin → zero consumer risk. The study supporting the new formulation stands on its own. Hence, TIE ≤0.05.

❝ Well, strictly speaking that would be inflated. But no one seems to give a damn. :-D

I’m indeed concerned about repeating the study (same formulation) with more subjects. Then you get an inflated TIE for sure. Regulators don’t care. They trust in the second study more because it is larger and thus the result ‘more reliable’, I guess.
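To get a feeling for the size of the inflation, here is a rough sketch. It assumes that the formulation sits exactly on a BE limit, that each attempt is an independent study with a type I error of exactly 0.05, and that every failed study is repeated. In practice only some failed studies are repeated, so the numbers are an upper bound rather than the actual inflation.

```r
# Chance that at least one of k independent attempts passes although the
# formulation is not bioequivalent (per-study type I error = alpha).
# Assumption: every failed study is repeated, up to k attempts in total.
alpha <- 0.05
k     <- 1:3
tie   <- 1 - (1 - alpha)^k
print(data.frame(attempts = k, TIE.upper = round(tie, 4)), row.names = FALSE)
```

Under these assumptions one repetition gives at most 1 − 0.95² = 0.0975.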

We have that everywhere. A value in the post-study exam is out of range and clinically significant. A follow-up is initiated and now all is good. Did the value really improve? There is always inaccuracy involved. Maybe the first one was correct and the second one not.

My doctor gave me six months to live,
but when I couldn’t pay the bill
he gave me six months more.
Walter Matthau

* The IEC assesses protocols of single studies. If we want to keep the desired overall power, sample sizes will be substantially higher:

    library(PowerTOST)
    CV     <- seq(0.2, 0.4, 0.1) # CVs to explore
    ns     <- 1:6L               # number of studies
    target <- c(0.8, 0.9)        # desired overall power
    theta0 <- 0.95               # assumed T/R-ratio
    design <- "2x2x2"
    res    <- data.frame(CV = rep(CV, each = length(ns)),
                         target = rep(target, each = length(CV)*length(ns)),
                         n.single = NA, pwr.single = NA, studies = ns)
    # per-study target power such that the product reaches the overall target
    res$target.adj <- res$target^(1/res$studies)
    for (j in 1:nrow(res)) {
      # single study at the unadjusted target: sample size and achieved power
      res[j, 3:4] <- sampleN.TOST(CV = res$CV[j], theta0 = theta0,
                                  design = design,
                                  targetpower = res$target[j],
                                  print = FALSE)[7:8]
      # each of the multiple studies at the adjusted target
      tmp <- sampleN.TOST(CV = res$CV[j], theta0 = theta0,
                          design = design,
                          targetpower = res$target.adj[j],
                          print = FALSE)
      res$n[j]        <- tmp[["Sample size"]]
      res$pwr.each[j] <- tmp[["Achieved power"]]^res$studies[j]
    }
    res$n.incr <- sprintf("%+.1f%%", 100*(res$n - res$n.single) / res$n)
    res[, 1:8] <- signif(res[, 1:8], 5)
    print(res, row.names = FALSE)

      CV target n.single pwr.single studies target.adj   n pwr.each n.incr
     0.2    0.8       20    0.83468       1    0.80000  20  0.83468  +0.0%
     0.2    0.8       20    0.83468       2    0.89443  24  0.80286 +16.7%
     0.2    0.8       20    0.83468       3    0.92832  28  0.81705 +28.6%
     0.2    0.8       20    0.83468       4    0.94574  30  0.80973 +33.3%
     0.2    0.8       20    0.83468       5    0.95635  32  0.81341 +37.5%
     0.2    0.8       20    0.83468       6    0.96349  34  0.82384 +41.2%
     0.3    0.8       40    0.81585       1    0.80000  40  0.81585  +0.0%
     0.3    0.8       40    0.81585       2    0.89443  52  0.81354 +23.1%
     0.3    0.8       40    0.81585       3    0.92832  58  0.80091 +31.0%
     0.3    0.8       40    0.81585       4    0.94574  64  0.80870 +37.5%
     0.3    0.8       40    0.81585       5    0.95635  68  0.80855 +41.2%
     0.3    0.8       40    0.81585       6    0.96349  72  0.81548 +44.4%
     0.4    0.8       66    0.80525       1    0.80000  66  0.80525  +0.0%
     0.4    0.8       66    0.80525       2    0.89443  88  0.81075 +25.0%
     0.4    0.8       66    0.80525       3    0.92832 100  0.80737 +34.0%
     0.4    0.8       66    0.80525       4    0.94574 108  0.80202 +38.9%
     0.4    0.8       66    0.80525       5    0.95635 116  0.80809 +43.1%
     0.4    0.8       66    0.80525       6    0.96349 122  0.81015 +45.9%
     0.2    0.9       26    0.91763       1    0.90000  26  0.91763  +0.0%
     0.2    0.9       26    0.91763       2    0.94868  32  0.92071 +18.8%
     0.2    0.9       26    0.91763       3    0.96549  34  0.90766 +23.5%
     0.2    0.9       26    0.91763       4    0.97400  36  0.90405 +27.8%
     0.2    0.9       26    0.91763       5    0.97915  38  0.90639 +31.6%
     0.2    0.9       26    0.91763       6    0.98259  40  0.91230 +35.0%
     0.3    0.9       52    0.90197       1    0.90000  52  0.90197  +0.0%
     0.3    0.9       52    0.90197       2    0.94868  66  0.90937 +21.2%
     0.3    0.9       52    0.90197       3    0.96549  72  0.90304 +27.8%
     0.3    0.9       52    0.90197       4    0.97400  78  0.90750 +33.3%
     0.3    0.9       52    0.90197       5    0.97915  82  0.90779 +36.6%
     0.3    0.9       52    0.90197       6    0.98259  84  0.90158 +38.1%
     0.4    0.9       88    0.90041       1    0.90000  88  0.90041  +0.0%
     0.4    0.9       88    0.90041       2    0.94868 110  0.90173 +20.0%
     0.4    0.9       88    0.90041       3    0.96549 122  0.90008 +27.9%
     0.4    0.9       88    0.90041       4    0.97400 132  0.90364 +33.3%
     0.4    0.9       88    0.90041       5    0.97915 138  0.90121 +36.2%
     0.4    0.9       88    0.90041       6    0.98259 144  0.90267 +38.9%

n.single    Number of subjects for given CV and target power in a single study
pwr.single  Achieved power with n.single subjects in a single study
target.adj  Adjusted target power for sample size estimation of multiple studies
n           Number of subjects for given CV and adjusted power in each of the studies
pwr.each    Overall power of the studies (achieved power of each study raised to the number of studies)
n.incr      Penalty in sample size compared to a single study, expressed as a percentage of n

Example:
We assume CV 20%, T/R-ratio 0.95, and desire a power of ≥80%. Then in a single study with 20 subjects we would achieve a power of ~83.5%. If we want to perform six studies, each powered at exactly 80%, the overall power would drop to $\small{0.80^6\approx 26.2\%}$ (total sample size 120). In order to pass all studies with ≥80% power, we have to adjust the target power to $\small{\sqrt[6]{0.80}=0.96349}$. Now we need 34 subjects in each of the studies (204 in total).
Question: Would a sponsor accept an extremely high risk of failure (73.8%) or prefer a sample size penalty of 70% (34 instead of 20 subjects per study; n.incr in the table gives the same increase as +41.2% because it is expressed relative to n)?
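The arithmetic of the example can be retraced in base R. The sample sizes (20 and 34) and the achieved power (0.83468) are simply copied from the table above; the variable names are illustrative.

```r
# Numbers for CV 20%, T/R-ratio 0.95, six studies (taken from the table).
studies    <- 6
n.single   <- 20      # per study, powered at >= 80%
n.adj      <- 34      # per study, powered at >= 0.8^(1/6)
pwr.single <- 0.83468 # achieved power with 20 subjects

overall.target   <- 0.80^studies       # product at exactly 80% each
overall.achieved <- pwr.single^studies # product at the achieved power
risk.fail        <- 1 - overall.target # chance that at least one study fails
total.unadj      <- studies * n.single # 120 subjects
total.adj        <- studies * n.adj    # 204 subjects
penalty.single   <- 100 * (n.adj - n.single) / n.single # relative to 20
penalty.n        <- 100 * (n.adj - n.single) / n.adj    # as in n.incr

print(round(c(overall.target = overall.target,
              overall.achieved = overall.achieved,
              risk.fail = risk.fail), 4))
print(c(total.unadj = total.unadj, total.adj = total.adj))
print(round(c(penalty.single = penalty.single, penalty.n = penalty.n), 1))
```

Note that with the achieved power of ~83.5% per study the unadjusted overall power is about 33.8% rather than 26.2%; the conclusion is the same.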


Helmut
★★★

Vienna, Austria,
2022-06-24 16:03
(1109 d 05:24 ago)

@ ElMaestro
Posting: # 23080
Views: 3,441

Probability to pass multiple studies ?

Hi ElMaestro and all,

sorry for excavating an old story.

❝ ❝ say, we have $\small{n}$ studies, each powered at 90%. What is the probability (i.e., power) that all of them pass BE?

❝ ❝ Let’s keep it simple: T/R-ratios and CVs are identical in studies $\small{1\ldots n}$. Hence, $\small{p_{\,1}=\ldots=p_{\,n}}$. If the outcomes of studies are independent, is $\small{p_{\,\text{pass all}}=\prod_{i=1}^{i=n}p_{\,i}}$, e.g., for $\small{p_{\,1}=\ldots=p_{\,6}=0.90\rightarrow 0.90^6\approx0.53}$?

❝ ❝ Or does each study stand on its own and we don’t have to care? :confused:

❝ Yes to 0.53.

❝ The risk is up to you or your client. I think there is no general awareness, …

❝ "Have to care" really involves the fine print. I think in the absence of further info it is difficult to tell if you should care and/or from which perspective care is necessary.

I’m pretty sure that we were wrong:

We want to demonstrate BE in all studies. Otherwise, the product would not get an approval (based on multiple studies in the dossier). That means, we have an ‘AND-composition’. Hence, the Intersection-Union Test (IUT) principle applies^1,2 and each study stands indeed on its own. Therefore, any kind of ‘power adjustment’ I mused about before is not necessary.
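For reference, the IUT setup for $\small{n}$ studies can be written as follows. This is a sketch of the standard formulation in the sense of Berger and Hsu; the indexing $\small{H_{0,i}}$, $\small{H_{1,i}}$ is my notation, not taken from the posts.

```latex
\begin{aligned}
H_0 &: \bigcup_{i=1}^{n} H_{0,i} && \text{(at least one study is not BE)}\\
H_1 &: \bigcap_{i=1}^{n} H_{1,i} && \text{(all studies are BE)}\\
\text{reject } H_0 &\iff \text{every } H_{0,i} \text{ is rejected at level } \alpha\\
\sup_{H_0}\,\Pr(\text{reject } H_0) &\le \alpha
\end{aligned}
```

Each study is tested at the full level $\small{\alpha}$ without any multiplicity adjustment. Note that this is a statement about the type I error of the combined decision.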

In my example above one would have to power each of the studies to $\small{\sqrt[6]{0.90}\approx98.26\%}$ to achieve ≥ 90% overall power. I cannot imagine that this was ever done.
Detlew and I have some empirical evidence. The largest number of confirmatory studies in a dossier I have seen so far was 12, powered to 80–90% (there were more in the package but only exploratory ones like comparing types of food, sprinkle studies, :blahblah:). If overall power in multiple studies had really been that low (say, $\small{0.85^{12}\approx14\%}$), I should have seen many more failures – which I didn’t.
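What the product rule would predict for such a dossier can be sketched in base R, assuming 12 independent studies with identical power anywhere in the 80–90% range. The expected number of failed studies is added only to illustrate the "many more failures" remark.

```r
# Product rule applied to a dossier of 12 confirmatory studies.
# Assumption: independent studies, identical power in each of them.
k       <- 12
p       <- c(0.80, 0.85, 0.90)
overall <- p^k         # probability that all 12 pass
e.fail  <- k * (1 - p) # expected number of failed studies
print(data.frame(power = p, overall = round(overall, 4),
                 expected.failures = e.fail), row.names = FALSE)
```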

❝ … but my real worry is the type I error, as I have indicated elsewhere.

We discussed that above.
Agencies accept repeating an inconclusive^3,4 study with a larger sample size. I agree with your alter ego^5 that such an approach may indeed inflate the Type I Error. I guess regulators trust the repeated study more, believing [sic] that its outcome is more ‘reliable’ due to the larger sample size. But that’s – apart from the inflated Type I Error – a fallacy.

1. Berger RL, Hsu JC. Bioequivalence Trials, Intersection-Union Tests and Equivalence Confidence Sets. Stat Sci. 1996; 11(4): 283–302. JSTOR:2246021. Free resource.
2. Wellek S. Testing statistical hypotheses of equivalence. Boca Raton: Chapman & Hall/CRC; 2010. Chapter 7. p. 161–176.
3. If at least one of the confidence limits lies outside of the acceptance limits. That is distinct from a bioinequivalent study, where the confidence interval lies entirely outside the acceptance limits, i.e., the Null hypothesis is not rejected. That calls for a reformulation and starting over from scratch.
4. García-Arieta A. The failure to show bioequivalence is not evidence against generics. Br J Clin Pharmacol. 2010; 70(3): 452–3. doi:10.1111/j.1365-2125.2010.03684.x. Open access.
5. Fuglsang A. Pilot and Repeat Trials as Development Tools Associated with Demonstration of Bioequivalence. AAPS J. 2015; 17(3): 678–83. doi:10.1208/s12248-015-9744-6. Free full text.
