Probability to pass multiple studies? [Power / Sample Size]
Sorry for excavating an old story.
❝ ❝ Say, we have \(\small{n}\) studies, each powered at 90%. What is the probability (i.e., power) that all of them pass BE?
❝ ❝ Let’s keep it simple: T/R-ratios and CVs are identical in studies \(\small{1\ldots n}\). Hence, \(\small{p_{\,1}=\ldots=p_{\,n}}\). If the outcomes of the studies are independent, is \(\small{p_{\,\text{pass all}}=\prod_{i=1}^{n}p_{\,i}}\), e.g., for \(\small{p_{\,1}=\ldots=p_{\,6}=0.90\rightarrow 0.90^6\approx0.53}\)?
❝ ❝ Or does each study stand on its own and we don’t have to care?
❝
❝ Yes to 0.53.
❝ The risk is up to you or your client. I think there is no general awareness, …
❝ "Have to care" really involves the fine print. I think in the absence of further info it is difficult to tell if you should care and/or from which perspective care is necessary.
I’m pretty sure that we were wrong:
We want to demonstrate BE in all studies; otherwise, the product would not get an approval (based on multiple studies in the dossier). That means we have an ‘AND-composition’. Hence, the Intersection-Union Test (IUT) principle applies [1,2] and each study indeed stands on its own. Therefore, the kind of ‘power adjustment’ I mused about before is not necessary.
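In symbols (a one-line sketch of the IUT argument; notation mine, cf. the first reference below): the combined null hypothesis is the union \(\small{H_0=\bigcup_{i=1}^{n}H_{0,i}}\); it is rejected only if every individual \(\small{H_{0,i}}\) is rejected at level \(\small{\alpha}\), and for any true parameter lying in some \(\small{H_{0,k}}\) that event is contained in ‘reject \(\small{H_{0,k}}\)’, whence \(\small{\alpha_{\text{IUT}}\le\max_i\alpha_i=\alpha}\). No multiplicity adjustment of the individual levels is needed.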
In my example above one would have to power each of the six studies to \(\small{\sqrt[6]{0.90}\approx98.26\%}\) to achieve ≥ 90% overall power. I cannot imagine that this was ever done.
Detlew and I have some empirical evidence. The largest number of confirmatory studies in a dossier I have seen so far was twelve, powered to 80–90% (there were more in the package, but only exploratory ones like comparisons of types of food, sprinkle studies, …). If the overall power of multiple studies had really been that low (say, \(\small{0.85^{12}\approx14\%}\)), I should have seen many more failures – which I didn’t.
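For the numbers above, a minimal Python sketch (assuming only independence and identical individual powers, as stated in the quoted post):

```python
# Overall probability that n independent studies all pass,
# each with identical individual power p:
def overall_power(p: float, n: int) -> float:
    return p ** n

# Individual power needed so that n independent studies
# jointly reach a target overall power:
def required_individual_power(target: float, n: int) -> float:
    return target ** (1 / n)

print(overall_power(0.90, 6))               # 0.531...
print(required_individual_power(0.90, 6))   # 0.9826 -> 98.26%
print(overall_power(0.85, 12))              # 0.142 -> ~14%
```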
❝ … but my real worry is the type I error, as I have indicated elsewhere.
We discussed that above.
Agencies accept repeating an inconclusive [3,4] study in a larger sample size. I agree with your alter ego [5] that such an approach may indeed inflate the Type I Error. I guess regulators trust the repeated study more, believing [sic] that its outcome is more ‘reliable’ due to the larger sample size. But that’s – apart from the inflated Type I Error – a fallacy.
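To see the order of magnitude of the inflation, a rough Monte-Carlo sketch (assumptions mine: a 2×2 crossover evaluated on the log scale with a large-sample z-approximation instead of the t-distribution used in a real evaluation, true GMR fixed at the upper acceptance limit 1.25, and made-up illustrative values CV = 25%, n = 24 repeated with n = 48):

```python
import numpy as np

rng = np.random.default_rng(42)

def pass_be(n, cv, gmr, nsims, z=1.6449):
    """Simulate whether a 2x2 crossover passes BE (90% CI within
    0.80-1.25 on the original scale), using a large-sample
    z-approximation. Returns a boolean array of length nsims."""
    sigma_w = np.sqrt(np.log(cv**2 + 1))      # within-subject SD (log scale)
    se = sigma_w * np.sqrt(2 / n)             # SE of the log point estimate
    pe = rng.normal(np.log(gmr), se, nsims)   # simulated point estimates
    return (pe - z * se >= np.log(0.80)) & (pe + z * se <= np.log(1.25))

nsims = 100_000
# True GMR fixed at the upper acceptance limit -> Type I Error setting:
first  = pass_be(n=24, cv=0.25, gmr=1.25, nsims=nsims)  # initial study
second = pass_be(n=48, cv=0.25, gmr=1.25, nsims=nsims)  # larger repeat
# 'Repeat the failed study once' passes if either study passes:
print(f"empirical TIE ≈ {np.mean(first | second):.3f}")  # ~0.10 vs nominal 0.05
```

With two independent chances at level ≈ 0.05 each, the overall probability of falsely passing approaches \(\small{1-0.95^{2}\approx0.0975}\) – roughly double the nominal level.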
1. Berger RL, Hsu JC. Bioequivalence Trials, Intersection–Union Tests and Equivalence Confidence Sets. Stat Sci. 1996; 11(4): 283–302. JSTOR:2246021. Free resource.
2. Wellek S. Testing Statistical Hypotheses of Equivalence. Boca Raton: Chapman & Hall/CRC; 2010. Chapter 7, p. 161–176.
3. If at least one of the confidence limits lies outside of the acceptance limits. That is distinct from a bioinequivalent study, where the confidence interval lies entirely outside the acceptance limits, i.e., the Null hypothesis is not rejected. That calls for a reformulation and starting over from scratch.
4. García-Arieta A. The failure to show bioequivalence is not evidence against generics. Br J Clin Pharmacol. 2010; 70(3): 452–3. doi:10.1111/j.1365-2125.2010.03684.x. Open access.
5. Fuglsang A. Pilot and Repeat Trials as Development Tools Associated with Demonstration of Bioequivalence. AAPS J. 2015; 17(3): 678–83. doi:10.1208/s12248-015-9744-6. Free full text.
Dif-tor heh smusma 🖖🏼 Long live Ukraine!
Helmut Schütz
The quality of responses received is directly proportional to the quality of the question asked. 🚮