Oh boy! (lengthy post) [Power / Sample Size]
you deserve my sincerest sympathy. IMHO, it is almost futile to deal with people showing a combination of ignorance and resistance against advice.
Below sumfink copypasted from my standard response (that’s the most basic level I managed to go) – which “works” in most cases. If anybody has a more comprehensible explanation – suggestions are welcome.
Alternatively tell the CRO to go to hell.
══════════════════════════════╬══════════════════════════════
In sample size estimation we have these four variables (order as given in many guidelines):
1. Deviation of the test from reference (∆ – generally expressed as the T/R-ratio; aka GMR),
2. Variability (in replicate designs CVwR [+CVwT in fully replicated ones], within-subject CV in cross-over designs, or total CV in parallel designs),
3. Type I error (α, the probability of falsely concluding bioequivalence; aka the patient’s risk),
4. Desired or target power (1 – β, where β is the probability of type II error; aka the producer’s risk).
#3 is fixed by regulatory authorities. Generally α is set to 0.05, leading to the 100(1–2α) = 90% confidence interval.
#1 is an assumption! Only rarely (e.g., if one has an established in-vitro in-vivo correlation) can one predict from measured content of the test- and reference-batches and their dissolution what the products’ ratio will be in vivo. For most drugs the batch-release specification is ±10% of declared content. Even if one measures a potency of 100% for both formulations, one must not forget the variability (especially the inaccuracy) of the analytical method. It might be that the ‘true’ content of the test is 97.28% and the one of the reference is 102.40%. Even if the drug is 100% bioavailable, one will observe in vivo a GMR of 97.28/102.40=95.00%. This is one of the reasons why in sample size estimation one should never ever assume a GMR of 1. Doing so would be simply negligent.
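The arithmetic of the potency example, as a tiny Python sketch. The variable names are mine; the two content values are the hypothetical ‘true’ contents from the paragraph above, not measured data.

```python
# Hypothetical 'true' contents in percent of label claim (values from the example above).
content_test = 97.28
content_reference = 102.40

# Assumption: both products are 100% bioavailable, so the in vivo
# T/R-ratio is simply the ratio of the true contents.
gmr = content_test / content_reference
print(f"Expected GMR: {gmr:.4f}")  # prints: Expected GMR: 0.9500
```

Both batches would pass a ±10% release specification and still give a 5% deviation in vivo.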
#2 is an assumption! One gets estimates (!) from the literature, previous (even failed) studies, or pilot studies. Any CV is an estimate, not “carved in stone”. Its precision depends on the sample size of the study and (to a minor extent) on the number of sequences (→ degrees of freedom). Simple example: A CV of 20% in a pilot study in 12 subjects has a different precision if it was observed in a 2×2 cross-over (two formulations) or in a 6×3 (Williams’ design with three formulations)…
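To put numbers on the pilot-study example, here is a rough stdlib-only Python sketch of a confidence interval of the CV based on the χ²-distribution of the residual variance. Assumptions of mine, not from the text above: degrees of freedom of n–2 for the 2×2 design and 2n–4 for the 6×3 Williams’ design (all formulations evaluated in one ANOVA), and a two-sided 95% interval as an arbitrary choice. All function names are made up for illustration.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF via the series of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-15:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_quantile(p, df):
    """Chi-square quantile by bisection on the CDF."""
    lo, hi = 0.0, df + 10.0 * math.sqrt(2.0 * df) + 10.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def cv_confidence_interval(cv, df, alpha=0.05):
    """Two-sided 100(1-alpha)% confidence interval of a CV estimated with df degrees of freedom."""
    s2 = math.log(cv ** 2 + 1.0)                       # variance on the log-scale
    var_lower = df * s2 / chi2_quantile(1.0 - alpha / 2.0, df)
    var_upper = df * s2 / chi2_quantile(alpha / 2.0, df)
    def to_cv(v):
        return math.sqrt(math.exp(v) - 1.0)            # back-transform to a CV
    return to_cv(var_lower), to_cv(var_upper)

n = 12
for design, df in (("2x2 cross-over", n - 2), ("6x3 Williams' design", 2 * n - 4)):
    lower, upper = cv_confidence_interval(0.20, df)
    print(f"{design}: df = {df}, 95% CI of CV = {100 * lower:.1f}% to {100 * upper:.1f}%")
```

With the same 12 subjects and the same point estimate, the interval from the three-period study should come out narrower because it has twice the degrees of freedom.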
#4 can be chosen by the sponsor. 80–90% is recommended in many guidances/guidelines. However, power <70% is close to gambling in a casino. ICH E9 (Statistical Principles for Clinical Trials) tells us:
The number of subjects in a clinical trial should always be large enough
to provide a reliable answer to the questions addressed.
Planning for too much power (i.e., the company has a lot of money and doesn’t want to fail) is ethically problematic. It is the job of the independent ethics committee to care about the welfare of subjects in clinical studies. Theoretically the IEC should reject studies with too low power (high chance to fail) as well as with too high power (aka “forced bioequivalence”).
══════════════════════════════╬══════════════════════════════
❝ […] As a standard, I would like them to perform the calculations with a GMR of 0.95 (if no other information on the in vivo performance is available and dissolution shows no apparent difference).
Yep, makes sense. See my example about the precision of measured potency above. Note that for NTIDs the FDA requires tighter specs for batch release (±5% instead of ±10%). [1]
❝ […] a reference (to an article or guideline, preferably EMA)…
- WHO
TRS 937, Annex 7 (2006): “the mean deviation from the reference product compatible with bioequivalence […]”.
Multisource (Generic) Products (Draft 2014): “the mean deviation from the comparator product compatible with bioequivalence[…]”.
- Health Canada suggested a ratio of 1 in their 1990s guidances and abandoned it in 2012 in favor of “[…] the expected mean difference between the test and reference formulations […]”.
- The FDA in Appendix C to their 2001 biostat. guidance gives only (!) a ∆ of 0.05. FDA in their recent ANDA-draft states just “The total number of subjects in a study should be sufficient to provide adequate statistical power for BE demonstration […]”.
- Unfortunately EMA is also of no help (the detailed information in the 2001 NfG shrank to “an appropriate sample size calculation”).
- With two out of the four variables in the equation being estimates we cannot calculate something, only estimate it (aka rubbish in, rubbish out).
- OK, we plug in some numbers. However, we cannot directly calculate the sample size, only power. It’s an iterative procedure to find the smallest sample size where power ≥ target.
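The iterative search from the last two points can be sketched in a few lines of Python. Big caveat: to stay within the standard library I use a large-sample (normal) approximation of TOST power for a balanced 2×2 cross-over, not the exact method found in dedicated software, so for small studies the result can be somewhat optimistic. Limits of 80–125%, α 0.05, and the function names are assumptions for illustration only.

```python
import math
from statistics import NormalDist

def approx_power_tost(cv, gmr, n, alpha=0.05, theta1=0.80, theta2=1.25):
    """Normal approximation of TOST power in a balanced 2x2 cross-over with n subjects."""
    nd = NormalDist()
    sigma_w = math.sqrt(math.log(cv ** 2 + 1.0))   # within-subject SD on the log-scale
    se = sigma_w * math.sqrt(2.0 / n)              # SE of the difference of log-means
    z = nd.inv_cdf(1.0 - alpha)
    upper = (math.log(theta2) - math.log(gmr)) / se - z
    lower = (math.log(gmr) - math.log(theta1)) / se - z
    return max(0.0, nd.cdf(upper) + nd.cdf(lower) - 1.0)

def sample_size(cv, gmr, target=0.80, n_start=4, n_max=1000):
    """Smallest even n (balanced sequences) where approximate power >= target."""
    n = n_start
    while n <= n_max:
        if approx_power_tost(cv, gmr, n) >= target:
            return n
        n += 2                                     # step by 2 to keep sequences balanced
    raise ValueError("target power not reached within n_max subjects")

for gmr in (1.00, 0.95, 0.90):
    n = sample_size(cv=0.20, gmr=gmr)
    print(f"CV 20%, GMR {gmr:.2f}: n = {n}, power = {approx_power_tost(0.20, gmr, n):.4f}")
```

Note that the function never solves for n directly: it computes power for increasing n and stops at the first one reaching the target. Running it with different GMRs also shows how strongly the assumed deviation drives the sample size.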
BTW, the two Lászlós in their paper [2] about sample size estimation of HVDs/HVDPs for reference-scaling recommend a larger deviation of the GMR. Quoting the discussion section:
Designing BE studies for highly variable drugs
Sample sizes for designing BE studies which involve non-highly variable drugs are typically estimated by assuming a within-subject (or a residual) variation and using a sample-size table such as that of Hauschke et al. The sample size is usually selected at a 5% deviation between the means, i.e. at a true GMR = 1.05.
Larger absolute differences between the two logarithmic means can be noted in the various BE studies when the within-subject variation is higher. Therefore, it is recommended that a 10% deviation between the means, i.e. a true GMR = 1.10, be considered […].
❝ … or a solid justification on especially the GMR to take into account when planning a BE study...
See above. Assuming a GMR of 1 and adding some subjects on top is solid crap.
❝
Agree. BTW, did you notice this post?
1. Sample sizes would be prohibitively large for conventional specs. For a CV of 7% (AUC of valproic acid), a GMR of 0.95, and target power 90% one would need a sample size of 128 subjects in a fully replicated 4-period design… Tight specs would allow assuming a GMR of 0.975 – requiring just 28 subjects.
2. Tóthfalusi L, Endrényi L. Sample Sizes for Designing Bioequivalence Studies for Highly Variable Drugs. J Pharm Pharmaceut Sci. 2012;15(1):73–84. Open access.
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz
Complete thread:
- Justification for GMR=0.95 in planning Oiinkie 2014-07-01 15:02 [Power / Sample Size]
- Justification for GMR=0.95 in planning ElMaestro 2014-07-01 15:19
- Oh boy! (lengthy post) Helmut 2014-07-01 16:41