Are simulations sufficient? …lengthy post! [Two-Stage / GS Designs]
Detlew has already pointed to this thread. See especially footnote #3 in this post.
You are posting from The Netherlands… Should I feel guilty for having recommended Method C since 2007?

“Confidence intervals were adapted based upon the power of the pharmacokinetic variable. In this case for Cmax the power was below 80% and confidence intervals were adapted to 94.12%, instead of the usually applied 90%. However, adapting the confidence intervals based upon power is not acceptable and also not in accordance with the EMA guideline. Confidence intervals should be selected a priori, without evaluation of the power.”
My emphasis. BTW, the study passed with the 94.12% CI.

Potvin et al. started from Pocock’s α 0.0294 and showed in their simulations that the empiric α never exceeded 0.051 (i.e., a maximum observed inflation of 2%) in Methods B/C (or with α 0.028 in Method D). Beforehand they had considered a potential 4% increase of the type I risk (i.e., to 0.052) as negligible. Note that in one million simulations only an αemp. exceeding 0.05036 is significantly >0.05 (by the exact binomial test). One can never show in simulations that αemp. ≤0.05; e.g., the upper critical value for 10⁹ simulations is still 0.05001134.
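The significance limits quoted above can be approximated in a few lines. This is only a sketch: it uses the normal approximation to the binomial distribution instead of the exact binomial test mentioned in the post, and the function name `significance_limit` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def significance_limit(n_sims: int, alpha: float = 0.05, level: float = 0.95) -> float:
    """Approximate one-sided upper limit for a simulated type I error.

    An empiric alpha above this value is significantly larger than `alpha`.
    Assumption: normal approximation to the binomial, not the exact test.
    """
    z = NormalDist().inv_cdf(level)          # one-sided 95% quantile
    se = sqrt(alpha * (1 - alpha) / n_sims)  # standard error of a proportion
    return alpha + z * se


for n in (10**6, 10**9):
    print(f"{n:>13,d} simulations: limit {significance_limit(n):.8f}")
```

With this approximation the limits agree with the 0.05036 and 0.05001134 given above to the stated number of decimals. The limit approaches 0.05 as the number of simulations grows but never reaches it.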

Potvin et al. were somewhat unfortunate in presenting their findings in Table I. They formatted empiric alphas larger than 0.052 in italics and ones significantly larger than 0.05 (i.e., >0.05036, but still within the predefined acceptance boundary) in bold. So a quick look gives a false impression about what they considered important (>0.052) and what not (>0.05036). IMHO the table should have been formatted like this one (only the top block; no values >0.05036 below):
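| n1 | CV (%) | A | B | C | D |
|---:|---:|---:|---:|---:|---:|
| 12 | 10 | | 0.0297 | 0.0496 | 0.0498 |
| 24 | 10 | | 0.0294 | 0.0500 | 0.0500 |
| 36 | 10 | | 0.0294 | 0.0500 | 0.0504 |
| 48 | 10 | | 0.0292 | 0.0501 | 0.0502 |
| 60 | 10 | | 0.0294 | 0.0504 | 0.0501 |
| 12 | 20 | 0.0584 | 0.0463 | 0.0510 | 0.0499 |
| 24 | 20 | 0.0505 | 0.0320 | 0.0490 | 0.0493 |
| 36 | 20 | 0.0497 | 0.0294 | 0.0499 | 0.0499 |
| 48 | 20 | 0.0500 | 0.0292 | 0.0495 | 0.0497 |
| 60 | 20 | 0.0500 | 0.0297 | 0.0500 | 0.0500 |
| 12 | 30 | 0.0575 | 0.0437 | 0.0441 | 0.0415 |
| 24 | 30 | 0.0550 | 0.0475 | 0.0492 | 0.0475 |
| 36 | 30 | 0.0523 | 0.0397 | 0.0477 | 0.0471 |
| 48 | 30 | 0.0502 | 0.0324 | 0.0494 | 0.0495 |
| 60 | 30 | 0.0498 | 0.0296 | 0.0502 | 0.0499 |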
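The two thresholds discussed above (0.05036 for statistical significance in one million simulations, 0.052 as the inflation Potvin et al. considered negligible) can be applied mechanically to any row of the table. A minimal sketch; the function `classify` and its labels are made up for illustration, and the example row is the one for n1 12, CV 20%.

```python
SIGNIFICANCE_LIMIT = 0.05036  # significantly > 0.05 in 10^6 simulations (from the post)
ACCEPTANCE_LIMIT = 0.052      # 4% relative inflation, considered negligible beforehand


def classify(alpha_emp: float) -> str:
    """Label an empiric alpha by the two thresholds (hypothetical labels)."""
    if alpha_emp > ACCEPTANCE_LIMIT:
        return "exceeds acceptance boundary"
    if alpha_emp > SIGNIFICANCE_LIMIT:
        return "significantly > 0.05, within boundary"
    return "not significantly > 0.05"


# Row n1 = 12, CV = 20% of the table
row = {"A": 0.0584, "B": 0.0463, "C": 0.0510, "D": 0.0499}
for method, alpha_emp in row.items():
    print(f"Method {method}: {alpha_emp:.4f} -> {classify(alpha_emp)}")
```

Only the first category is the one that matters for the acceptance of a method; the second one is what the bold formatting in the original table drew the eye to.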
It’s also clear from the table that in some scenarios αemp. was substantially below 0.05, indicating that 0.0294 was lower than necessary. Of course there is no problem with the type I risk, but the penalty one has to pay in terms of the sample size is too high. See for example αemp. for n1 12, CVintra 10%: Method C 0.0496, D 0.0498, but B 0.0297… In other words, if you opt for Method B in this scenario you could increase αadj. and still maintain αemp. ≤0.05. For αadj. 0.045 (!), Method B, and 10⁶ simulations I got αemp. 0.04501 and 1–βemp. 98.69%. In this case (only ~1% of studies went to stage 2) the penalty in Method B is too high. But see Potvin’s discussion:
“This study did not seek to find the best possible two-stage design, but rather to find good ones that could be used by sponsors without further validation.”
Not sure what will happen if you come up with a 1–2αadj. = 91% confidence interval, though it should be acceptable according to the GL: “[…] (with the confidence intervals accordingly using an adjusted coverage probability which will be higher than 90%)”. “Coverage probability”? [1] Some Crypto-Bayesians [2,3,4] at the EMA?

Some European authorities seem to accept Method B (they consider Method C problematic) but want to see simulations, at least if your planned n1 and anticipated CVintra are not given in the tables. Note that Potvin et al. simulated the CV range of 20–30% in 1% increments (not given in the tables, but stated in the discussion section); αemp. did not exceed 0.051 in any case.
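For reference, the confidence levels mentioned in this post follow directly from the adjusted α via 100(1–2α). A trivial sketch (the function name `ci_level` is made up):

```python
def ci_level(alpha_adj: float) -> float:
    """Two-sided confidence level (in %) matching a one-sided adjusted alpha."""
    return 100 * (1 - 2 * alpha_adj)


# Unadjusted alpha, Pocock's alpha used by Potvin et al., and the 0.045 tried above
for alpha in (0.05, 0.0294, 0.045):
    print(f"alpha {alpha:.4f} -> {ci_level(alpha):.2f}% CI")
```

So α 0.0294 gives the 94.12% CI and α 0.045 the 91% CI; any adjusted α below 0.05 leads to a level higher than 90%, as the GL wording requires.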
So I think these are the options:

I. It seems that Method C is not acceptable to some European authorities. Not sure whether I should even list it as an option here; I have to adjust my future presentations on the topic.
   Note: Method C is explicitly preferred by the FDA and in Canada.

II. If n1 / CVintra are exactly covered in the papers (80% power; θ 0.95: Potvin, θ 0.90: Montague), go with Method B (Potvin) or D (Montague). Even then I suggest performing a posteriori simulations covering your actual n1 and CVintra. Recently I was asked for simulations where the study almost matched the tables (n1 49 and CVintra 30.65%; tables give n1 48, CVintra 30%). Guess the outcome…

III. If you are unsure, seek scientific advice. Most likely you would end up with a statement similar to the MEB’s, suggesting the “rumour method” stated by Detlew. Might [sic] work, but IMHO here simulations are mandatory because you are boldly going “Where No One Has Gone Before”.

IV. If you want to leave the framework of the PQRI [5] as in your original post (90% power) – as ElMaestro already pointed out – simulations are mandatory. III. above also applies.

V. If you are tempted to go with a full adaptive design [6] (i.e., re-estimate the sample size for stage 2 not with a fixed θ but with the one observed in stage 1), beware! In my simulations (well, another method…) I saw extremely skewed distributions of sample sizes. This is not surprising because we have to deal with two random variables now. If by chance the observed θ is far away from 1 together with a high CV, the penalty might be substantial. For Potvin’s Example 2 the second stage would need 12 subjects instead of 8.
   I would follow this concept only if the first stage is sufficiently large in order to get a reliable estimate of θ. No space for gambling here.
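To get a feeling for why a sample size re-estimation based on the observed θ can explode, here is a rough sketch. Everything in it is an assumption for illustration: it uses the textbook normal approximation for TOST in a 2×2 crossover (not the exact power calculation of Potvin et al., and it will generally give somewhat smaller numbers), the function name `approx_total_n` is made up, and the CV and θ values are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def approx_total_n(cv: float, theta: float, alpha: float = 0.05,
                   power: float = 0.80, upper: float = 1.25) -> int:
    """Rough total sample size for TOST in a 2x2 crossover.

    Normal approximation only (an assumption, not the method of the papers).
    cv and theta are given as ratios, e.g. 0.30 and 0.95.
    """
    if not (1 / upper < theta < upper):
        raise ValueError("theta must lie within the acceptance range")
    beta = 1 - power
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # For theta = 1 the type II error is split between both one-sided tests
    z_beta = NormalDist().inv_cdf(1 - beta / 2 if theta == 1 else 1 - beta)
    s2w = log(cv**2 + 1)                   # within-subject variance, log scale
    margin = log(upper) - abs(log(theta))  # distance to the nearer limit
    n = 2 * s2w * (z_alpha + z_beta) ** 2 / margin**2
    return 2 * ceil(n / 2)                 # round up to an even total


# Hypothetical scenario: CV 30%, alpha 0.0294, theta drifting away from 1
for theta in (0.95, 0.90, 0.85):
    print(f"theta {theta:.2f}: n ~ {approx_total_n(0.30, theta, alpha=0.0294)}")
```

The margin enters squared in the denominator, so an observed θ that lands close to an acceptance limit by chance inflates the re-estimated sample size very quickly. That is the mechanism behind the skewed distributions mentioned in V.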
BTW, interesting topic!
1. WP tells me: “The coverage probability of a confidence interval is the proportion of the time that the interval contains the true value of interest.” I see!
2. Selwyn MR, Dempster AP, Hall NR. A Bayesian approach to bioequivalence for the 2 × 2 changeover design. Biometrics. 1981;37(1):11–21.
3. Fluehler H, Grieve AP, Mandallaz D, Mau J, Moser HA. Bayesian Approach to Bioequivalence Assessment: An Example. J Pharm Sci. 1983;72(10):1178–81. doi:10.1002/jps.2600721018
4. Selwyn MR, Hall NR. On Bayesian Methods for Bioequivalence. Biometrics. 1984;40(4):1103–8.
5. Product Quality Research Institute, a non-profit organisation established in 1999. Members are, amongst others, the US Food and Drug Administration / Center for Drug Evaluation and Research (FDA/CDER), Health Canada, the US Pharmacopeial Convention (USP), the American Association of Pharmaceutical Scientists (AAPS), the Consumer Healthcare Products Association (CHPA), and the Pharmaceutical Research and Manufacturers of America (PhRMA). PQRI set up a working group on sequential designs in 2003 and supported the simulations.
6. Fuglsang A. Controlling type I errors for two-stage bioequivalence study designs. Clin Res Reg Aff. 2011;28(4):100–5. doi:10.3109/10601333.2011.631547
Helmut Schütz
Complete thread:
- two-stage design power 90% second stage Yvonne 2012-08-02 09:50
- two-stage design power 90% second stage ElMaestro 2012-08-02 10:17
- two-stage design power 90% second stage Yvonne 2012-08-02 11:47
- two-stage design power 90% second stage ElMaestro 2012-08-02 11:54
- two-stage design power 90% in sample size adaption d_labes 2012-08-02 10:33
- two-stage design power 90% in sample size adaption ElMaestro 2012-08-02 11:29
- two-stage design power 90% in sample size adaption Yvonne 2012-08-02 12:06
- two-stage design power 90% in sample size adaption ElMaestro 2012-08-02 16:59
- Are simulations sufficient? …lengthy post! Helmut 2012-08-02 17:39
- adjusted alpha = 0.045 d_labes 2012-08-03 11:33
- adjusted alpha = 0.038 Helmut 2012-08-03 14:26
- adjusted alpha by sims? d_labes 2012-08-03 15:48
- Yeah but, no but, yeah but, no but… Helmut 2012-08-03 16:39
- Yeah but, no but ... d_labes 2012-08-07 14:42
- Yeah but, no but ... Helmut 2012-08-07 15:14