Advanced example (futility for GMR) [Two-Stage / GS Designs]
Dear all,
I want to share a recent example with you. Endogenous drug, baseline with circadian rhythm. Borderline highly variable for both Cmax and AUC. Previous studies were contradictory, with reported CVs of ~20–40%. Since scaling of AUC is not possible for a European submission and the CV was unclear, we considered various TSDs (“type 1”: T/R 0.95, target power 80%, with optimized adjusted α for n1 16–66 and CV 15–60%). We also explored the impact of various futility rules. Left panels: empiric type I error; right panels: % power (lower surface for the first stage, upper one overall).
I. α 0.0302 (93.96% CI), no futility criterion
II. α 0.0306 (93.88% CI), futility criterion ]0.8000–1.2500[
III. α 0.0313 (93.74% CI), futility criterion ]0.8250–1.2121[
IV. α 0.0327 (93.46% CI), futility criterion ]0.8500–1.1765[
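The confidence levels and upper futility limits listed above follow from simple arithmetic: the CI level is 100(1 − 2α) and the upper limit is the reciprocal of the lower one. A minimal Python sketch to check this (the function names and the `designs` table are only illustrative and not taken from any package):

```python
# Illustrative helpers (hypothetical names), using only the values listed above.

def ci_level_percent(alpha):
    """Two-sided confidence level in percent for an adjusted alpha: 100(1 - 2*alpha)."""
    return 100.0 * (1.0 - 2.0 * alpha)

def upper_limit(lower):
    """Upper futility limit as the reciprocal of the lower limit."""
    return 1.0 / lower

# (label, adjusted alpha, lower futility limit or None if no futility rule)
designs = [
    ("I", 0.0302, None),
    ("II", 0.0306, 0.8000),
    ("III", 0.0313, 0.8250),
    ("IV", 0.0327, 0.8500),
]

for label, alpha, lower in designs:
    line = f"{label}: alpha {alpha:.4f}, {ci_level_percent(alpha):.2f}% CI"
    if lower is not None:
        line += f", futility ]{lower:.4f}-{upper_limit(lower):.4f}["
    print(line)
```

Note that a larger adjusted α gives a lower confidence level, i.e., a narrower interval.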
TIEmax-values are 0.05003 (I), 0.04995 (II), 0.05000 (III), and 0.05005 (IV).
TIE-surfaces show the shape common to “type 1” TSDs. With futility rules the methods become increasingly conservative at the extreme combinations of n1/CV.
Trivial observation: Increasingly restrictive futility rules prevent more studies from proceeding to the second stage. Therefore, we need less adjustment of α.
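To get a feeling for how often a futility rule stops a study, here is a rough Python sketch. It is not the simulation behind the figures. Assumptions (mine): a balanced 2×2 crossover, true GMR 0.95, the rule applied to the stage 1 point estimate, and a normal approximation with known variance for the log point estimate. It ignores the interim BE and power decisions of the actual TSD framework. The function names are hypothetical.

```python
from math import erf, log, sqrt

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_futility_stop(n1, cv, lower, gmr=0.95):
    """Approximate probability that the stage 1 point estimate falls
    outside ]lower, 1/lower[ (assumed: balanced 2x2 crossover,
    normal approximation, known within-subject variance)."""
    s2w = log(cv ** 2 + 1.0)      # within-subject variance on the log scale
    se = sqrt(2.0 * s2w / n1)     # standard error of the log point estimate
    mu = log(gmr)                 # true log ratio
    p_below = norm_cdf((log(lower) - mu) / se)
    p_above = 1.0 - norm_cdf((log(1.0 / lower) - mu) / se)
    return p_below + p_above

# Compare the three futility ranges for a small first stage and a high CV
for lower in (0.8000, 0.8250, 0.8500):
    p = prob_futility_stop(n1=16, cv=0.60, lower=lower)
    print(f"]{lower:.4f}-{1.0 / lower:.4f}[: stop probability ~{100.0 * p:.1f}%")
```

Under these assumptions the stop probability grows as the range narrows and shrinks as n1 increases, which is the direction described above.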
Hey, narrower CIs! Are we more likely to pass and/or pay a smaller sample size penalty? On the contrary. Anders [1] already showed last year that futility rules on the total sample size may substantially deteriorate power. For an example with the fully adaptive methods proposed by Karalis/Macheras see the recent review [2]. We get a similar effect for the GMR here. In the right panels, white lines show the intersection with the plane of 80% power. With increasingly restrictive futility rules, first-stage power changes only a little, but the overall surface is lower and tilted down.
- Without a futility rule, overall power is generally ≥80%, unless the first stage was small and one is hit by a high CV. The 80% border runs almost linearly from n1 16 / CV 33% to n1 46 / CV 60%. However, even for n1 16 and CV 60% power is still 75.2%. For such a stupidly low n1 one deserves to be punished by an average total sample size of 159… Don’t go there.
- A futility criterion of ]0.8000–1.2500[ (as proposed by Peter Armitage) is still suitable, if the sample size is not too small for the “best guess” CV.
- With ]0.8250–1.2121[ we enter slippery ground. For moderate sample sizes it may be difficult to maintain the desired power.
- A futility criterion of ]0.8500–1.1765[ (proposed by Charles Bon) only makes sense if one is willing to start with a relatively large sample size and is confident that the CV will not be too high…

1. Fuglsang A. Futility rules in bioequivalence trials with sequential designs. AAPS J. 2014;16(1):79–82. doi:10.1208/s12248-013-9540-0
2. Schütz H. Two-stage designs in bioequivalence trials. Eur J Clin Pharmacol. 2015;71(3):271–81. doi:10.1007/s00228-015-1806-2
—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
![[image]](https://static.bebac.at/pics/Blue_and_yellow_ribbon_UA.png)
Helmut Schütz
![[image]](https://static.bebac.at/img/CC by.png)
The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Complete thread:
- Advanced example (futility for GMR) Helmut 2015-02-27 02:03
- Advanced example (futility for GMR) d_labes 2015-02-27 09:55
- Advanced example (futility for GMR) Helmut 2015-02-27 10:43