## Forget simulation-based TSDs for 2×2×2 in Europe [Two-Stage / GS Designs]

Hi d_stat,

» I have a question regarding the analysis of two-stage Potvin C data, i.e., with regard to the stage*formulation interaction that is mentioned here:
»
» Are you aware of any literature or guidance that would suggest that poolability of stages still applies even in case of significant formulation*stage interaction?

I know of only one[1] stating:

A term for the stage should be included in the ANOVA model. However, the guideline does not clarify what the consequence should be if it is statistically significant. In principle, the data sets of both stages could not be combined.
Although the guideline is not explicit, even if the final sample size is going to be decided based on the intra-subject variability estimated in the interim analysis, a proposal for a final sample size must be included in the protocol so that a significant number of subjects (e.g., 12) is added to the interim sample size to avoid looking twice at almost identical samples. This proposed final sample size should be recruited even if the estimation obtained from the interim analysis is lower than the one pre-defined in the protocol in order to maintain the consumer risk.

This statement led to heated debates and to a compromise in the Q&A document,[2] which correctly states:

A model which also includes a term for a formulation*stage interaction would give equal weight to the two stages, even if the number of subjects in each stage is very different. The results can be very misleading hence such a model is not considered acceptable. Furthermore, this model assumes that the formulation effect is truly different in each stage. If such an assumption were true there is no single formulation effect that can be applied to the general population, and the estimate from the study has no real meaning.
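The weighting argument in the quote above can be illustrated with a minimal Python sketch. All numbers are made up for illustration, and the function names are hypothetical. It assumes balanced sequences within each stage, so that the estimate from a model with a stage term (but no interaction) reduces to the subject-weighted mean of the stage-wise log differences, whereas the marginal estimate from a model with the formulation*stage interaction averages the stages with equal weight.

```python
from math import exp


def pooled_effect(diffs, sizes):
    """Formulation effect from a model with a stage term but no interaction.

    Assumption: balanced sequences within each stage, so the estimate is
    the subject-weighted mean of the stage-wise log(T/R) differences.
    """
    return sum(d * n for d, n in zip(diffs, sizes)) / sum(sizes)


def equal_weight_effect(diffs):
    """Marginal effect when a formulation*stage interaction is fitted:
    the stage-wise differences are averaged with equal weight,
    regardless of how many subjects each stage contains."""
    return sum(diffs) / len(diffs)


# Hypothetical data: stage-wise log(T/R) differences and subjects per stage
diffs = [0.10, -0.02]
sizes = [12, 48]

pooled = pooled_effect(diffs, sizes)
equal = equal_weight_effect(diffs)

print(f"pooled point estimate:       {100 * exp(pooled):.2f}%")
print(f"equal-weight point estimate: {100 * exp(equal):.2f}%")
```

With very different stage sizes, the small first stage pulls the equal-weight estimate towards its own value, which is the distortion the Q&A warns about.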

Furthermore, none [sic] of the published methods contains a sequence(stage) term and a poolability criterion – combining is always allowed, even if a significant difference between stages is observed.
BTW, the EMA’s modification of the model was shown to be irrelevant.[3]

Nowadays, trying ‘Method C’ in Europe is a recipe for disaster. Even ‘Method B’ is risky. For (a bit outdated) background see here and there. If you aim at a 2×2×2 crossover nowadays, opt for the exact method,[4] which controls the Type I Error in the strict sense (without requiring simulations). It has been implemented in the R package Power2Stage since April 2018. I recently faced a deficiency letter from a European agency in which a study (passing BE with ‘Method B’ already in the first stage) was not accepted. It passed BE with the exact method as well…
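The exact method builds on the inverse-normal combination of stage-wise p-values. The following Python sketch shows only that combination step for a single one-sided hypothesis; in the TOST setting it would be applied to each of the two one-sided nulls. The function name and the weight `w` are my own for illustration. This is not the Power2Stage implementation, and it leaves out the maximum combination test, the sample size re-estimation and the adjusted critical levels of the published method.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def inverse_normal_combination(p1, p2, w=0.5):
    """Combine two independent one-sided stage-wise p-values.

    w is the pre-specified weight of stage 1 (hypothetical default 0.5);
    it must be fixed before the study and lie strictly between 0 and 1.
    Returns the combined one-sided p-value.
    """
    if not 0 < w < 1:
        raise ValueError("w must lie strictly between 0 and 1")
    z1 = _STD_NORMAL.inv_cdf(1 - p1)  # stage 1 z-score
    z2 = _STD_NORMAL.inv_cdf(1 - p2)  # stage 2 z-score
    z = sqrt(w) * z1 + sqrt(1 - w) * z2
    return 1 - _STD_NORMAL.cdf(z)


# Hypothetical stage-wise p-values of one of the two one-sided tests
combined = inverse_normal_combination(0.05, 0.05)
print(f"combined p-value: {combined:.4f}")
```

Because the weights are fixed in advance and the stage-wise p-values are independent under the null, the combined statistic is standard normal whatever the second-stage sample size turns out to be. That is why no simulations are needed to show control of the Type I Error.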

1. García-Arieta A, Gordon J. Bioequivalence Requirements in the European Union: Critical Discussion. AAPS J. 2012; 14(4): 738–48. doi:10.1208/s12248-012-9382-1.
2. EMA, CHMP. Questions & Answers: positions on specific questions addressed to the Pharmacokinetics Working Party (PKWP). London. 19 November 2015. EMA/618604/2008 Rev. 13.
3. Karalis V, Macheras P. On the Statistical Model of the Two-Stage Designs in Bioequivalence Assessment. J Pharm Pharmacol. 2014; 66(1): 48–52. doi:10.1111/jphp.12164.
4. Maurer W, Jones B, Chen Y. Controlling the type 1 error rate in two-stage sequential designs when testing for average bioequivalence. Stat Med. 2018;1–21. doi:10.1002/sim.7614.

Dif-tor heh smusma 🖖
Helmut Schütz
