## But what is the real problem? [Two-Stage / GS Designs]

» We have a drug and do not know its CV. We take an arbitrary sample …

Well, I always try to make an educated guess of the CV. If you aim too low in the first stage, the sample size penalty in the second stage will be larger.

Example: “Guesstimate” CV 25%, Potvin B, n₁ 12 or 24.

    library(Power2Stage)
    power.tsd(CV=0.25, n1=12)

    TSD with 2x2 crossover
    Method B: alpha (s1/s2) = 0.0294 0.0294
    Target power in power monitoring and sample size est. = 0.8
    Power calculation via non-central t approx.
    CV1 and GMR = 0.95 in sample size est. used
    No futility criterion
    BE acceptance range = 0.8 ... 1.25

    CV = 0.25; n(stage 1) = 12; GMR = 0.95

    1e+05 sims at theta0 = 0.95 (p(BE) = 'power').
    p(BE)    = 0.81438
    p(BE) s1 = 0.17895
    Studies in stage 2 = 81.38%

    Distribution of n(total)
    - mean (range) = 32.4 (12 ... 126)
    - percentiles
     5% 50% 95%
     12  32  60

    power.tsd(CV=0.25, n1=24)

    TSD with 2x2 crossover
    Method B: alpha (s1/s2) = 0.0294 0.0294
    Target power in power monitoring and sample size est. = 0.8
    Power calculation via non-central t approx.
    CV1 and GMR = 0.95 in sample size est. used
    No futility criterion
    BE acceptance range = 0.8 ... 1.25

    CV = 0.25; n(stage 1) = 24; GMR = 0.95

    1e+05 sims at theta0 = 0.95 (p(BE) = 'power').
    p(BE)    = 0.84244
    p(BE) s1 = 0.63203
    Studies in stage 2 = 33.56%

    Distribution of n(total)
    - mean (range) = 29 (24 ... 86)
    - percentiles
     5% 50% 95%
     24  24  48

With 12 subjects in the first stage you will have, on average, a total sample size of 32.4 (median 32); with 24 it is only 29 (median 24). In the latter case you already have a 63% chance to show BE in the first stage, in the former only 18%.
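If you want to look at more than two stage 1 sizes, the calls can be wrapped in a loop. A minimal sketch, assuming the `Power2Stage` package is installed and that the object returned by `power.tsd()` carries the components `pBE`, `pBE_s1`, `pct_s2`, and `nmean`. `compare_n1` is a hypothetical helper name, and 1e4 simulations are used only to keep it fast, so the numbers will differ a little from the 1e5 runs shown here.

```r
# Hypothetical helper: tabulate operating characteristics of Potvin B
# for several stage 1 sample sizes at a guessed CV.
compare_n1 <- function(n1s, CV = 0.25, nsims = 1e4) {
  rows <- lapply(n1s, function(n1) {
    x <- Power2Stage::power.tsd(method = "B", CV = CV, n1 = n1, nsims = nsims)
    data.frame(n1         = n1,
               power      = x$pBE,     # overall p(BE)
               power.s1   = x$pBE_s1,  # p(BE) already in stage 1
               pct.stage2 = x$pct_s2,  # % of studies proceeding to stage 2
               n.mean     = x$nmean)   # average total sample size
  })
  do.call(rbind, rows)
}

# Run only if the package is available.
if (requireNamespace("Power2Stage", quietly = TRUE)) {
  print(compare_n1(c(12, 16, 20, 24)))
}
```

The table makes the trade-off visible at a glance: a larger n₁ costs more upfront but sends fewer studies into the second stage.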

» … and then calculate the CV and real GMR in the first stage.

Yep. But in ‘Type 1’ TSDs you generally *ignore* the observed GMR and work with a *fixed* (assumed) T/R-ratio.
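To illustrate what “fixed” means at the interim: only the CV observed in stage 1 enters the power and sample size calculation, whereas the T/R-ratio stays at the assumed 0.95. The sketch below is base R with the shifted central *t* approximation, not the non-central *t* approximation used by `Power2Stage`, so expect small differences. `tost_power` and `reestimate_n` are hypothetical helper names; α 0.0294 is the one of Potvin B.

```r
# Approximate TOST power in a 2x2 crossover (shifted central t).
# n is the total sample size, GMR the assumed (fixed) T/R-ratio.
tost_power <- function(CV, n, GMR = 0.95, alpha = 0.0294,
                       theta1 = 0.80, theta2 = 1.25) {
  df   <- n - 2
  se   <- sqrt(2 * log(CV^2 + 1) / n)  # SE of the difference of log-means
  tval <- qt(1 - alpha, df)
  d1   <- (log(GMR) - log(theta1)) / se
  d2   <- (log(GMR) - log(theta2)) / se
  max(0, pt(-tval - d2, df) - pt(tval - d1, df))
}

# Smallest even total n reaching the target power for the observed CV.
reestimate_n <- function(CV1, n1, target = 0.80, ...) {
  n <- n1
  while (tost_power(CV1, n, ...) < target) n <- n + 2
  n
}

CV1 <- 0.25; n1 <- 12          # stage 1 result: only the CV is used
tost_power(CV1, n1)            # interim power with the fixed GMR 0.95
reestimate_n(CV1, n1) - n1     # subjects to recruit for stage 2
```

Passing `GMR = ...` through `reestimate_n()` would turn this into the fully adaptive variant, which is exactly the part that can backfire when the observed GMR is poor.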

» In the second stage, we, if I understand correctly, should calculate post-hoc power and recalculate the sample size with the data (GMR and CV) obtained in the first stage, if bioequivalence has not been achieved in the first stage.

Nope. You calculate *interim* power after the first stage. If you want to use the GMR of the first stage as well (go fully adaptive) you might shoot yourself in the foot. Practically you need *two* futility criteria:

- Stop if the GMR is outside [0.80, 1.25].

- Stop if the re-estimated sample size is above a pre-specified limit (U).

… n₁ 24 but not with n₁ 12 (power only ~73%). There are alternatives where you don’t *stop* if n₁+n₂ > U but perform the second stage in U–n₁ subjects. No problem with the Type I Error but might compromise power; I suggest simulations.
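The two futility rules and the “capped” alternative boil down to a small decision function. This is only a sketch of the logic, not part of any package: `interim_decision`, its arguments, and the returned labels are hypothetical. `n_total` is assumed to come from whatever sample size re-estimation the chosen framework prescribes, and in the capped variant the second stage is assumed to be run in U − n1 subjects so that the total never exceeds U.

```r
# Hypothetical decision helper for the interim (BE not shown in stage 1).
# GMR1:    point estimate observed in stage 1
# n1:      stage 1 sample size
# n_total: re-estimated total sample size (n1 + n2)
# U:       pre-specified upper limit of the total sample size
# cap:     FALSE = stop if n_total > U; TRUE = continue with U - n1 subjects
interim_decision <- function(GMR1, n1, n_total, U, cap = FALSE) {
  if (GMR1 < 0.80 || GMR1 > 1.25) {
    return(list(action = "stop (futility: GMR)", n2 = 0))
  }
  if (n_total > U) {
    if (!cap) return(list(action = "stop (futility: U)", n2 = 0))
    return(list(action = "stage 2 (capped)", n2 = U - n1))
  }
  list(action = "stage 2", n2 = n_total - n1)
}

# Same interim result, evaluated under both variants
interim_decision(GMR1 = 0.83, n1 = 24, n_total = 140, U = 120)
interim_decision(GMR1 = 0.83, n1 = 24, n_total = 140, U = 120, cap = TRUE)
```

Whichever variant you choose has to be fixed in the protocol, and its power should be checked by simulations beforehand.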

» But what to do, if we have got bad GMR(0,83), whatever CV and low power (around 30) in the first stage.

You are free to include futility criteria for early stopping in the method. You don’t have to worry about the adjusted α because any futility criterion decreases the patient’s risk.

» In this case, as I understand, we should recruit much more subjects according to our protocol.

In general you should not give a total sample size in the protocol – unless it is part of the framework (simulations recommended: have an eye on power).

If you are courageous try the Inverse-Normal Combination Method / Maximum Combination Test (Maurer *et al.* 2018).

Pro: Proven to preserve the Type I Error (makes regulatory statisticians happy).

Con: Might be the first time they ever have seen sumfink like this. Expect questions.

Example: Like above but two futility criteria: GMR within [0.8, 1.25] and maximum total sample size 120. GMR observed in the first stage used (fully adaptive).

    library(Power2Stage)
    power.tsd.in(CV=0.25, n1=24, usePE=TRUE, fClower=0.8, fCupper=1.25,
                 fCNmax=120, fCrit=c("PE", "Nmax"))

    TSD with 2x2 crossover
    Inverse Normal approach
     - maximum combination test (weights = 0.5 0.25)
     - alpha (s1/s2) = 0.02635 0.02635
     - critical value (s1/s2) = 1.93741 1.93741
     - with conditional error rates and conditional power
    Overall target power = 0.8
    Threshold in power monitoring step for futility = 0.8
    Power calculation via non-central t approx.
    CV1 and PE1 in sample size est. used
    Futility criterion Nmax = 120
    Futility criterion PE outside 0.8 ... 1.25
    Minimum sample size in stage 2 = 4
    BE acceptance range = 0.8 ... 1.25

    CV = 0.25; n(stage 1) = 24; GMR = 0.95

    1e+05 sims at theta0 = 0.95 (p(BE) = 'power').
    p(BE)    = 0.84312
    p(BE) s1 = 0.60784
    Studies in stage 2 = 30.03%

    Distribution of n(total)
    - mean (range) = 31.3 (24 ... 120)
    - percentiles
     5% 50% 95%
     24  24  68
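For those who have not met the method yet: in the inverse-normal approach the stage-wise one-sided *p*-values are converted to *z*-scores and combined with pre-specified weights; the maximum combination test takes the larger of two such combinations. A minimal base-R sketch, assuming the weights 0.5 and 0.25 and the critical value 1.93741 reported in the output. `inv_normal_max` is a hypothetical helper (not a `Power2Stage` function) and the *p*-values are invented for illustration.

```r
# Combine the stage-wise one-sided p-values (p1: stage 1, p2: stage 2)
# of ONE of the two TOST hypotheses; w = weights given to stage 1.
inv_normal_max <- function(p1, p2, w = c(0.5, 0.25)) {
  z <- sqrt(w) * qnorm(1 - p1) + sqrt(1 - w) * qnorm(1 - p2)
  max(z)  # maximum combination test: larger of the two statistics
}

crit <- 1.93741  # critical value as reported in the output

# Invented p-values for the two one-sided tests (lower / upper limit)
z_lower <- inv_normal_max(p1 = 0.060, p2 = 0.040)
z_upper <- inv_normal_max(p1 = 0.030, p2 = 0.020)

# BE only if both one-sided null hypotheses are rejected
BE <- z_lower > crit && z_upper > crit
BE
```

Since the weights are fixed before the study starts, the second stage can be of any size without inflating the Type I Error; that is the “Pro” mentioned above.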

» The questions, how can we avoid slipping into "forced bioequivalence"? Or should we go straight away and recruit this large number of subjects? What should be put into the protocol?

As ElMaestro wrote above, I don’t see how you could run into “forced BE”.

- No defined total sample size. You perform the second stage in the re-estimated sample size.

- If at the end the CV is lower and/or the GMR “better” than assumed, be happy. *Post hoc* power is irrelevant. Open a bottle of champagne.

Cheers,

Helmut Schütz


### Complete thread:

- Two-stage design and 'forced bioequivalence' Mikalai 2018-06-06 08:28
- Two-stage design and 'forced bioequivalence' ElMaestro 2018-06-06 10:53
- Two-stage design and 'forced bioequivalence' Yura 2018-06-07 10:24
- But what is the real problem? ElMaestro 2018-06-07 13:53
- But what is the real problem? Yura 2018-06-07 14:59
- But what is the real problem? Mikalai 2018-06-07 15:47
- But what is the real problem? Helmut 2018-06-07 17:33
- But what is the real problem? Mikalai 2018-06-08 12:24
- U as a futility criterion Helmut 2018-06-08 14:00
