## On the contrary, my dear Dr. Watson! [Two-Stage / GS Designs]

Hi ElMaestro,

Yesterday I sent a clarification ~~e-mail~~ rant to Thomas Lang (AGES, member of the BSWP). I don’t expect to get a reply. [Edit: One month later. Expectation realized.]

❝ I remember having heard EU regulators mention preference for method C out of consideration for the type I error.

What? Where?

❝ But I can't seem to find a presentation from anyone saying so.

Would surprise me if there is any.

❝ Do you […] have a link or a presentation by a regulator where this was stated?

Nope. The collaborative work about the type I error was removed from the work plan last year (Paola Coppola’s presentation at BioBridges 2018):

No work plans were published this year for either party due to Brexit. However, there is an unequivocal preference for methods demonstrating *analytically* strict control of the TIE.^{1,2,3} In my experience European regulatory statisticians *hate* simulation-based methods.

On Wednesday’s workshop I endured a frustrating chat with a statistician of the Austrian agency AGES. Collection of errors and misconceptions:

- Simulation-based methods ‘Type 2’ (*e.g.*, Potvin C) lead to an inflated TIE.

Wrong. Even with the original adjusted α 0.0294 the TIE is inflated only within n_{1} 12–16 and CV 16–26%. That could easily be counteracted by a more conservative α 0.0282.

- Kieser and Rauch^{4} showed that 0.0294 is not correct.

Wrong. The authors didn’t *show* anything (in the sense of a proof) but *lamented* that 0.0294 is Pocock’s adjusted α for a two-sided superiority test, whereas for equivalence (one-sided tests) the correct one is 0.0304. Right, though both are for a group-sequential design with a fixed sample size and one interim at exactly ½N.^{a} That’s not what we have in a TSD with sample size re-estimation in the interim. When you inspect the electronic supplementary material of the paper – or better, perform simulations with a narrower grid – you will find a slight inflation of the TIE. In TSDs the adjustment depends on the ranges of n_{1} and CV, the fixed GMR, and the target power. *Incidentally* in Method B 0.0294 turns out to be conservative.^{b} That’s the reason why regulators prefer B over C. For an example where Method C was not accepted see there. If you want to go with Method B, you could use an adjusted α 0.0301. Quoting the GL: “… the choice of how much alpha to spend at the interim analysis is at the company’s discretion.”

The NLYW had a PhD in biostatistics and believed [*sic*] that 0.0304 is suitable in all settings. Jesus fucking Christ!

- Simulation-based methods are basically to be rejected in principle, since there are exact methods that control the TIE.

Well roared, lion! Such methods exist for 2×2 crossovers only, and only since our posters.^{1,2} I have strong doubts that – given the rudimentary information – anybody ever successfully used it. The scripts given by Maurer *et al.*^{3} are almost useless. Practically the method couldn’t be applied until we implemented it (THX to Ben!) in `Power2Stage`. In other words, it is wishful thinking that someone could have used the method before April 2018.

An analogous version for repeated confidence intervals in parallel designs doesn’t exist at all. I don’t know anybody working on it. Not trivial for unequal group sizes and/or variances. Reply: “Doesn’t matter because parallel designs are rarely used in BE.” Wake up, girlie!
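Where the two adjusted levels discussed above come from can be sketched in a few lines of base R. This is only an illustration under the assumptions of the classical group-sequential setting (normally distributed test statistics, fixed total sample size, one interim at exactly ½N), *i.e.*, exactly the setting that does not hold in a TSD with sample size re-estimation. The function names `overall_alpha()` and `pocock_alpha()` are made up for this sketch.

```r
# Probability that a one-sided two-look design with a common critical value
# 'crit' rejects at least once under the null (interim at exactly N/2).
# Z1 ~ N(0,1); Z2 = (Z1 + Y) / sqrt(2) with Y ~ N(0,1) independent of Z1,
# hence Z2 < crit  <=>  Y < crit * sqrt(2) - Z1.
overall_alpha <- function(crit) {
  crit <- unname(crit)
  no_reject <- integrate(function(z) dnorm(z) * pnorm(crit * sqrt(2) - z),
                         lower = -Inf, upper = crit, rel.tol = 1e-8)$value
  1 - no_reject
}

# Find the common critical value for a given overall one-sided level and
# return it together with the nominal (adjusted) one-sided alpha per look.
pocock_alpha <- function(alpha) {
  crit <- uniroot(function(x) overall_alpha(x) - alpha,
                  interval = c(1, 4), tol = 1e-8)$root
  c(crit = crit, adj = pnorm(crit, lower.tail = FALSE))
}

one_sided_05  <- pocock_alpha(0.05)   # overall one-sided 0.05
one_sided_025 <- pocock_alpha(0.025)  # overall one-sided 0.025
round(one_sided_05, 4)
# Doubling the nominal level of the 0.025 case gives the two-sided value.
round(c(one_sided_025["crit"],
        two_sided = 2 * unname(one_sided_025["adj"])), 4)
```

The first call should give an adjusted level close to 0.0304 (overall one-sided 0.05) and the second one close to 0.0294 (overall two-sided 0.05). Neither number knows anything about n_{1}, CV, GMR, or target power, which is why they cannot be expected to fit every TSD.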


1. König F, Wolfsegger M, Jaki T, Schütz H, Wassmer G. *Adaptive two-stage bioequivalence trials with early stopping and sample size re-estimation.* 2014. doi:10.13140/RG.2.1.5190.0967.

2. König F, Wolfsegger M, Jaki T, Schütz H, Wassmer G. *Adaptive two-stage bioequivalence trials with early stopping and sample size re-estimation.* Trials. 2015; 16(Suppl 2): P218. doi:10.1186/1745-6215-16-S2-P218.

3. Maurer W, Jones B, Chen Y. *Controlling the type 1 error rate in two-stage sequential designs when testing for average bioequivalence.* Stat Med. 2018; 37(10): 1587–1607. doi:10.1002/sim.7614.

4. Kieser M, Rauch G. *Two-stage designs for cross-over bioequivalence trials.* Stat Med. 2015; 34(16): 2403–16. doi:10.1002/sim.6487.

a. An all too often overlooked detail: If the interim is at <½N (due to dropouts) one has to use an error-spending function (*e.g.*, Lan and DeMets, Jennison and Turnbull) to control the TIE.

b. One million simulations of a narrow grid (step size 2); TIE_{max} at n_{1} 12 and CV 24%. Approximations by the shifted central *t* and the noncentral *t*, exact by Owen’s Q. Go for a cup of coffee. The exact method is very slow.

```r
library(Power2Stage)
method <- "B"
alpha  <- rep(0.0294, 2)
GMR    <- 0.95
target <- 0.80
n1     <- 12   # location of the
CV     <- 0.24 # maximum empiric TIE
power  <- c("shifted", "nct", "exact")
res    <- data.frame(method = method, alpha = alpha[1],
                     GMR = GMR,
                     target = sprintf("%.0f%%", 100 * target),
                     power = power)
for (j in 1:nrow(res)) {
  start        <- proc.time()[[3]]
  # theta0 = 1.25: simulate at the upper BE limit, so pBE is the empiric TIE
  res$TIE[j]   <- power.tsd(method = method, alpha = alpha,
                            n1 = n1, CV = CV, GMR = GMR,
                            targetpower = target,
                            theta0 = 1.25,
                            pmethod = power[j],
                            nsims = 1e6)$pBE
  res$speed[j] <- proc.time()[[3]] - start
  cat(sprintf("%7s: %6.2f", power[j], res$speed[j]),
      "seconds\n")
} # patience, please!
# runtime relative to the shifted central t approximation
res$speed <- signif(res$speed / res$speed[1], 3)
print(res, row.names = FALSE)
```

```
 method  alpha  GMR target   power      TIE speed
      B 0.0294 0.95    80% shifted 0.048959  1.00
      B 0.0294 0.95    80%     nct 0.048762  1.57
      B 0.0294 0.95    80%   exact 0.048925 41.80
```

With `alpha = rep(0.0301, 2)`:

```
 method  alpha  GMR target   power      TIE speed
      B 0.0301 0.95    80% shifted 0.050004  1.00
      B 0.0301 0.95    80%     nct 0.049790  1.57
      B 0.0301 0.95    80%   exact 0.049786 41.30
```
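Footnote a mentions error-spending functions for an interim that is not at exactly ½N. A minimal base-R sketch of what that means, assuming the Lan-DeMets spending function of the Pocock type, normally distributed test statistics, and a fixed total sample size (so again a group-sequential design, not a TSD with sample size re-estimation). The names `spend_pocock()` and `two_look_alphas()` are hypothetical helpers, not part of `Power2Stage`.

```r
# Lan-DeMets spending function of the Pocock type: cumulative one-sided
# alpha spent at information fraction t (0 < t < 1 for the interim).
spend_pocock <- function(t, alpha = 0.05) alpha * log(1 + (exp(1) - 1) * t)

# Nominal one-sided levels for a single interim at information fraction t.
# Z2 = sqrt(t) * Z1 + sqrt(1 - t) * Y with Y ~ N(0,1) independent of Z1.
two_look_alphas <- function(t, alpha = 0.05) {
  a1 <- spend_pocock(t, alpha)          # alpha spent in the interim
  c1 <- qnorm(a1, lower.tail = FALSE)   # critical value in the interim
  # overall probability to reject under the null for a final critical value c2
  total <- function(c2) {
    cont <- integrate(function(z)
      dnorm(z) * pnorm((c2 - sqrt(t) * z) / sqrt(1 - t)),
      lower = -Inf, upper = c1, rel.tol = 1e-8)$value
    1 - cont
  }
  c2 <- uniroot(function(x) total(x) - alpha,
                interval = c(0, 5), tol = 1e-8)$root
  c(alpha1 = a1, alpha2 = pnorm(c2, lower.tail = FALSE))
}

# interim at 40% and at 50% of the planned information
res_spend <- sapply(c("t=0.4" = 0.4, "t=0.5" = 0.5), two_look_alphas)
round(res_spend, 4)
```

The earlier the interim, the less α is spent in the first stage and the more is left for the final analysis. Other spending functions (*e.g.*, of the O’Brien-Fleming type) would give different splits.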

—

Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮


### Complete thread:

- EU, method C preference ElMaestro 2019-10-11 10:25 [Two-Stage / GS Designs]
- On the contrary, my dear Dr. Watson! Helmut 2019-10-11 11:52
- On the contrary, my dear Dr. Watson! nobody 2019-10-11 17:39
- On the contrary, my dear Dr. Watson! Helmut 2019-10-11 17:47
- On the contrary, my dear Dr. Watson! ElMaestro 2019-10-11 18:01
- On the contrary, my dear Dr. Watson! nobody 2019-10-11 18:08
- On the contrary, my dear Dr. Watson! Helmut 2019-10-11 18:11
- EU: simulation-based methods in agony Helmut 2019-10-19 12:38
- EU: ABEL TIE?? mittyri 2019-10-19 23:24
- Reference-scaling: only simulations possible Helmut 2019-10-20 13:58