ElMaestro
★★★

Denmark,
2019-10-11 12:25
(1889 d 00:05 ago)

Posting: # 20683
Views: 6,792
 

 EU, method C preference [Two-Stage / GS Designs]

Hi all,

I remember having heard EU regulators mention preference for method C out of consideration for the type I error. But I can't seem to find a presentation from anyone saying so. Do any of you experts have a link or a presentation by a regulator where this was stated?

Many thanks.

Pass or fail!
ElMaestro
Helmut
★★★
Vienna, Austria,
2019-10-11 13:52
(1888 d 22:38 ago)

@ ElMaestro
Posting: # 20684
Views: 6,255
 

 On the contrary, my dear Dr. Watson!

Hi ElMaestro,

❝ I remember having heard EU regulators mention preference for method C out of consideration for the type I error.


What? Where?

❝ But I can't seem to find a presentation from anyone saying so.


Would surprise me if there is any.

❝ Do you […] have a link or a presentation by a regulator where this was stated?


Nope. The collaborative work about the type I error was removed from the work plan last year (Paola Coppola’s presentation at BioBridges 2018):

[image]


No work plans published this year for both parties due to Brexit. However, there is an unequivocal preference towards methods demonstrating analytically strict control of the TIE [1,2,3]. In my experience European regulatory statisticians hate simulation-based methods.

On Wednesday’s workshop I endured a frustrating chat with a statistician of the Austrian agency AGES. Collection of errors and misconceptions:
  • Simulation-based methods ‘Type 2’ (e.g., Potvin C) lead to an inflated TIE.
    Wrong. Even with the original adjusted α 0.0294, inflation occurs only within n1 12–16 and CV 16–26%. Could easily be counteracted by a more conservative α 0.0282.
  • Kieser and Rauch [4] showed that 0.0294 is not correct.
    Wrong. The authors didn’t show anything (in the sense of a proof) but lamented that 0.0294 is Pocock’s adjusted α for a superiority test (one-sided), whereas for equivalence the correct one is 0.0304. Right, though both are for a group-sequential design with a fixed sample size and one interim at exactly ½N [a]. That’s not what we have in a TSD with sample size re-estimation in the interim. When you inspect the electronic supplementary material of the paper – or better, perform simulations with a narrower grid – you will find a slight inflation of the TIE. In TSDs the adjustment depends on the ranges of n1 and CV, the fixed GMR, and the target power. Incidentally in Method B 0.0294 turns out to be conservative [b]. That’s the reason why regulators prefer B over C. For an example where Method C was not accepted see there. If you want to go with Method B, you could use an adjusted α 0.0301. Quoting the GL: “… the choice of how much alpha to spend at the interim analysis is at the company’s discretion.”
    The NLYW had a PhD in biostatistics and believed [sic] that 0.0304 is suitable in all settings. Jesus fucking Christ!
  • Simulation-based methods are basically to be rejected in principle, since there are exact methods that control the TIE.
    Well roared, lion! For 2×2 crossovers only since our posters [1,2]. I have strong doubts that – given the rudimentary information – anybody ever successfully used it. The scripts given by Maurer et al. [3] are almost useless. Practically the method couldn’t be applied until we implemented it (THX to Ben!) in Power2Stage. In other words, it is wishful thinking that someone could have used the method before April 2018.
    An analogous version for repeated confidence intervals in parallel designs doesn’t exist at all. I don’t know anybody working on it. Not trivial for unequal group sizes and/or variances. Reply: “Doesn’t matter because parallel designs are rarely used in BE.” Wake up, girlie!
Was like talking to a brick wall or a conversation with your TV set.
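
For readers wondering where 0.0294 and 0.0304 come from, here is a minimal base-R sketch (my own illustration, not code from Kieser and Rauch). It covers only the classical group-sequential case: fixed sample size, one interim at exactly ½N, and two standardized test statistics assumed to be bivariate normal with correlation √½. The helper names `p.within` and `pocock.crit` are made up for this example. It says nothing about TSDs with sample size re-estimation, where the adjustment depends on n1, CV, GMR, and target power.

```r
# Pocock-type nominal alpha for two looks at information fractions 1/2 and 1
# (assumption: bivariate normal test statistics, correlation sqrt(1/2))
rho <- sqrt(0.5)
s   <- sqrt(1 - rho^2)

# P(lo < Z1 < k and lo < Z2 < k), i.e., no rejection at either look
p.within <- function(k, lo) {
  integrate(function(z) {
    dnorm(z) * (pnorm((k - rho * z) / s) - pnorm((lo - rho * z) / s))
  }, lower = lo, upper = k)$value
}

# common critical value for both looks at a given overall alpha
pocock.crit <- function(alpha, two.sided) {
  f <- function(k) {
    lo <- if (two.sided) -k else -Inf
    1 - p.within(k, lo) - alpha
  }
  uniroot(f, interval = c(1, 4), tol = 1e-9)$root
}

k2 <- pocock.crit(0.05, two.sided = TRUE)   # overall two-sided 0.05
k1 <- pocock.crit(0.05, two.sided = FALSE)  # overall one-sided 0.05

alpha.sup <- 2 * (1 - pnorm(k2)) # nominal two-sided level per look
alpha.eq  <- 1 - pnorm(k1)       # nominal one-sided level per look
round(c(two.sided = alpha.sup, one.sided = alpha.eq), 4)
```

The first value should land close to 0.0294 and the second close to 0.0304. Both hold only for an interim at exactly ½N without re-estimation.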

Yesterday I sent a clarification e-mail (read: rant) to Thomas Lang (AGES, member of the BSWP). I don’t expect to get a reply. [Edit: One month later. Expectation realized.]


  1. König F, Wolfsegger M, Jaki T, Schütz H, Wassmer G. Adaptive two-stage bioequivalence trials with early stopping and sample size re-estimation. 2014. doi:10.13140/RG.2.1.5190.0967.
  2. König F, Wolfsegger M, Jaki T, Schütz H, Wassmer G. Adaptive two-stage bioequivalence trials with early stopping and sample size re-estimation. Trials. 2015; 16(Suppl 2): P218. doi:10.1186/1745-6215-16-S2-P218.
  3. Maurer W, Jones B, Chen Y. Controlling the type 1 error rate in two-stage sequential designs when testing for average bioequivalence. Stat Med. 2018; 37(10): 1587–1607. doi:10.1002/sim.7614.
  4. Kieser M, Rauch G. Two-stage designs for cross-over bioequivalence trials. Stat Med. 2015; 34(16): 2403–16. doi:10.1002/sim.6487.

  a. An all too often overlooked detail: If the interim is at <½N (due to dropouts) one has to use an error-spending function (e.g., Lan and DeMets, Jennison and Turnbull) to control the TIE.
  b. One mio simulations of a narrow grid (step size 2); TIEmax at n1 12 and CV 24%. Approximations by the shifted central t and the noncentral t, exact by Owen’s Q. Go for a cup of coffee. The exact method is very slow.
    library(Power2Stage)
    method <- "B"
    alpha  <- rep(0.0294, 2)
    GMR    <- 0.95
    target <- 0.80
    n1     <- 12   # location of the
    CV     <- 0.24 # maximum empiric TIE
    power  <- c("shifted", "nct", "exact")
    res    <- data.frame(method = method, alpha = alpha[1],
                         GMR = GMR,
                         target = sprintf("%.0f%%", 100 * target),
                         power = power,
                         TIE = NA_real_, speed = NA_real_) # filled in the loop
    for (j in 1:nrow(res)) {
      start        <- proc.time()[[3]]
      res$TIE[j]   <- power.tsd(method = method, alpha = alpha,
                                n1 = n1, CV = CV, GMR = GMR,
                                targetpower = target,
                                theta0 = 1.25,
                                pmethod = power[j],
                                nsims = 1e6)$pBE
      res$speed[j] <- proc.time()[[3]] - start
      cat(sprintf("%7s: %6.2f", power[j], res$speed[j]),
          "seconds\n")
    } # patience, please!
    res$speed <- signif(res$speed / res$speed[1], 3)
    print(res, row.names = FALSE)
     method  alpha  GMR target   power      TIE speed
          B 0.0294 0.95    80% shifted 0.048959  1.00
          B 0.0294 0.95    80%     nct 0.048762  1.57
          B 0.0294 0.95    80%   exact 0.048925 41.80


    With alpha = rep(0.0301, 2):
     method  alpha  GMR target   power      TIE speed
          B 0.0301 0.95    80% shifted 0.050004  1.00
          B 0.0301 0.95    80%     nct 0.049790  1.57
          B 0.0301 0.95    80%   exact 0.049786 41.30
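
A side note on the 0.050004 above: with 1 mio simulations the empiric TIE carries simulation error, so a value a hair above 0.05 is not necessarily inflation. A minimal base-R sketch of that check follows. The one-sided 95% binomial limit is my own choice of criterion for illustration, not something prescribed in this thread, and the TIE values are simply copied from the table above.

```r
nsims <- 1e6
alpha <- 0.05
# upper one-sided 95% limit of the empiric TIE if the true TIE is exactly alpha
sig.limit <- qbinom(0.95, size = nsims, prob = alpha) / nsims

# empiric TIEs with alpha = rep(0.0301, 2), copied from the output above
TIE <- c(shifted = 0.050004, nct = 0.049790, exact = 0.049786)
chk <- data.frame(pmethod = names(TIE), TIE = TIE,
                  above.limit = TIE > sig.limit, row.names = NULL)
print(chk, row.names = FALSE)

# exact binomial test of the largest value against alpha
p <- binom.test(round(TIE[["shifted"]] * nsims), nsims, p = alpha,
                alternative = "greater")$p.value
```

Under this criterion none of the three values is significantly above 0.05.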

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
nobody
nothing

2019-10-11 19:39
(1888 d 16:51 ago)

@ Helmut
Posting: # 20685
Views: 6,005
 

 On the contrary, my dear Dr. Watson!

Welcome to the wonderful world of alternative facts eeerh... scientific discussion, I meant.

Just a matter of days until we start discussions on whether it's raining outside or not.

btw. any slide show of this bio19 event for non-participants? :-)

Kindest regards, nobody
Helmut
★★★
Vienna, Austria,
2019-10-11 19:47
(1888 d 16:43 ago)

@ nobody
Posting: # 20686
Views: 6,003
 

 On the contrary, my dear Dr. Watson!

Hi nobody,

❝ Just a matter of days until we start discussions on whether it's raining outside or not.


Did you need an umbrella afterwards?

❝ btw. any slide show of this bio19 event for non-participants? :-)


There will be, once we get all permissions (almost ready). Archive of last year’s presentations here.

ElMaestro
★★★

Denmark,
2019-10-11 20:01
(1888 d 16:29 ago)

@ Helmut
Posting: # 20687
Views: 5,945
 

 On the contrary, my dear Dr. Watson!

Hi both,

❝ They will, once we get all permissions (almost ready). Archive of last year’s presentations here.


Note Paola Coppola's presentation, slide 32. :-D

I had a wtf moment of sorts when she presented that.

Suspended does not mean the parties who suspended it do not think it is important, don't let that detail fool you.

Pass or fail!
ElMaestro
nobody
nothing

2019-10-11 20:08
(1888 d 16:22 ago)

@ ElMaestro
Posting: # 20688
Views: 5,988
 

 On the contrary, my dear Dr. Watson!

...last year stuff I binge-watched one evening last year :-D

Kindest regards, nobody
Helmut
★★★
Vienna, Austria,
2019-10-11 20:11
(1888 d 16:19 ago)

@ ElMaestro
Posting: # 20689
Views: 5,936
 

 On the contrary, my dear Dr. Watson!

Hi ElMaestro,

❝ Suspended does not mean the parties who suspended it do not think it is important, don't let that detail fool you.


Old beliefs die hard.
It was on the work plan since 2015. In order to seriously assess the methods they would have to run their own simulations, which makes them 🤮.
I don’t expect that we will ever see sumfink.

Helmut
★★★
Vienna, Austria,
2019-10-19 14:38
(1880 d 21:52 ago)

@ ElMaestro
Posting: # 20705
Views: 5,987
 

 EU: simulation-based methods in agony

Hi ElMaestro & all,

an update: Yesterday at the 4th Biosimilars Forum a participant asked whether simulation-based methods (control of the type I error in a sufficiently high number of simulations, i.e., 1 mio in each cell of a narrow grid of n1/CV combinations) are acceptable. Andreas Brandt (BfArM and observer at the BSWP) answered “No.” That was also the opinion of Stephan Lehr (Austrian agency AGES). Later Andreas said that such methods might be acceptable if no alternative which shows analytically control of the TIE is available. That means for crossover designs all simulation-based methods are essentially dead – go with the method of Maurer et al. (Ref. #3 of my post above), which is implemented in Power2Stage, instead.

Then I summarized my experiences in three scientific advices about parallel designs. Tricky because (a) exact methods don’t exist and (b) simulations have to cover a wide range of unequal group sizes and unequal variances. My simulations (‘Type 1’ TSD) covered nG1=nG2=124–250 (step size 2), CV 0.24–1.0 (step size 0.02), T/R-ratio 0.90, and target power 80%, with 1 mio sim’s in each of the 2,496 cells. On top of that extreme scenarios with heteroscedasticity (CV-ratios 1:4 to 4:1), each for equal and unequal group sizes (increasing dropout-rates up to ~50% in one group). Overall ~2.82 billion (!) simulations. With an adjusted α 0.0274 the maximum TIE was 0.04987. In the ‘scientific’ advices regulatory statisticians told me that this is not acceptable and claimed that exact methods exist. I asked them for publications but never received an answer.
That was also the opinion of Andreas, Stephan, and Júlia Singer. Sorry folks, you mixed up non-inferiority (where repeated confidence intervals are indeed available) with equivalence (nada). Júlia promised to send me one. IMHO, that would be a big surprise.* Then Andreas said – smiling – that even if nothing is published, such methods might still exist (absence of evidence is not evidence of absence). Splendid, very helpful.
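
The size of the regular grid described above can be checked in a few lines of base R. This is only a sketch of the bookkeeping (the variable names are mine). It does not run any simulations and does not cover the extra heteroscedasticity and dropout scenarios.

```r
nG <- seq(124, 250, by = 2)       # subjects per group (nG1 = nG2)
CV <- seq(0.24, 1.00, by = 0.02)  # total CV
grid  <- expand.grid(nG = nG, CV = CV)
cells <- nrow(grid)               # number of nG/CV combinations
sims.grid <- cells * 1e6          # 1 mio simulations per cell
cat(length(nG), "group sizes x", length(CV), "CVs =", cells, "cells,",
    sims.grid / 1e9, "billion simulations in the regular grid\n")
```

Taking the overall figure of ~2.82 billion at face value, roughly 0.3 billion simulations remain for the extreme scenarios.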

A mathematician is a blind man
in a dark room looking for a black cat
which isn’t there.
   attributed to Charles Darwin



  * She sent me the papers of Anders [1] and Zheng et al. [2] Both preserve the TIE – based on simulations. :-D
    Only the former is for parallel designs. The latter is for asymmetric alphas (0.01, 0.04) in crossovers.

  1. Fuglsang A. Sequential Bioequivalence Approaches for Parallel Designs. AAPS J. 2014;16(3):373–8.
    doi:10.1208/s12248-014-9571-1. Free full text.
  2. Zheng C, Zhao L, Wang J. Modifications of sequential designs in bioequivalence trials. Pharm Stat. 2015;14(3):180–8. doi:10.1002/pst.1672.

mittyri
★★  

Russia,
2019-10-20 01:24
(1880 d 11:06 ago)

@ Helmut
Posting: # 20707
Views: 5,727
 

 EU: ABEL TIE??

Hi Helmut,

❝ an update: Yesterday at the 4th Biosimilars Forum a participant asked whether simulation-based methods (control of the type I error in a sufficiently high number of simulations, i.e., 1 mio in each cell of a narrow grid of n1/CV combinations) are acceptable. Andreas Brandt (BfArM and observer at the BSWP) answered “No.”


sad story
BTW: in your lecture you suggested adjusting the CIs for ABEL due to TIE inflation. Is that also dead?
Does it mean that you cannot prove anything using sims??

Kind regards,
Mittyri
Helmut
★★★
Vienna, Austria,
2019-10-20 15:58
(1879 d 20:32 ago)

@ mittyri
Posting: # 20708
Views: 5,715
 

 Reference-scaling: only simulations possible

Hi mittyri,

❝ sad story


“So sad!” (© Mr. Trump)

❝ BTW: in your lecture you suggested adjusting the CIs for ABEL due to TIE inflation. Is that also dead?

❝ Does it mean that you cannot prove anything using sims??


I hope not. For all reference-scaling methods we already need simulations to estimate the sample size. It would be strange to allow them here but not for the TIE. RSABE/ABEL is tricky anyhow. The model is based on (population) parameters $$-\theta_s\leq\tfrac{\mu_T-\mu_R}{\sigma_{wR}}\leq+\theta_s$$ which are unknown. We have only their estimates. Exactly this misspecification (applying scaling although the drug is not highly variable) leads to the inflated TIE. As I wrote above, Andreas Brandt said that

“[simulation-based] methods might [sic] be acceptable if no alternative which shows analytically control of the TIE is available”.

Clearly the case here. In the current implementation reference-scaling is a framework with two (RSABE) and three (ABEL) decisions. No way to solve that analytically (granted, at least the GMR-restriction could be implemented by setting α 0.5).
Our method [1] follows the ‘spirit’ of the GL, i.e., we assume that \(s_{wR}=\sigma_{wR}\). Molins et al. [2] proposed to assume the worst, i.e., regardless of \(s_{wR}\) adjust α as if \(CV_{wR}=0.30\). Conservative but it has a substantial negative impact on power (esp. for really high variability where an inflated TIE is unlikely). For examples see the RSABE vignette of the working version of the next release of PowerTOST and R-code at the end. See also this article.
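
To make the misspecification tangible, here is a minimal base-R sketch of how the ABEL acceptance limits depend on the estimated CVwR. The regulatory constant 0.760, the switching CV of 30%, and the cap at 50% are the EMA values as I know them. They are not stated in this thread, and the helper name `abel.limits` is made up. The sketch only shows the limits. It does not include the GMR-restriction and does not estimate any TIE.

```r
# ABEL limits as a function of the *estimated* CVwR
# (assumed constants: k = 0.760, switch at CV 30%, cap at CV 50%)
abel.limits <- function(CVwR, k = 0.760, CVswitch = 0.30, CVcap = 0.50) {
  if (CVwR <= CVswitch) {          # no scaling: conventional limits
    return(c(lower = 0.80, upper = 1.25))
  }
  swR <- sqrt(log(min(CVwR, CVcap)^2 + 1)) # capped at CVcap
  c(lower = exp(-k * swR), upper = exp(+k * swR))
}

# a drug with a true CVwR just below 30% that is estimated just above it
# gets wider limits than it deserves
lim <- t(sapply(c(0.29, 0.31, 0.35, 0.50, 0.80), abel.limits))
print(round(data.frame(CVwR = c(0.29, 0.31, 0.35, 0.50, 0.80), lim), 4),
      row.names = FALSE)
```

The decision to scale rests on the estimate, not on the unknown population value, which is exactly where the inflation comes from.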

What is better? Rely on the ad hoc solutions (which are not perfect) or follow the book, ignore the inflation and put the patients in jeopardy?


  1. Labes D, Schütz H. Inflation of Type I Error in the Evaluation of Scaled Average Bioequivalence, and a Method for its Control. Pharm Res. 2016; 33(11): 2805–14. doi:10.1007/s11095-016-2006-1.
  2. Molins E, Cobo E, Ocaña J. Two-Stage Designs Versus European Scaled Average Designs in Bioequivalence Studies for Highly Variable Drugs: Which to Choose? Stat Med. 2017; 36(30): 4777–88. doi:10.1002/sim.7452.

library(PowerTOST)
CV  <- c(0.35, 0.80) # in-/outside region of inflation
des <- "2x2x4"
n   <- pwr <- numeric()
for (j in seq_along(CV)) {
  x      <- sampleN.scABEL(CV = CV[j], design = des,
                           print = FALSE, details = FALSE)
  n[j]   <- x[["Sample size"]]
  pwr[j] <- x[["Achieved power"]]
}
res <- data.frame(CV = rep(CV, each = 2), n = rep(n, each = 2),
                  TIE = NA, power = rep(pwr, each = 2),
                  method = rep(c("Labes and Schütz", "Molins et al."), 2),
                  alpha.adj = NA, TIE.adj = NA, pwr.adj = NA,
                  stringsAsFactors = FALSE)
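for (j in 1:nrow(res)) {
  if (res$method[j] == "Labes and Schütz") {
    # adjust alpha for the CV of interest (assumes s_wR = sigma_wR)
    x <- scABEL.ad(CV = res$CV[j], design = des, n = res$n[j], print = FALSE)
    res$pwr.adj[j] <- x$pwr.adj
  } else {
    # Molins et al.: worst case, adjust alpha as if CV_wR = 0.30 ...
    x <- scABEL.ad(CV = 0.30, design = des, n = res$n[j], print = FALSE)
    # ... but assess power with that alpha at the CV of interest
    res$pwr.adj[j] <- power.scABEL(alpha = x$alpha.adj, CV = res$CV[j],
                                   design = des, n = res$n[j])
  }
  # note: in the Molins branch TIE and TIE.adj refer to CV 0.30
  res$TIE[j]       <- x$TIE.unadj
  res$alpha.adj[j] <- x$alpha.adj
  res$TIE.adj[j]   <- x$TIE.adj
}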
print(res, row.names = FALSE)

  CV  n      TIE   power           method alpha.adj TIE.adj pwr.adj
0.35 34 0.065566 0.81184 Labes and Schütz  0.036299    0.05 0.77281
0.35 34 0.081626 0.81184    Molins et al.  0.028572    0.05 0.74046
0.80 50 0.049600 0.81235 Labes and Schütz        NA      NA      NA
0.80 50 0.082115 0.81235    Molins et al.  0.028201    0.05 0.73198


The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz