ElMaestro
★★★

Belgium?,
2019-10-11 10:25

Posting: # 20683
Views: 1,088

## EU, method C preference [Two-Stage / GS Designs]

Hi all,

I remember having heard EU regulators mention preference for method C out of consideration for the type I error. But I can't seem to find a presentation from anyone saying so. Do you know – does one of you experts have a link or a presentation by a regulator where this was stated?

Many thanks.

I could be wrong, but...
Best regards,
ElMaestro
Helmut
★★★

Vienna, Austria,
2019-10-11 11:52

@ ElMaestro
Posting: # 20684
Views: 1,033

## On the contrary, my dear Dr Watson!

Hi ElMaestro,

» I remember having heard EU regulators mention preference for method C out of consideration for the type I error.

What? Where?

» But I can't seem to find a presentation from anyone saying so.

Would surprise me if there is any.

» Do you […] have a link or a presentation by a regulator where this was stated?

Nope. The collaborative work on the type I error was removed from the work plan last year (see Paola Coppola’s presentation at BioBridges 2018).

No work plans were published this year for either party due to Brexit. However, there is an unequivocal preference for methods which show strict analytical control of the TIE.1,2,3 In my experience European regulatory statisticians hate simulation-based methods.

On Wednesday’s workshop I endured a frustrating chat with a statistician of the Austrian agency AGES. Collection of errors and misconceptions:
• Simulation-based methods ‘type 2’ (e.g., Potvin C) lead to an inflated TIE.
Wrong. Even with the original adjusted α 0.0294 the TIE is inflated only within n1 12–16 and CV 16–26%, and it could easily be counteracted by a more conservative adjusted α 0.0282.
• Kieser and Rauch4 showed that 0.0294 is not correct.
Wrong. The authors didn’t show anything (in the sense of a proof) but lamented that 0.0294 is Pocock’s adjusted α for a one-sided test and that for equivalence the correct one is 0.0304. Right, but both are for a group-sequential design with a fixed sample size and one interim at exactly ½N.a That’s not what we have in a TSD with sample-size re-estimation in the interim. When you inspect the electronic supplementary material of the paper (or better, perform simulations with a narrower grid) you will find a slight inflation of the TIE. In TSDs the adjustment depends on the ranges of n1 and CV, the fixed T/R-ratio, and the desired power. Incidentally, in Method B 0.0294 turns out to be conservative.b That’s the reason why regulators prefer B over C. For an example where Method C was not accepted see there. If you want to go with Method B, you could use an adjusted α 0.0301. Quoting the GL: “… the choice of how much alpha to spend at the interim analysis is at the company’s discretion.”
The NLYW had a PhD in biostatistics and believed [sic] that 0.0304 is suitable in all settings. Jesus fucking Christ!
• Simulation-based methods are basically to be rejected in principle, since there are exact methods that control the TIE.
Well roared, lion! Exact methods exist for 2×2 crossovers only, and only since our posters.1,2 I have strong doubts that – given the rudimentary information – anybody ever successfully used them. The R-code given by Maurer et al.3 is almost useless. Practically, the method couldn’t be applied until we implemented it (THX to Ben!) in Power2Stage. In other words, that someone could have used the method before mid-2018 is wishful thinking.
An analogous version with repeated confidence intervals for parallel designs doesn’t exist at all, and I don’t know of anybody working on it. It is not trivial for unequal group sizes and/or variances. The reply: “Doesn’t matter because parallel designs are rarely used in BE.” Wake up, girlie!
Was like talking to a brick wall.
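Since all of the adjusted α values above are one-sided, they map directly onto the level of the reported confidence interval via 100(1 − 2α)% – e.g., 0.0294 gives the familiar 94.12% CI. A quick Python sketch of that arithmetic (illustration only):

```python
# CI level implied by a one-sided adjusted alpha: 100 * (1 - 2 * alpha) %
for a in (0.05, 0.0304, 0.0301, 0.0294, 0.0282):
    print(f"alpha = {a:.4f} -> {100 * (1 - 2 * a):.2f}% CI")
# the unadjusted 0.05 gives the conventional 90% CI; 0.0294 gives 94.12%
```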

Yesterday I sent a clarification ~~e-mail~~ rant to Thomas Lang (AGES, member of the BSWP). I don’t expect to get a reply.

1. König F, Wolfsegger M, Jaki T, Schütz H, Wassmer G. Adaptive two-stage bioequivalence trials with early stopping and sample size re-estimation. 2014. doi:10.13140/RG.2.1.5190.0967.
2. König F, Wolfsegger M, Jaki T, Schütz H, Wassmer G. Adaptive two-stage bioequivalence trials with early stopping and sample size re-estimation. Trials. 2015; 16(Suppl 2);P218. doi:10.1186/1745-6215-16-S2-P218.
3. Maurer W, Jones B, Chen Y. Controlling the type 1 error rate in two-stage sequential designs when testing for average bioequivalence. Stat Med. 2018; 37(10): 1587–1607. doi:10.1002/sim.7614.
4. Kieser M, Rauch G. Two-stage designs for cross-over bioequivalence trials. Stat Med. 2015; 34(16): 2403–16. doi:10.1002/sim.6487.

a. An all too often overlooked detail: if the interim is at <½N (due to dropouts), one has to use an error-spending function (e.g., Lan and DeMets; Jennison and Turnbull) to control the TIE.
b. One mio simulations over a narrow grid (step size 2); TIEmax at n1 12 and CV 24%. Approximations by the shifted central t and the noncentral t; exact by Owen’s Q. Go for a cup of coffee: the exact method is very slow.
```r
library(Power2Stage)
pmethod <- c("shifted", "nct", "exact")
res     <- data.frame(method = pmethod, TIE = NA, speed = NA)
for (j in seq_along(pmethod)) {
  start        <- proc.time()[[3]]
  res$TIE[j]   <- power.tsd(method = "B", alpha = rep(0.0294, 2),
                            n1 = 12, CV = 0.24, GMR = 0.95,
                            targetpower = 0.80, theta0 = 1.25,
                            pmethod = pmethod[j], nsims = 1e6)$pBE
  res$speed[j] <- proc.time()[[3]] - start
}
res$speed <- signif(res$speed / res$speed[1], 3)
print(res, row.names = FALSE)
```

```
 method      TIE speed
shifted 0.048959  1.00
    nct 0.048762  1.44
  exact 0.048924 28.40
```

With alpha = rep(0.0301, 2):
```
 method      TIE speed
shifted 0.050004  1.00
    nct 0.049790  1.44
  exact 0.049693 28.50
```
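Footnote a refers to error-spending functions in general; as an illustration only (not necessarily the function one would pick in practice), the Lan–DeMets Pocock-type spending function α(t) = α·ln(1 + (e − 1)·t) allots the cumulative type I error as a function of the information fraction t, reaching the full α at t = 1. A minimal Python sketch:

```python
import math

def alpha_spent(alpha, t):
    """Cumulative alpha spent at information fraction t
    (Lan-DeMets Pocock-type error-spending function)."""
    return alpha * math.log(1 + (math.e - 1) * t)

for t in (0.25, 0.50, 0.75, 1.00):
    print(f"t = {t:.2f}: cumulative alpha = {alpha_spent(0.05, t):.5f}")
```

At t = 1 the entire α is spent; an interim at less than ½N simply spends less than a fixed-schedule boundary would assume.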

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
nobody
nothing

2019-10-11 17:39

@ Helmut
Posting: # 20685
Views: 925

## On the contrary, my dear Dr Watson!

Welcome to the wonderful world of alternative facts eeerh... scientific discussion, I meant.

Just a matter of days until we start discussions on whether it’s raining outside or not.

btw. any slide show of this bio19 event for non-participants?

Kindest regards, nobody
Helmut
★★★

Vienna, Austria,
2019-10-11 17:47

@ nobody
Posting: # 20686
Views: 933

## On the contrary, my dear Dr Watson!

Hi nobody,

» Just a matter of days a we start discussions on whether it's raining outside or not.

Did you need an umbrella afterwards?

» btw. any slide show of this bio19 event for non-participants?

They will be posted once we get all permissions (almost ready). Archive of last year’s presentations here.

Cheers,
Helmut Schütz

ElMaestro
★★★

Belgium?,
2019-10-11 18:01

@ Helmut
Posting: # 20687
Views: 920

## On the contrary, my dear Dr Watson!

Hi both,

» They will, once we get all permissions (almost ready). Archive of last year’s presentations here.

Note Paola Coppola's presentation, slide 32

I had a wtf moment of sorts when she presented that.

Suspended does not mean that the parties who suspended it consider it unimportant – don’t let that detail fool you.

nobody
nothing

2019-10-11 18:08

@ ElMaestro
Posting: # 20688
Views: 918

## On the contrary, my dear Dr Watson!

...last year’s stuff, which I binge-watched one evening.

Helmut
★★★

Vienna, Austria,
2019-10-11 18:11

@ ElMaestro
Posting: # 20689
Views: 910

## On the contrary, my dear Dr Watson!

Hi ElMaestro,

» Suspended does not mean the parties who suspended it do not think it is important, don't let that detail fool you.

Old beliefs die hard.
It had been on the work plan since 2015. In order to seriously assess the methods they would have to run their own simulations, which makes them …
I don’t expect that we will ever see sumfink.

Cheers,
Helmut Schütz

Helmut
★★★

Vienna, Austria,
2019-10-19 12:38

@ ElMaestro
Posting: # 20705
Views: 704

## EU: simulation-based methods in agony

Hi ElMaestro & all,

an update: Yesterday at the 4th Biosimilars Forum a participant asked whether simulation-based methods (control of the type I error in a sufficiently high number of simulations, i.e., 1 mio in each cell of a narrow grid of n1/CV combinations) are acceptable. Andreas Brandt (BfArM and observer at the BSWP) answered “No.” That was also the opinion of Stephan Lehr (Austrian agency AGES). Later, Andreas said that such methods might be acceptable if no alternative which shows analytical control of the TIE is available. That means for crossover designs all simulation-based methods are essentially dead – go with Ref.#3 of this post (which is implemented in Power2Stage) instead.
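For the record, the core of the exact approach of Maurer et al.3 is the standard inverse-normal combination of the stage-wise one-sided p-values with pre-specified weights (the full method adds repeated confidence intervals and further machinery, implemented in Power2Stage). A bare-bones Python sketch of just the combination step – an illustration, not the authors’ code:

```python
import math
from statistics import NormalDist

def combined_p(p1, p2, w1=0.5):
    """Inverse-normal combination of two stage-wise one-sided p-values.
    w1 is the weight of stage 1, which must be fixed before the interim."""
    nd = NormalDist()
    z1 = nd.inv_cdf(1 - p1)                        # stage-1 z-statistic
    z2 = nd.inv_cdf(1 - p2)                        # stage-2 z-statistic
    z  = math.sqrt(w1) * z1 + math.sqrt(1 - w1) * z2
    return 1 - nd.cdf(z)                           # combined one-sided p-value

print(combined_p(0.05, 0.05))   # two borderline stages combine to p ~ 0.01
```

Because the weights are fixed in advance, the combined test keeps the TIE regardless of how the stage-2 sample size was re-estimated – that is what makes the control analytical rather than simulation-based.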

Then I summarized my experiences from three scientific advices about parallel designs. Tricky because (a) exact methods don’t exist and (b) simulations have to cover a wide range of unequal group sizes and unequal variances. My simulations (‘Type 1’ TSD) covered nG1=nG2=124–250 (step size 2), CV 0.24–1.0 (step size 0.02), T/R-ratio 0.90, target power 80%, i.e., 1 mio simulations in each of the 2,496 cells. On top of that, extreme scenarios with heteroscedasticity (CV-ratios 1:4 to 4:1), each for equal and unequal group sizes (increasing dropout-rates up to ~50% in one group). Overall ~2.82 billion (!) simulations. With an adjusted α 0.0274 the maximum TIE was 0.04987. In the scientific advices regulatory statisticians told me that this is not acceptable and claimed that exact methods exist. I asked them for publications but never received an answer.
That was also the opinion of Andreas, Stephan, and Júlia Singer. Sorry folks – they mixed up non-inferiority (where repeated confidence intervals are indeed available) with equivalence (nada). Júlia promised to send me one. IMHO, that would be a big surprise.* Then Andreas remarked – smiling – that if nothing is published, such methods might still exist (absence of evidence is not evidence of absence). Splendid, very helpful.

A mathematician is a blind man
in a dark room looking for a black cat
which isn’t there.
attributed to Charles Darwin

• She sent me the papers of Anders1 and Zheng et al.2 Both preserve the TIE – based on simulations.
Only the former for parallel designs. The latter with asymmetric alphas (0.01, 0.04) in crossovers.

1. Fuglsang A. Sequential Bioequivalence Approaches for Parallel Designs. AAPS J. 2014;16(3):373–8.
doi:10.1208/s12248-014-9571-1.
2. Zheng C, Zhao L, Wang J. Modifications of sequential designs in bioequivalence trials. Pharm Stat. 2015;14(3):180–8. doi:10.1002/pst.1672.
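The size of the simulation grid quoted above checks out with simple arithmetic (a quick Python sketch using exactly the values as given in the post):

```python
# grid of scenarios: group sizes 124...250 (step 2), CV 0.24...1.00 (step 0.02)
n_grid  = len(range(124, 251, 2))
cv_grid = len([round(0.24 + 0.02 * k, 2) for k in range(39)])
cells   = n_grid * cv_grid                # 64 * 39 = 2,496 cells
sims    = cells * 10**6                   # 1 mio simulations per cell
print(n_grid, cv_grid, cells, f"{sims:,}")
```

That base grid alone accounts for ~2.5 billion simulations; the extreme heteroscedasticity/dropout scenarios bring the total to the ~2.82 billion mentioned above.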

Cheers,
Helmut Schütz

mittyri
★★

Russia,
2019-10-19 23:24

@ Helmut
Posting: # 20707
Views: 643

## EU: ABEL TIE??

Hi Helmut,

» an update: Yesterday at the 4th Biosimilars Forum a participant asked whether simulation-based methods (control of the type I error in a sufficiently high number of simulations, i.e., 1 mio in each cell of a narrow grid of n1/CV combinations) are acceptable. Andreas Brandt (BfArM and observer at the BSWP) answered “No.”

BTW: in your lecture you suggested to adjust the CIs for ABEL due to TIE inflation. Is that also dead?
Does it mean that you cannot prove anything using sims??

Kind regards,
Mittyri
Helmut
★★★

Vienna, Austria,
2019-10-20 13:58

@ mittyri
Posting: # 20708
Views: 628

## Reference-scaling: only simulations possible

Hi mittyri,

» BTW: in your lecture you suggested to adjust the CIs for ABEL due to TIE inflation. Is that also dead?
» Does it mean that you cannot prove anything using sims??

I hope not. For all reference-scaling methods we already need simulations to estimate the sample size. It would be strange to allow them there but not for the TIE. RSABE/ABEL is tricky anyhow. The model is based on (population) parameters $$-\theta_s\leq\tfrac{\mu_T-\mu_R}{\sigma_{wR}}\leq+\theta_s$$ which are unknown; we have only their estimates. Exactly this misspecification (applying scaling although the drug is not highly variable) leads to the inflated TIE. As I wrote above, Andreas Brandt said that

“[simulation-based] methods might [sic] be acceptable if no alternative which shows analytical control of the TIE is available”.

Clearly that is the case here. In the current implementation reference-scaling is a framework with two (RSABE) or three (ABEL) decisions. There is no way to solve that analytically (though at least the GMR-restriction could be implemented by setting α 0.5).
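For readers not fluent in the framework: the ABEL decisions hinge on the estimated s_wR, with the acceptance limits expanded to exp(±0.760·s_wR), scaling applied only for CV_wR > 30% and the expansion capped at CV_wR = 50% (the well-known 69.84–143.19%). A minimal Python sketch of the limits alone – not of the whole decision scheme, which additionally includes the GMR-restriction and the BE decision itself:

```python
import math

def abel_limits(cv_wr):
    """EMA ABEL acceptance limits as a function of the (estimated) CV_wR."""
    if cv_wr <= 0.30:                         # no scaling: conventional 80.00-125.00%
        return (0.80, 1.25)
    cv = min(cv_wr, 0.50)                     # no further expansion above CV_wR 50%
    s_wr = math.sqrt(math.log(cv ** 2 + 1))   # lognormal CV -> within-subject SD
    return (math.exp(-0.760 * s_wr), math.exp(0.760 * s_wr))

for cv in (0.30, 0.40, 0.50, 0.80):
    lo, hi = abel_limits(cv)
    print(f"CV_wR = {cv:.2f}: {100 * lo:.2f}% - {100 * hi:.2f}%")
```

At CV_wR 50% this reproduces the guideline’s cap of 69.84–143.19%. Because the decision tree branches on the *estimate* s_wR, not on σ_wR, the misspecification described above can occur – hence the inflated TIE.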
Our method1 follows the ‘spirit’ of the GL, i.e., we assume that $$s_{wR}=\sigma_{wR}$$. Molins et al.2 proposed to assume the worst case, i.e., regardless of $$s_{wR}$$ adjust α as if $$CV_{wR}=0.30$$. Conservative, but with a substantial negative impact on power (esp. for really high variability, where an inflated TIE is unlikely). For examples see the RSABE vignette of the working version of the next release of PowerTOST and the R-code at the end.

What is better? Relying on ad hoc solutions (which are not perfect), or going by the book, ignoring the inflation, and putting patients in jeopardy?

1. Labes D, Schütz H. Inflation of Type I Error in the Evaluation of Scaled Average Bioequivalence, and a Method for its Control. Pharm Res. 2016: 33(11); 2805–14. doi:10.1007/s11095-016-2006-1.
2. Molins E, Cobo E, Ocaña J. Two-Stage Designs Versus European Scaled Average Designs in Bioequivalence Studies for Highly Variable Drugs: Which to Choose? Stat Med. 2017: 36(30); 4777–88. doi:10.1002/sim.7452.

```r
library(PowerTOST)
CV  <- c(0.35, 0.80) # in-/outside region of inflation
des <- "2x2x4"
n   <- pwr <- numeric()
for (j in seq_along(CV)) {
  x      <- sampleN.scABEL(CV = CV[j], design = des,
                           print = FALSE, details = FALSE)
  n[j]   <- x[["Sample size"]]
  pwr[j] <- x[["Achieved power"]]
}
res <- data.frame(CV = rep(CV, each = 2), n = rep(n, each = 2),
                  TIE = NA, power = rep(pwr, each = 2),
                  method = rep(c("Labes and Schütz", "Molins et al."), 2),
                  alpha.adj = NA, TIE.adj = NA, pwr.adj = NA,
                  stringsAsFactors = FALSE)
for (j in 1:nrow(res)) {
  if (res$method[j] == "Labes and Schütz") {
    x <- scABEL.ad(CV = res$CV[j], design = des, n = res$n[j],
                   print = FALSE)
    res$pwr.adj[j] <- x$pwr.adj
  } else {
    x <- scABEL.ad(CV = 0.30, design = des, n = res$n[j],
                   print = FALSE)
    res$pwr.adj[j] <- power.scABEL(alpha = x$alpha.adj, CV = res$CV[j],
                                   design = des, n = res$n[j])
  }
  res$TIE[j]       <- x$TIE.unadj
  res$alpha.adj[j] <- x$alpha.adj
  res$TIE.adj[j]   <- x$TIE.adj
}
print(res, row.names = FALSE)
```

```
  CV  n      TIE   power           method alpha.adj TIE.adj pwr.adj
0.35 34 0.065566 0.81184 Labes and Schütz  0.036299    0.05 0.77281
0.35 34 0.081626 0.81184    Molins et al.  0.028572    0.05 0.74046
0.80 50 0.049600 0.81235 Labes and Schütz        NA      NA      NA
0.80 50 0.082115 0.81235    Molins et al.  0.028201    0.05 0.73198
```

Cheers,
Helmut Schütz
