Helmut
Vienna, Austria
2015-05-06 04:17
Posting: # 14753

 TSDs for RSABE/ABEL [Two-Stage / GS Designs]

To whom it may concern!

Since Two-Stage Designs are mentioned in various guidelines (EMA, FDA, …), sponsors want to combine them with options for HVDs/HVDPs (EMA: ABEL, FDA: RSABE). I do understand that this is a tempting idea. What I have heard at recent conferences gives me the impression that sponsors have already performed such studies. I don’t know whether any one of them was accepted. I know of one leading to a deficiency letter.

I will summarize the caveats.
  • Reference-scaling might lead to an inflation of the type I error (TIE) – caused by modifying the Null Hypothesis “in the face of the data”. The method is intrinsically sequential:
    • Estimate the intra-subject variability of the reference.
    • If justified, widen the acceptance range (EMA) or assess the linearized criterion (FDA).
  • Chow and Liu1 state:
    […] the equivalence limits for the scaled ABE involve the unknown intra-subject variability. As a result, the equivalence limits become random variables and are not fixed constants and the variability introduced by scaling will have an impact on the type I error rate. Therefore, the attempt to use the scaled ABE for resolution of high variable drug products will face the similar issues and challenges that the individual and population bioequivalence encountered in the 1990s […]. These issues must be satisfactorily addressed before the scaled ABE can be implemented into regulatory framework.
  • Inflation of the TIE was already reported by Endrényi and Tóthfalusi2 and recently demonstrated by Wonnemann et al.3 for the EMA’s method by means of simulations.
  • So far Two-Stage Sequential methods have been published only for 2×2×2 crossovers and studies in a two-group parallel design. See the review4 for references. With one exception (“Type 2” TSDs showing BE in the first stage with at least the target power) all of them require an adjustment of the nominal α in order to maintain the overall TIE at ≤0.05.
    Though attempts have been made to adjust α based on the first stage’s data,5,6 regulatory acceptance is problematic, since – at least for the EMA – the adjusted level has to be pre-specified in the protocol.
  • Until somebody comes up with a clever algo and publishes it, the combination of TSDs with reference-scaling remains an open issue. It boils down to a Radio Yerevan’s answer: “In principle yes, but you have to demonstrate that the patient’s risk is maintained.”
    If one wants to pre-specify an adjusted α (GL…) one has to (empirically) find one which keeps the TIE ≤0.05 for all possible combinations of stage 1 sample sizes and CVs. Currently I use a matrix with ~1,000 combos (n1 and CV with a step size of 2). For every grid-point one has to simulate 10⁶ BE-studies for the TIE (slow convergence) and 10⁵ for power. This requires 1.1×10⁹ simulated studies.
    Contrary to 2×2×2 and parallel designs – where power can be directly calculated – for scaling we need simulations, since the limits are not fixed, we have a restriction on the GMR, and an upper cap at 50% for the EMA. The convergence is fine, but we still need 10⁵ simulations within every simulated study. This leads to ~10¹⁴ (110,000,000,000,000‼) simulations overall.
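    In round numbers, the bookkeeping above can be tallied in a few lines (a hypothetical Python sketch; the counts are taken from the paragraph, the script itself is only an illustration):

```python
# Tally of the simulation counts described above (illustrative only).
grid_points = 1_000            # ~combinations of n1 and CV (step size 2)
sims_tie    = 1_000_000        # 1e6 simulated studies per grid point (TIE)
sims_power  = 100_000          # 1e5 simulated studies per grid point (power)

studies = grid_points * (sims_tie + sims_power)
print(f"simulated studies: {studies:.1e}")    # 1.1e+09

inner = 100_000                # 1e5 simulations within every simulated study
total = studies * inner
print(f"total simulations: {total:.1e}")      # 1.1e+14
```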
    To give you an idea:

    run0 <- 1e7; run  <- 1:run0
    ptm  <- proc.time() # time the loop's overhead
    for(j in seq_along(run)) { }
    t0   <- as.numeric(1000*(proc.time()[3]-ptm[3])/run0)
    run1 <- 1e4; run  <- 1:run1
    ptm  <- proc.time()
    for(j in seq_along(run)) {
      power.TOST(CV=0.4, theta0=0.9, design="2x2x2", n=134, method="nct") }
    t1   <- as.numeric(1000*(proc.time()[3]-ptm[3])/run1-t0)
    run2 <- 1e3; run  <- 1:run2
    ptm  <- proc.time() # ABEL (EMA)
    for(j in seq_along(run)) {
      power.scABEL(CV=0.4, theta0=0.9, design="2x2x4", n=30, nsims=1e5) }
    t2   <- as.numeric(1000*(proc.time()[3]-ptm[3])/run2-t0)
    run3 <- 1e3; run  <- 1:run3
    ptm  <- proc.time() # RSABE (FDA)
    for(j in seq_along(run)) {
      power.RSABE(CV=0.4, theta0=0.9, design="2x2x4", n=24, nsims=1e5) }
    t3   <- as.numeric(1000*(proc.time()[3]-ptm[3])/run3-t0)
    cat("Runtimes of PowerTOST's functions:\n",
    sprintf("%s %5.2f %s", "2x2x2 ABE        :", t1, "ms\n"),
    sprintf("%s %5.2f %s (%1.0fx slower)%s", "2x2x4 ABEL  (EMA):", t2, "ms", t2/t1, "\n"),
    sprintf("%s %5.2f %s (%1.0fx slower)%s", "2x2x4 RSABE (FDA):", t3, "ms", t3/t1, "\n"))


    Gives on my machine:

    Runtimes of PowerTOST's functions:
     2x2x2 ABE        :  2.22 ms
     2x2x4 ABEL  (EMA): 82.82 ms (37x slower)
     2x2x4 RSABE (FDA): 80.45 ms (36x slower)


    Detlew did a great job. The scaled power-functions are not slower by a factor of 10⁵ – only ~40 times. If you have a lot of time, go ahead and become famous.
    Furthermore, the intra-subject variabilities of test and reference need not be identical. Keep that in mind.


Edit: Easier with the package microbenchmark:

library(microbenchmark)
library(PowerTOST)
microbenchmark(
  power.TOST(CV=0.4,   theta0=0.9, design="2x2x2", n=134, method="nct"),
  power.scABEL(CV=0.4, theta0=0.9, design="2x2x4", n= 30, nsims=1e5),
  power.RSABE(CV=0.4,  theta0=0.9, design="2x2x4", n= 24, nsims=1e5))


On my machine (slow because I have 8 R-sessions running; CPU-load >95%):

Unit: milliseconds
          expr       min         lq      mean     median         uq       max neval
power.TOST()    3.703643   4.336665   5.82738   4.784337   5.475469  67.81569   100
power.scABEL() 84.496413  91.930565 120.51301  97.457502 156.742944 176.92144   100
power.RSABE()  91.391727 103.207520 139.51786 155.408528 167.997411 205.35759   100



    References:
  1. Chow S-C, Liu J-p. Design and Analysis of Bioavailability and Bioequivalence Studies. Boca Raton: Chapman & Hall/CRC; 3rd ed 2009: p. 598.
  2. Endrényi L, Tóthfalusi L. Regulatory conditions for the determination of bioequivalence of highly variable drugs. J Pharm Pharm Sci. 2009;12:138–49. Free resource.
  3. Wonnemann M, Frömke C, Koch A. Inflation of the Type I Error: Investigations on Regulatory Recommendations for Bioequivalence of Highly Variable Drugs. Pharm Res. 2015;32(1):135–43. doi:10.1007/s11095-014-1450-z.
  4. Schütz H. Two-stage designs in bioequivalence trials. Eur J Clin Pharmacol. 2015;71(3):271–81. doi:10.1007/s00228-015-1806-2.
  5. Fuglsang A. Controlling type I errors for two-stage bioequivalence study designs. Clin Res Reg Aff. 2011;28(4):100–5. doi:10.3109/10601333.2011.631547.
  6. Kieser M, Rauch G. Two-stage designs for cross-over bioequivalence trials. Stat Med. 2015;34(16):2403–16. doi:10.1002/sim.6487.

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
nobody
2015-05-06 13:13
@ Helmut
Posting: # 14754

 TSDs for RSABE/ABEL

Who planned the one with the deficiency letter? :-D

...just asking


Οὐδείς

Kindest regards, nobody
Helmut
Vienna, Austria
2015-05-06 13:55
@ nobody
Posting: # 14755

 TSDs for RSABE/ABEL

Γειά σου Οδυσσέα! (Hello, Odysseus!)

❝ Who planned the one with the deficiency letter? :-D


[image]

Last week during a coffee break at the EGA/EMA workshop I prevented yet another one. The “design” was suggested by a ruthless, greedy, incompetent consultant. :no:

nobody
2015-05-06 14:26
@ Helmut
Posting: # 14757

 TSDs for RSABE/ABEL

...maybe this team

[image]

:o)

...but maybe someone has solved the alpha dilemma...

Kindest regards, nobody
ElMaestro
Denmark
2015-05-06 14:31
@ Helmut
Posting: # 14758

 TSDs for RSABE/ABEL

Hi Helmut,

this was a great post, and really a necessary one. I hope it will make a difference.
It is the eternal opportunistic battle:

a. "No one has proven it doesn't work, so we'll try it."

vs.

b. "So-And-So et al. showed that it works, so we'll try it."

[and absence of evidence is not evidence of absence, perhaps??]
I have heard that there is going to be a symposium in Prague in two weeks where one of the true authorities on the issue of two-stage approaches (I am talking about a certain consultant from Vienna who also has a nasty habit of torturing statistical software) is a panelist; perhaps this guy will use the opportunity to express such views. I certainly hope so, as the message conveyed here really deserves to be more widely known and could spare companies much trouble.

Pass or fail!
ElMaestro
Helmut
Vienna, Austria
2015-05-06 15:49
@ ElMaestro
Posting: # 14763

 TSDs for RSABE/ABEL

Hi ElMaestro,

❝ this was a great post, and really a necessary one.


THX.

❝ a. "No one has proven it doesn't work, so we'll try it."


Courageous (or just stupid?)

The primary concern in bioequivalence assessment is to limit the risk of erroneously accepting bioequivalence, i.e., to maintain the patient’s risk (α) at ≤0.05.1,2 Only statistical procedures which do not exceed the nominal risk of 5% can be accepted.3,4
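What “maintaining the patient’s risk” means can be illustrated by a minimal Monte-Carlo sketch (my own Python illustration, not part of the post; a large-sample normal approximation stands in for the t-distribution): with fixed BE limits and the true GMR sitting exactly on the lower limit, the chance of (wrongly) concluding BE approaches the nominal α = 0.05.

```python
import math
import random
from statistics import NormalDist

random.seed(42)
z95 = NormalDist().inv_cdf(0.95)                 # one-sided 5% quantile
cv, n = 0.30, 40                                 # assumed CV and sample size
se = math.sqrt(2 * math.log(1 + cv**2) / n)      # SE of the log point estimate
lo, hi = math.log(0.80), math.log(1.25)          # fixed (unscaled) BE limits

nsims = 100_000
accepted = 0
for _ in range(nsims):
    pe = random.gauss(lo, se)                    # true GMR exactly on the lower limit
    # 90% CI entirely within the limits -> BE (here: wrongly) concluded
    if pe - z95 * se > lo and pe + z95 * se < hi:
        accepted += 1
tie = accepted / nsims
print(f"empirical TIE: {tie:.4f}")               # close to the nominal 0.05
```

Any procedure that pushes this empirical rate above 0.05 – e.g. by picking the acceptance range after looking at the data – inflates the patient’s risk.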


❝ I have heard that there is going to be a symposium in Prague in two weeks where one of the true authorities on the issue of two-stage approaches (I am talking about a certain consultant from Vienna who also has a nasty habit of torturing statistical software) is a panelist…


I have heard that another eminent consultant from Haderslev (having the habit of coding nasty software) will be a panelist as well.


    Oldies but goodies:
  1. McGilveray IJ, Midha KK, Skelly JP, Dighe S, Doluisio JT, French IW, Karim A, Burford R. Consensus Report from ‘Bio International ’89’: Issues in the Evaluation of Bioavailability Data. J Pharm Sci. 1990;79(10):945–6. doi:10.1002/jps.2600791022.
  2. Melander H. Problems and Possibilities with the Add-On Subject Design. In: Midha KK, Blume HH. (eds). Bio-International. Bioavailability, Bioequivalence and Pharmacokinetics. Stuttgart: medpharm; 1992; p. 85–90.
  3. Commission of the European Communities. CPMP Guideline: Investigation of Bioavailability and Bioequivalence. Brussels 1991;111/54/89_EN. Online.
  4. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use. ICH Harmonised Tripartite Guideline: Statistical Principles for Clinical Trials. E9 (Step 4, 1998). Online.

d_labes
Berlin, Germany
2015-05-08 22:47 (edited by d_labes on 2015-05-09 17:09)
@ Helmut
Posting: # 14778

 TSDs for RSABE/ABEL - run-time

Dear Helmut,

this is an eminently important post :thumb up:.

But I believe those whom it really concerns unfortunately will not read it – or will not understand its message :no:.

❝ Contrary to 2×2×2 and parallel designs – where power can be directly calculated – for scaling we need simulations, since the limits are not fixed, we have a restriction on the GMR, and an upper cap at 50% for the EMA. The convergence is fine, but we still need 10⁵ simulations within every simulated study. This leads to ~10¹⁴ (110,000,000,000,000‼) simulations overall.


This is an optimistic (sic!) estimate, since you forgot the sims necessary for the sample-size estimation step: 2…8 iterations with 10⁵ simulations each, since the start values of the sample-size search are naturally not as good as in the case of the classical sample-size estimation.

So let us estimate more pessimistically (but maybe still too optimistically) an additional factor of 4 (one for the power-monitoring step + 3 for the sample-size estimation) within each of the simulations of your preferred grid of CV and n1.
This will then give, using your measured ~80 msec: ~4×10¹⁴ sims × 80 ms =
3.2×10¹³ sec = 533.333.333.333 min = 8.888.888.889 h = 370.370.370 days = 1.014.713 years!
calculated with the newest R version 3.2.0
(hope I got all the conversions using the thousands separator "." correct; prove me wrong)

Oh what a weird calculation. Seems yesterday evening there was one beer too many :party:. And alcohol impairs the capacity for self-criticism.
Let's try once more; hope now I got it right:
1.1×10⁹ sims, each with 4×80 msec for the power calculations =
5,866,667 min = 97,777.78 h = 4,074.074 days = 11.16185 years
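The corrected back-of-the-envelope figures can be checked in a few lines of Python (hypothetical illustration, not part of the original post):

```python
# Check of the corrected run-time estimate (illustrative only).
sims = 1.1e9                   # simulated studies over the n1/CV grid
secs_per_sim = 4 * 0.080       # power monitoring + sample-size steps, ~80 ms each
total_s = sims * secs_per_sim

print(f"{total_s / 60:,.0f} min")              # 5,866,667 min
print(f"{total_s / 3600:,.2f} h")              # 97,777.78 h
print(f"{total_s / 86400:,.3f} days")          # 4,074.074 days
print(f"{total_s / 86400 / 365:.5f} years")    # 11.16185 years
```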

Even if someone came up with a much more clever algo with a boost of 10² (a hundred times faster than mine in PowerTOST – which I of course can't believe; it's impossible to beat me by such an amount :cool:), we would still have to wait more than one month (~41 days) for the result of one shot of alpha adjustment. And you need more than one ...

Just to make your message crystal clear!

Whether all this really has to be done for a certain regulatory body (which one remains my secret :not really:) remains an open question. As said elsewhere: Regulators are a bunch of strange people.

Regards,

Detlew
Helmut
Vienna, Austria
2015-05-10 03:11
@ d_labes
Posting: # 14779

 Really worth the efforts?

Dear Detlew,

❝ this is an eminently important post :thumb up:.


THX. It was mainly a kind of self-defense. If such an idea pops up in the future, I can simply link to this post.

❝ ❝ […] we still need 10⁵ simulations within every simulated study. This leads to ~10¹⁴ (110,000,000,000,000‼) simulations overall.


❝ This is an optimistic (sic!) estimation […]



❝ So let us estimate more pessimistically (but may be too optimistically) an additional factor of 4 (one for the power monitoring step + 3 for the sample size estimation) within each of the simulations […]



At least for “Type 1” designs you need only one step – as you have ingeniously implemented it in Power2Stage: an estimated total sample size ≤ n1 implies power ≥ target. ;-)
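The trick relies on power being non-decreasing in n. A hypothetical Python sketch with a crude normal-approximation power function for plain ABE (the function names and the approximation are mine, purely to illustrate the logic – not Power2Stage’s actual code):

```python
import math
from statistics import NormalDist

ND = NormalDist()

def power_abe(n, cv=0.40, gmr=0.95, alpha=0.05):
    """Crude normal-approximation power of TOST for a 2x2x2 design."""
    se = math.sqrt(2 * math.log(1 + cv**2) / n)   # SE of the log point estimate
    z = ND.inv_cdf(1 - alpha)
    lo, hi = math.log(0.80), math.log(1.25)
    d = math.log(gmr)
    return max(0.0, ND.cdf((hi - d) / se - z) - ND.cdf((lo - d) / se + z))

def n_for_target(target=0.80, **kw):
    """Smallest even total sample size reaching the target power."""
    n = 4
    while power_abe(n, **kw) < target:
        n += 2
    return n

# The shortcut: if the estimated total sample size does not exceed n1,
# power(n1) >= target follows from monotonicity of power in n --
# no separate interim power step is needed.
n1 = 80
n_est = n_for_target()
if n_est <= n1:
    print(f"n_est = {n_est} <= n1 = {n1}: interim power criterion met")
```

One sample-size search thus replaces a separate power-monitoring simulation at the interim.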

❝ Just to make your message crystal clear!


Yep. Whether it takes five or fifty years makes no difference in practice. It simply takes too long.

❝ Whether all this really has to be done for a certain regulatory body (which one remains my secret :not really:) remains an open question. As said elsewhere: Regulators are a bunch of strange people.


I think that it is justified if regulators ask for a demonstration that the TIE is maintained. That’s their job.
Sponsors should learn that Two-Stage Designs are not the jack-of-all-trades device (German: “eierlegende Wollmilchsau”) they apparently believe them to be. TSDs are fine in ABE to deal with an uncertain CV. This is not necessary if we apply scaling – which inherently takes care of higher than expected variability. If a study is powered for ABE, everything should be OK (well, cough, inflation in ABEL/RSABE – another story).
As in ABE, the nasty thing is the ratio …

The Bioequivalence and Bioavailability Forum is hosted by BEBAC • Ing. Helmut Schütz