Helmut
★★★
Vienna, Austria,
2020-11-25 10:50
(372 d 14:14 ago)

Posting: # 22084
Views: 3,822
 

 Inflated type I error with fixed widened limits? [RSABE / ABEL]

Dear all,

following this post I had a closer look at the GCC-GL (Version 2.4 of March 2016, page 26):

3.1.10 Highly variable drugs or drug products
Highly variable drug products (HVDP) are those whose intra-subject variability for a parameter is larger than 30%. If an applicant suspects that a drug product can be considered as highly variable in its rate and/or extent of absorption, a replicate cross-over design study can be carried out.
Those HVDP for which a wider difference in Cmax is considered clinically irrelevant based on a sound clinical justification can be assessed with a widened acceptance range. If this is the case, a wider acceptance range (i.e. 75–133%) for Cmax can be used. For the acceptance interval to be widened the bioequivalence study must be of a replicate design where it has been demonstrated that the within-subject variability for Cmax of the reference compound in the study is >30%. The applicant should justify that the calculated intra-subject variability is a reliable estimate and that it is not the result of outliers. The request for widened interval must be prospectively specified in the protocol.
The geometric mean ratio (GMR) should lie within the conventional acceptance range 80.00–125.00%.
The possibility to widen the acceptance criteria based on high intra-subject variability does not apply to AUC where the acceptance range should remain at 80.00 – 125.00% regardless of variability.
It is acceptable to apply either a 3-period or a 4-period crossover scheme in the replicate design study.


What happens if we pre-specify the widened acceptance range and discover in the study that CVwR ≤30%? Do we then assess the study with the conventional AR of 80.00–125.00%? IMHO, that means we have a data-driven decision, which might be false and result in an inflated type I error.
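The reading of the guideline questioned above can be written down as a small decision function. This is only a sketch of that interpretation, not an official algorithm; the function name `gcc_decision()` and the example numbers are made up for illustration.

```r
# Sketch of the GCC rule as interpreted in this post (hypothetical helper).
# lower, upper, pe: 90% CI and point estimate in percent; CVwR as a fraction.
gcc_decision <- function(lower, upper, pe, CVwR) {
  widened <- CVwR > 0.30                 # data-driven switch on the observed CVwR
  L <- if (widened) 75 else 80           # lower acceptance limit in percent
  U <- 100 / (L / 100)                   # 133.33... or 125
  BE <- lower >= L && upper <= U && pe >= 80 && pe <= 125
  list(widened = widened, L = L, U = U, BE = BE)
}

# The same confidence interval is judged differently,
# depending only on which side of 30% the observed CVwR falls.
a <- gcc_decision(lower = 78.5, upper = 110.2, pe = 93.0, CVwR = 0.31)
b <- gcc_decision(lower = 78.5, upper = 110.2, pe = 93.0, CVwR = 0.29)
print(c(widened_limits = a$BE, conventional_limits = b$BE))
```

The switch on the observed CVwR is the data-driven step that the question is about.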

I performed simulations acc. to my understanding of the GL:

[image]


CVwR = CVwT = 30%, balanced 4-period 2-sequence full replicate design (TRTR|RTRT), n = 40 (81% power for θ0 = 0.90), 10^6 studies with θ0 = 1.25: ~20.6% passed…

Dif-tor heh smusma 🖖
Helmut Schütz
[image]

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
d_labes
★★★

Berlin, Germany,
2020-11-25 15:07
(372 d 09:57 ago)

@ Helmut
Posting: # 22085
Views: 3,547
 

 Inflated type I error with fixed widened limits?

Dear Helmut,

» What happens if we pre-specify the widened acceptance range and discover in the study that CVwR ≤30? Assess the study for the conventional AR of 80.00–125.00%? IMHO, that means we have a data-driven decision – which might be false and result in an inflated type I error.

Correct.

» I performed simulations acc. to my understanding of the GL:
»

[image]


» CVwR = CVwT = 30%, balanced 4-period 2-sequence full replicate design (TRTR|RTRT), n = 40 (81% power for θ0 = 0.90), 10^6 studies with θ0 = 1.25: ~20.6% passed…

Confirmed! At least in magnitude.
My result: 20.46% passed.
Quick and dirty captured code from power.scABEL.sds.

Regards,

Detlew
Helmut
★★★
Vienna, Austria,
2020-11-26 00:09
(372 d 00:55 ago)

@ d_labes
Posting: # 22086
Views: 3,519
 

 Inflated type I error with fixed widened limits?

Dear Detlew,

» » […] we have a data-driven decision – which might be false and result in an inflated type I error.
»
» Correct.

Shit.

» » I performed simulations acc. to my understanding of the GL: […] 10^6 studies with θ0 = 1.25: ~20.6% passed…

» Confirmed! At least in magnitude.
» My result: 20.46% passed.
» Quick and dirty captured code from power.scABEL.sds.

How did you do that? That’s beyond me.

I misused your old subject sim code and added:

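library(PowerTOST) # provides CV2se() and mse2CV()
# mean_vcov() and prep_data() come from the old subject simulation code (not shown here)
ow     <- options("digits")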
nsims  <- 1e6
CVT    <- CVR <- 0.3
sWT    <- CV2se(CVT)
sWR    <- CV2se(CVR)
sBT    <- CV2se(CVT*2)
sBR    <- CV2se(CVR*2)
n      <- c(20, 20) # power ~0.81 for theta0 = 0.9
theta0 <- 1.25      # for TIE
options(digits = 12) # restored by options(ow) after the loop
set.seed(123456)
mvc    <- mean_vcov(c("TRTR", "RTRT"), muR = log(100), ldiff = log(theta0),
                      sWT = sWT, sWR = sWR, sBT = sBT, sBR = sBR, rho = 1)
res    <- data.frame(CVwR = rep(NA, nsims), PE = NA, lower = NA, upper = NA,
                     L = NA, U = NA, BE = FALSE)
pb     <- txtProgressBar(0, 1, 0, char="\u2588", width=NA, style=3)
st     <- proc.time()[[3]]
for (j in 1:nsims) {
  data        <- prep_data(seqs = c("TRTR", "RTRT"), n = n, metric = "PK",
                           dec = 5, mvc_list = mvc)
  data        <- data[, !(names(data) %in% c("seqno", "logval"))]
  cols        <- c("subject", "period", "sequence", "treatment")
  data[cols]  <- lapply(data[cols], factor)
  modBE       <- lm(log(PK) ~ sequence + subject + period + treatment,
                              data = data)
  res[j, 3:4] <- round(100*as.numeric(exp(confint(modBE, "treatmentT",
                                                  level = 1-2*0.05))), 2)
  res$PE[j]   <- round(100*as.numeric(exp(coef(modBE)[["treatmentT"]])), 2)
  modCVR      <- lm(log(PK) ~ sequence + subject + period,
                              data = data[data$treatment == "R", ])
  res$CVwR[j] <- mse2CV(anova(modCVR)["Residuals", "Mean Sq"])
  res$L[j]    <- if (res$CVwR[j] <= 0.3) 80 else 75 # widen only if CVwR > 30%
  res$U[j]    <- 100/(res$L[j]*0.01)
  if (res$lower[j] >= res$L[j] && res$upper[j] <= res$U[j] &&
      res$PE[j] >= 80 && res$PE[j] <= 125) res$BE[j] <- TRUE
  setTxtProgressBar(pb, j/nsims)
}
et     <- proc.time()[[3]]
close(pb)
options(ow) # restore options
cat(paste0(j, " subj. simulations with theta0 = ", theta0, ": ",
           signif(sum(res$BE[!is.na(res$L)])/j, 5),
           " passed BE (empiric TIE)\nRuntime: ", signif((et-st)/60, 4),
           " minutes\n"))

And got 20.568% after eleven (!) hours.
PS: Only one of the sims failed on the PE restriction.
res[(res$lower >= res$L & res$upper <= res$U) & (res$PE <80 | res$PE >125), ]
            CVwR     PE  lower upper  L        U    BE
566300 0.3000039 125.03 117.36 133.2 75 133.3333 FALSE

Dif-tor heh smusma 🖖
Helmut Schütz
d_labes
★★★

Berlin, Germany,
2020-11-26 15:38
(371 d 09:26 ago)

@ Helmut
Posting: # 22087
Views: 3,477
 

 Inflated type I error with fixed widened limits

Dear Helmut,

» » » […] we have a data-driven decision – which might be false and result in an inflated type I error.
» »
» » Correct.
»
» Shit.

Why shit?

» » » I performed simulations acc. to my understanding of the GL: […] 10^6 studies with θ0 = 1.25: ~20.6% passed…
»
» » Confirmed! At least in magnitude.
» » My result: 20.46% passed.
» » Quick and dirty captured code from power.scABEL.sds.
»
» How, did you do that? That’s beyond me.

I stole the code of the subject data sims and the scaled ABE evaluation from the workhorse function .pwr.SABE.sds().
Implementing the GCC rules instead of EMA rules or RSABE for FDA is simple.
Have a look into the code of the new but undocumented function power.fwl.sds() in the GitHub repository.

» ...And got 20.568% after eleven (!) hours.

Wow! 11 hours for one number! 10^6 sims in ca. 4-5 sec with my implementation.
Some results with setseed=FALSE:
power.fwl.sds(CV=0.3, n=40, theta0=0.8, design="2x2x4", setseed=F)
0.2045, 0.2066, 0.2047, 0.2051, 0.2082
Seems we are simulating the same number :cool:.
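
For readers without the package at hand, a different shortcut (not the subject-data approach used in the posts above) is to simulate the key statistics directly. The sketch below rests on assumptions that are not from the thread: CVwT = CVwR, no subject-by-formulation interaction, 3n − 4 residual degrees of freedom and a standard error of sqrt(MSE/n) for the 2x2x4 design, n − 2 degrees of freedom for s²wR, and s²wR drawn independently of the residual variance. Because of these simplifications the result need not agree exactly with the numbers reported here.

```r
# Sketch: empiric TIE of the GCC rule from simulated key statistics (base R only).
set.seed(123456)
nsims  <- 1e5
n      <- 40                             # total sample size, 2x2x4 design
CV     <- 0.30                           # true CVwR = CVwT (assumption)
theta0 <- 1.25                           # true T/R-ratio at the upper conventional limit
s2w    <- log(CV^2 + 1)                  # within-subject variance on the log scale
df     <- 3 * n - 4                      # assumed residual df of the ANOVA
dfRR   <- n - 2                          # assumed df of the s2wR estimate
pe     <- rnorm(nsims, mean = log(theta0), sd = sqrt(s2w / n))
mse    <- s2w * rchisq(nsims, df) / df
s2wR   <- s2w * rchisq(nsims, dfRR) / dfRR   # drawn independently (simplification)
CVwR   <- sqrt(exp(s2wR) - 1)            # observed CVwR
hw     <- qt(0.95, df) * sqrt(mse / n)   # half-width of the 90% CI (log scale)
L      <- ifelse(CVwR > 0.30, log(0.75), log(0.80))  # data-driven lower limit
BE     <- (pe - hw >= L) & (pe + hw <= -L) &
          (pe >= log(0.80)) & (pe <= log(1.25))      # CI within limits, PE constraint
TIE    <- mean(BE)
cat(sprintf("empiric TIE: %.4f\n", TIE))
```

Everything is vectorised, so 10^5 simulated studies take a fraction of a second.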

Regards,

Detlew
Helmut
★★★
Vienna, Austria,
2020-11-26 17:14
(371 d 07:50 ago)

@ d_labes
Posting: # 22088
Views: 3,491
 

 Inflated type I error: Not nice

Dear Detlew,

» » Shit.
»
» Why shit?

Because it escaped my attention for almost fifteen years… In South Africa, fixed limits of 75.00–133.33% may be used for Cmax (no replicate design needed).

» » » Quick and dirty captured code from power.scABEL.sds.
» »
» » How, did you do that? That’s beyond me.
»
» Stolen the code of subject data sims and scaled ABE evaluation in the workhorse function .pwr.SABE.sds().
» Have a look into the code of the new but undocumented function power.fwl.sds() in the GitHub repository.

THX!

» » ...And got 20.568% after eleven (!) hours.
» Wow! 11 hours for one number! 10^6 sims in ca. 4-5 sec with my implementation.

The squirrel earns its food the hard way (German proverb: “Mühsam ernährt sich das Eichhörnchen”).

» Some results with setseed=FALSE:
» power.fwl.sds(CV=0.3, n=40, theta0=0.8, design="2x2x4", setseed=F)
» 0.2045, 0.2066, 0.2047, 0.2051, 0.2082
» Seems we are simulating the same number :cool:.

:-D

Terrible for drugs which are not highly variable (reminds me of the FDA’s RSABE). The design is less important.

[image] 2-sequence 4-period (full) replicates

[image] 2-sequence 3-period (full) replicates

[image] 3-sequence 3-period (partial) replicates


Dif-tor heh smusma 🖖
Helmut Schütz
d_labes
★★★

Berlin, Germany,
2020-11-28 11:20
(369 d 13:44 ago)

@ Helmut
Posting: # 22091
Views: 3,328
 

 Power of the GCC framework and power of PowerTOST

Dear Helmut,

» Stolen the code of subject data sims and scaled ABE evaluation in the workhorse function .pwr.SABE.sds().
» Have a look into the code of the new but undocumented function power.fwl.sds() in the GitHub repository.

there is no need to define and program a new function :cool:!
Try the following:
# define a new regulator object
regGCC <- reg_const("user", r_const= log(1/0.75)/CV2se(0.3), CVswitch = 0.3, CVcap = 0.3,
                    pe_const=TRUE)
regGCC
USER defined regulatory settings
- CVswitch            = 0.3
- cap on scABEL if CVw(R) > 0.3
- regulatory constant = 0.9799758
- pe constraint applied
# verify if the limits are what they should be
scABEL(CV=0.2, regulator = regGCC)
lower upper
 0.80  1.25
scABEL(CV=0.3, regulator = regGCC)
lower upper
 0.80  1.25
scABEL(CV=0.31, regulator = regGCC)
   lower    upper
0.750000 1.333333
scABEL(CV=0.35, regulator = regGCC)
   lower    upper
0.750000 1.333333
# use the function already in the package
power.scABEL.sds(CV=0.3, n=40, design="2x2x4", regulator = regGCC, theta0 = 1.25)
[1] 0.20463

Result matches the previous calculations.
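
Why this regulator object yields fixed widened limits can be checked with a few lines of base R. This is a sketch, assuming that the expanded limits are computed as exp(∓r_const · s) with s capped at the value belonging to CVcap; the helper `gcc_limits()` is hypothetical and only mimics what scABEL() returned above.

```r
# Same definition as PowerTOST's CV2se(): CV -> standard deviation on the log scale
CV2se <- function(CV) sqrt(log(CV^2 + 1))

# Regulatory constant chosen such that the limits at the cap are 75.00-133.33%
r_const <- log(1 / 0.75) / CV2se(0.30)

# Hypothetical helper: switch and cap both at CV 30%
gcc_limits <- function(CV, CVswitch = 0.30, CVcap = 0.30) {
  if (CV <= CVswitch) return(c(lower = 0.80, upper = 1.25))
  s <- CV2se(min(CV, CVcap))             # capped immediately above the switch
  c(lower = exp(-r_const * s), upper = exp(r_const * s))
}

print(r_const)
print(t(sapply(c(0.20, 0.30, 0.31, 0.35), gcc_limits)))
```

Since the cap equals the switching CV, any CV above 30% is evaluated at the capped value and the limits jump directly from 80.00–125.00% to 75.00–133.33%.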

Regards,

Detlew
Helmut
★★★
Vienna, Austria,
2020-11-29 11:21
(368 d 13:43 ago)

@ d_labes
Posting: # 22092
Views: 3,277
 

 Sample sizes (ignoring the inflated TIE)

Dear Detlew,

» there is no need to define and program a new function :cool:!

Great!

library(PowerTOST)
design <- "2x2x4"
regGCC <- reg_const("user", r_const = log(1/0.75)/CV2se(0.3),
                    CVswitch = 0.3, CVcap = 0.3, pe_const = TRUE)
CV     <- sort(c(seq(0.2, 0.6, 0.1), 0.301))
res    <- data.frame(CV = CV, L.EMA = scABEL(CV, regulator = "EMA")[, 1],
                     U.EMA = scABEL(CV, regulator = "EMA")[, 2], n.EMA = NA,
                     L.GCC = scABEL(CV, regulator = regGCC)[, 1],
                     U.GCC = scABEL(CV, regulator = regGCC)[, 2], n.GCC = NA)
for (j in 1:nrow(res)) {
  res$n.EMA[j] <- sampleN.scABEL(CV = CV[j], design = design, regulator = "EMA",
                                 details = FALSE, print = FALSE)[["Sample size"]]
  res$n.GCC[j] <- sampleN.scABEL(CV = CV[j], design = design, regulator = regGCC,
                                 details = FALSE, print = FALSE)[["Sample size"]]
}
print(res, row.names = FALSE)

    CV     L.EMA    U.EMA n.EMA L.GCC    U.GCC n.GCC
 0.200 0.8000000 1.250000    18  0.80 1.250000    18
 0.300 0.8000000 1.250000    34  0.75 1.333333    28
 0.301 0.7994604 1.250844    34  0.75 1.333333    28
 0.400 0.7461770 1.340165    30  0.75 1.333333    30
 0.500 0.6983678 1.431910    28  0.75 1.333333    42
 0.600 0.6983678 1.431910    32  0.75 1.333333    58


PS: Do we have a bug in scABEL()?

CV  <- c(0.3, 0.3+10^(-6:-2))
lim <- scABEL(CV)
res <- data.frame(CV = CV, L = lim[, 1], U = lim[, 2])
print(res, row.names = FALSE)

       CV         L        U
 0.300000 0.8000000 1.250000
 0.300001 0.8000296 1.249954
 0.300010 0.8000244 1.249962
 0.300100 0.7999731 1.250042
 0.301000 0.7994604 1.250844
 0.310000 0.7943616 1.258873


Dif-tor heh smusma 🖖
Helmut Schütz
d_labes
★★★

Berlin, Germany,
2020-11-29 16:22
(368 d 08:42 ago)

(edited by d_labes on 2020-11-29 17:57)
@ Helmut
Posting: # 22094
Views: 3,251
 

 Bug?

Dear Helmut,

» PS: Do we have a bug in scABEL()?
» CV  <- c(0.3, 0.3+10^(-6:-2))
» lim <- scABEL(CV)
» res <- data.frame(CV = CV, L = lim[, 1], U = lim[, 2])
» print(res, row.names = FALSE)
»
»       CV         L        U
» 0.300000 0.8000000 1.250000
» 0.300001 0.8000296 1.249954
» 0.300010 0.8000244 1.249962
» 0.300100 0.7999731 1.250042
» 0.301000 0.7994604 1.250844
» 0.310000 0.7943616 1.258873

Not a bug :no:, it's a feature :cool:!
Reason: r_const=0.76.
With r_const <- log(1.25)/CV2se(0.3) = 0.7601283... we get
       CV         L        U
 0.300000 0.8000000 1.250000
 0.300001 0.7999994 1.250001
 0.300010 0.7999943 1.250009
 0.300100 0.7999430 1.250089
 0.301000 0.7994302 1.250891
 0.310000 0.7943307 1.258922

Satisfied?
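
The effect of the rounded constant can be reproduced in base R. A minimal sketch, assuming the expanded lower limit is exp(−r_const · CV2se(CV)) for CVs above the switch:

```r
# Same definition as PowerTOST's CV2se()
CV2se <- function(CV) sqrt(log(CV^2 + 1))

CV            <- 0.3 + 10^(-6:-2)                  # CVs just above the switch of 30%
r_rounded     <- 0.76                              # "nice number" of the guideline
r_exact       <- log(1.25) / CV2se(0.30)           # 0.7601283...
lower_rounded <- exp(-r_rounded * CV2se(CV))
lower_exact   <- exp(-r_exact * CV2se(CV))
print(data.frame(CV, lower_rounded, lower_exact), row.names = FALSE)
```

With the rounded constant the lower limit immediately above CV 30% is slightly higher than 0.80, i.e., the limits are marginally narrower than the conventional ones before they start to widen.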

Regards,

Detlew
Helmut
★★★
Vienna, Austria,
2020-11-30 00:16
(368 d 00:48 ago)

@ d_labes
Posting: # 22097
Views: 3,218
 

 “Feature”

Dear Detlew,

» Not a bug :no:, it's a feature :cool:!
» Reason: r_const=0.76.
» With r_const <- log(1.25)/CV2se(0.3) = 0.7601283... we get
»       CV         L        U
» 0.300000 0.8000000 1.250000
» 0.300001 0.7999994 1.250001
» 0.300010 0.7999943 1.250009
» 0.300100 0.7999430 1.250089
» 0.301000 0.7994302 1.250891
» 0.310000 0.7943307 1.258922
» Satisfied?

Fuck! I already guessed that. I hate “nice numbers”.

Dif-tor heh smusma 🖖
Helmut Schütz
wienui
★    

Germany, Oman,
2020-11-29 14:03
(368 d 11:01 ago)

@ Helmut
Posting: # 22093
Views: 3,272
 

 Inflated type I error with fixed widened limits?

Hi Helmut & Detlew,

very interesting.
Could you please explain this in more detail for a not-so-great statistician like me: how can this type I error inflation happen?

Thanks in advance

Cheers,
Osama
d_labes
★★★

Berlin, Germany,
2020-11-29 17:51
(368 d 07:13 ago)

@ wienui
Posting: # 22095
Views: 3,261
 

 Inflated type I error with fixed widened limits

Dear Osama,

» ... how could this type I error inflation happen?

Imagine you have a true CVwR of 30%. That means your true BE limits should be 80% ... 125%.
Now you perform a study but obtain an estimate of CVwR of 35%. Given this, you use the widened (fixed) BE acceptance range of 75% ... 133.33%. With that acceptance range the BE decision is easier to obtain. This also happens if the true theta0 (GMR) is 125% (or 80%), i.e., if the products are in fact bioinequivalent. Thus more than 5% of such studies end in a BE decision, which means there is an alpha inflation.

Hope this makes sense to you.
Maybe Helmut has a better (pedagogical) explanation because he is an exceptionally gifted preacher :cool:.

Regards,

Detlew
Helmut
★★★
Vienna, Austria,
2020-11-30 00:14
(368 d 00:50 ago)

@ d_labes
Posting: # 22096
Views: 3,237
 

 Inflated type I error with fixed widened limits

Dear Detlew & Osama,

» » ... how could this type I error inflation happen?

» May be Helmut has a better (paedogogical) explanation because he is an exceptionally gifted preacher :cool:.

Only you say so. ;-) I’ll try.

Let’s start with two power curves (first R script at the end). The 2-sequence 4-period replicate study is designed for ABE with conventional limits and 80% power. Since we are wary, we assume a \(\small{\theta_0}\) (T/R-ratio) of 0.90 and estimate a sample size of 40. One power curve is for the conventional limits as planned (blue) and one for the widened limits (red).

[image]

The probability of a type I error (i.e., the consumer risk of being treated with a formulation which is not BE because the true \(\small{\theta_0}\) lies outside the limits) is ≤5%. You see that the blue power curve intersects 0.05 at \(\small{\theta_0=0.80}\) and \(\small{\theta_0=1.25}\). That means the chance of falsely passing BE at the limits is 5%. The same is applicable to South Africa’s approach, where we could use pre-specified widened limits independent of the observed CV. Then we have a TIE of 0.05 as well (the red power curve intersects 0.05 at \(\small{\theta_0=0.75}\) and \(\small{\theta_0=1.\dot{3}}\)).

Imagine an even more extreme case than the one Detlew mentioned. We observe in the study a CVwR of 30.01% although the true one is 30%. That means we will use the widened limits of 75.00–133.33% instead of the correct 80.00–125.00%. With the same \(\small{\theta_0=0.80}\) and \(\small{\theta_0=1.25}\) we jump to the red power curve and therefore, have a much higher chance of passing (~40% instead of 5%).
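
The ~40% can be checked approximately without any package. The sketch below assumes a standard error of sqrt(σ²w/n) and 3n − 4 degrees of freedom for the 2x2x4 design and only considers the test against the upper limit, since at \(\small{\theta_0=1.25}\) the test against the lower limit is practically always passed. The helper `p_pass()` is made up for illustration.

```r
n     <- 40
CV    <- 0.30
se    <- sqrt(log(CV^2 + 1) / n)   # assumed SE of the log-difference in the 2x2x4 design
df    <- 3 * n - 4                 # assumed residual degrees of freedom
tcrit <- qt(0.95, df)

# P(upper 90% confidence limit <= U) if the true ratio is theta0,
# based on the noncentral t-distribution of (PE - log(U)) / SE
p_pass <- function(theta0, U) {
  pt(-tcrit, df, ncp = (log(theta0) - log(U)) / se)
}

p_conv <- p_pass(theta0 = 1.25, U = 1.25)       # conventional limits (blue curve)
p_wide <- p_pass(theta0 = 1.25, U = 1 / 0.75)   # widened limits (red curve)
print(round(c(conventional = p_conv, widened = p_wide), 4))
```

At the conventional limit the probability of passing is the nominal 0.05 by construction; with the widened limits it is several times larger.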

Now for the tricky part (sorry): In the previous posts we estimated a TIE of ~20% (instead of the nominal 5%). Why not 40%? Say the true CVwR is 30%; by convention the drug is then considered not highly variable. Hence, we would have to apply the conventional limits. There is a ~50% chance that in the actual study we will observe a CVwR >30%, misclassify the drug/formulation as highly variable, and apply the widened limits. But there is also a ~50% chance that we observe a CVwR ≤30% and use the conventional limits. Both together give the TIE of ~20% (actually it is more complicated*).
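
As back-of-the-envelope arithmetic, the two branches can be weighted with the round numbers given above. This is only a rough sketch; it treats the classification and the BE decision as independent, which they are not exactly.

```r
# Round numbers taken from the explanation above, not exact values
p_widened <- 0.50   # chance to observe CVwR > 30% if the true CVwR is 30%
p_pass_w  <- 0.40   # chance to pass at theta0 = 1.25 with limits 75.00-133.33%
p_pass_c  <- 0.05   # chance to pass at theta0 = 1.25 with limits 80.00-125.00%

# Law of total probability over the two possible classifications
TIE_rough <- p_widened * p_pass_w + (1 - p_widened) * p_pass_c
print(TIE_rough)
```

The weighted average is in the same region as the simulated ~20.5%.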

What if we estimate the sample size for the GCC’s approach? With 28 it will be lower (second R script).

[image]

With a CV of 30% we have to use the conventional limits, but the chance of passing at \(\small{\theta_0=0.80}\) and \(\small{\theta_0=1.25}\) is ~0.16 instead of 0.05: we fell into the trap of an inflated type I error. Note that the TIE also depends on the sample size. Hence, it is smaller than with 40 subjects.

If you think about iteratively adjusted α like for the reference-scaling methods (third R script), here are examples for the 2x2x4 design (sample sizes estimated for the GCC’s approach):

CV = 0.150, n = 12 (power = 0.8496)
TIE = 0.050148: inflated (α-adjustment necessary).
 2 iterations: adjusted α = 0.049854 (90.029% CI), TIE = 0.05 (power = 0.8493).
CV = 0.200, n = 18 (power = 0.8015)
TIE = 0.050910: inflated (α-adjustment necessary).
 7 iterations: adjusted α = 0.049082 (90.184% CI), TIE = 0.05 (power = 0.7988).
CV = 0.250, n = 28 (power = 0.8220)
TIE = 0.072558: inflated (α-adjustment necessary).
 5 iterations: adjusted α = 0.032075 (93.585% CI), TIE = 0.05 (power = 0.7644).
CV = 0.300, n = 28 (power = 0.8079)
TIE = 0.164829: inflated (α-adjustment necessary).
 8 iterations: adjusted α = 0.009071 (98.186% CI), TIE = 0.05 (power = 0.5859).
CV = 0.301, n = 28 (power = 0.8085)
TIE = 0.022390: not inflated (no α-adjustment necessary).


However, we could face a substantial loss in power (for CV 30% and the adjusted α of ~0.0091 it would drop from 81% to 59%).

[image]



1. Power-curves for fixed limits

library(PowerTOST)
design <- "2x2x4" # any replicate - see known.designs()
CV     <- 0.30
target <- 0.80
theta0 <- 0.90
n      <- sampleN.TOST(CV = CV, theta0 = theta0, targetpower = target,
                       design = design, details = FALSE,
                       print = FALSE)[["Sample size"]]
theta0 <- c(theta0, seq(0.75*0.96, 1, length.out = 60))
theta0 <- sort(unique(c(theta0, 0.8, 1, 1.25, 1/theta0)))
res    <- data.frame(theta0 = theta0)
for (j in seq_along(theta0)) {
  res$pwr1[j] <- power.TOST(CV = CV, n = n, theta0 = theta0[j],
                            theta1 = 0.80, design = design)
  res$pwr2[j] <- power.TOST(CV = CV, n = n, theta0 = theta0[j],
                            theta1 = 0.75, design = design)
}
if (is.null(dev.list())) # no graphics device open yet
  dev.new(width = 4.5, height = 3.3) # portable replacement for windows()
op  <- par(no.readonly = TRUE)
par(mar = c(4, 4, 0, 0) + 0.1, cex.axis = 0.8)
plot(log(theta0), res$pwr1, type = "n", axes = FALSE, yaxs = "i",
     xlim = log(c(0.75, 1/0.75)), ylim = c(0, 1.04),
     xlab = expression(theta[0]), ylab = "power")
axis(2, las = 1)
axis(2, at = 0.05, las = 1)
axis(1, at = log(c(0.75, seq(0.8, 1.2, 0.1), 1.25, 1/0.75)),
     labels = sprintf("%.2f", c(0.75, seq(0.8, 1.2, 0.1), 1.25, 1/0.75)))
abline(h = pretty(c(0, 1)), lty = 3, col = "lightgrey")
abline(h = 0.05, lty = 2)
abline(v = log(c(0.9, 1, 1.1, 1.2)), lty = 3, col = "lightgrey")
abline(v = log(c(0.75, 0.8, 1.25, 1/0.75)), lty = 2)
box()
lines(log(theta0), res$pwr1, lwd = 3, col = "blue")
lines(log(theta0), res$pwr2, lwd = 3, col = "red")
par(op)


2. Power curves for sample sizes acc. to the GCC’s approach

library(PowerTOST)
design <- "2x2x4" # only "2x2x4", "2x2x3", "2x3x3" implemented!
CV     <- 0.30
target <- 0.80
theta0 <- 0.90
reg    <- reg_const("user", r_const = log(1/0.75)/CV2se(0.3),
                    CVswitch = 0.3, CVcap = 0.3, pe_const = TRUE)
n      <- sampleN.scABEL(CV = CV, theta0 = theta0, targetpower = target,
                         design = design, regulator = reg, details = FALSE,
                         print = FALSE)[["Sample size"]]
theta0 <- c(theta0, seq(0.75*0.96, 1, length.out = 60))
theta0 <- sort(unique(c(theta0, 0.8, 1, 1.25, 1/theta0)))
res    <- data.frame(theta0 = theta0)
for (j in seq_along(theta0)) {
  res$pwr[j] <- power.scABEL(CV = CV, n = n, theta0 = theta0[j],
                             theta1 = 0.80, design = design,
                             regulator = reg)
}
if (is.null(dev.list())) # no graphics device open yet
  dev.new(width = 4.5, height = 3.3) # portable replacement for windows()
op  <- par(no.readonly = TRUE)
par(mar = c(4, 4, 0, 0) + 0.1, cex.axis = 0.8)
plot(log(theta0), res$pwr, type = "n", axes = FALSE, yaxs = "i",
     xlim = log(c(0.75, 1/0.75)), ylim = c(0, 1.04),
     xlab = expression(theta[0]), ylab = "power")
axis(2, las = 1)
axis(2, at = 0.05, las = 1)
axis(1, at = log(c(0.75, seq(0.8, 1.2, 0.1), 1.25, 1/0.75)),
     labels = sprintf("%.2f", c(0.75, seq(0.8, 1.2, 0.1), 1.25, 1/0.75)))
abline(h = pretty(c(0, 1)), lty = 3, col = "lightgrey")
abline(h = 0.05, lty = 2)
abline(v = log(c(0.9, 1, 1.1, 1.2)), lty = 3, col = "lightgrey")
abline(v = log(c(0.8, 1.25)), lty = 2)
if (CV > 0.3) abline(v = log(c(0.75, 1/0.75)), lty = 2)
box()
lines(log(theta0), res$pwr, lwd = 3, col = "blue")
par(op)


3. Iteratively adjusted α for the GCC’s approach

library(PowerTOST)
opt <- function(x) {
  if (sdsims) {
    power.scABEL.sds(alpha = x, CV = CV, n = n, theta0 = U, design = design,
                     regulator = reg, nsims = 1e6, setseed = setseed,
                     progress = FALSE) - alpha
  } else {
    power.scABEL(alpha = x, CV = CV, n = n, theta0 = U, design = design,
                 regulator = reg, nsims = 1e6, setseed = setseed) - alpha
  }
}
design  <- "2x2x4" # only "2x2x4", "2x2x3", "2x3x3" implemented!
alpha   <- 0.05    # nominal level of the test
CV      <- 0.30    # can be a 2-element vector: CV[1] for T, CV[2] for R
theta0  <- 0.90    # assumed T/R-ratio
target  <- 0.80    # target (desired) power
setseed <- TRUE    # for reproducibility
sdsims  <- FALSE   # set to TRUE for partial replicate (much slower)
reg     <- reg_const("user", r_const = log(1/0.75)/CV2se(0.3),
                     CVswitch = 0.3, CVcap = 0.3, pe_const = TRUE)
# power and sample size for the GCC’s approach
x       <- sampleN.scABEL(CV = CV, theta0 = theta0, design = design,
                          targetpower = target, regulator = reg,
                          details = FALSE, print = FALSE)
power   <- x[["Achieved power"]]
n       <- x[["Sample size"]]
U       <- scABEL(CV = CV, regulator = reg)[["upper"]]
# 'power' keeps the achieved power of the GCC approach from sampleN.scABEL()
if (!sdsims) {     # simulate underlying statistics
  TIE <- power.scABEL(CV = CV, n = n, theta0 = U, design = design,
                      regulator = reg, nsims = 1e6)
} else {           # simulate subject data
  TIE <- power.scABEL.sds(CV = CV, n = n, theta0 = U, design = design,
                          regulator = reg, nsims = 1e6, progress = FALSE)
}
txt <- paste0("CV = ", sprintf("%.3f", CV), ", n = ", n,
              " (power = ", sprintf("%.4f)", power))
if (TIE <= alpha) {
  txt <- paste0(txt, "\nTIE = ", sprintf("%.6f", TIE), ": not inflated ",
                "(no \u03B1-adjustment necessary).\n")
} else {
  txt <- paste0(txt, "\nTIE = ", sprintf("%.6f", TIE), ": inflated ",
                "(\u03B1-adjustment necessary).")
  x   <- uniroot(opt, interval = c(0, alpha), tol = 1e-8)
  alpha.adj <- x$root
  TIE.adj   <- alpha + x$f.root
  power.adj <- power.scABEL(alpha = alpha.adj, CV = CV, n = n, theta0 = theta0,
                            design = design, regulator = reg, nsims = 1e6)
  txt <- paste0(txt, "\n", sprintf("%2i", x$iter), " iterations: ",
                "adjusted \u03B1 = ", sprintf("%.6f", alpha.adj), " (",
                sprintf("%.3f%%", 100*(1-2*alpha.adj)), " CI), TIE = ",
                TIE.adj, " (power = ", sprintf("%.4f)", power.adj), ".\n")
}
cat(txt)



  * The estimate of \(\small{\theta_0}\) follows a lognormal distribution and the variance estimate underlying \(\small{CV_\textrm{wR}}\) a (scaled) \(\small{\chi^2}\) distribution. Both distributions are not symmetric but skewed to the right:
    [image]
    Hence, at a true \(\small{\theta_0}\) of 1.25 and a true \(\small{CV_\textrm{wR}}\) of 30%, the chance in a particular study to falsely classify the drug as highly variable (based on the observed \(\small{CV_\textrm{wR}}\)) and to proceed with the widened limits is close to, but not exactly, 50%.

Edit: We implemented regulator = "GCC" in version 1.5-3 (2021-01-18) of PowerTOST. Example:

library(PowerTOST)
design  <- "2x2x4"
CV      <- 0.30
theta0  <- 0.90
target  <- 0.80
n       <- sampleN.scABEL(CV = CV, theta0 = theta0, design = design,
                          targetpower = target, regulator = "GCC",
                          details = FALSE, print = FALSE)[["Sample size"]]
scABEL.ad(CV = CV, theta0 = theta0, design = design, n = n,
          regulator = "GCC", details = TRUE)

+++++++++++ scaled (widened) ABEL ++++++++++++
         iteratively adjusted alpha
   (simulations based on ANOVA evaluation)
----------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
1,000,000 studies in each iteration simulated.

CVwR 0.3, CVwT 0.3, n(i) 14|14 (N 28)
Nominal alpha                 : 0.05
True ratio                    : 0.9000
Regulatory settings           : GCC (ABE)
Switching CVwR                : 0.3
BE limits                     : 0.8000 ... 1.2500
PE constraints                : 0.8000 ... 1.2500
Empiric TIE for alpha 0.0500  : 0.16483 (rel. change of risk: +230%)
Power for theta0 0.9000       : 0.808
Iteratively adjusted alpha    : 0.00907
Empiric TIE for adjusted alpha: 0.05000
Power for theta0 0.9000       : 0.586 (rel. impact: -27.5%)

Runtime    : 6.6 seconds
Simulations: 8,100,000 (7 iterations)

The impact on power is massive. Which sample size would we need to maintain the target power?

sampleN.scABEL.ad(CV = CV, theta0 = theta0, design = design,
                  targetpower = target, regulator = "GCC",
                  details = TRUE)

+++++++++++ scaled (widened) ABEL ++++++++++++
            Sample size estimation
        for iteratively adjusted alpha
   (simulations based on ANOVA evaluation)
----------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
1,000,000 studies in each iteration simulated.

Assumed CVwR 0.3, CVwT 0.3
Nominal alpha      : 0.05
True ratio         : 0.9000
Target power       : 0.8
Regulatory settings: GCC (ABE)
Switching CVwR     : 0.3
BE limits          : 0.8000 ... 1.2500
PE constraints     : 0.8000 ... 1.2500

n  28, nomin. alpha: 0.05000 (power 0.8079), TIE: 0.1648

Sample size search and iteratively adjusting alpha
n  28,   adj. alpha: 0.00907 (power 0.5859), rel. impact on power: -27.48%
n  48,   adj. alpha: 0.00343 (power 0.7237)
n  46,   adj. alpha: 0.00376 (power 0.7136)
n  48,   adj. alpha: 0.00343 (power 0.7237)
n  50,   adj. alpha: 0.00313 (power 0.7330)
n  52,   adj. alpha: 0.00283 (power 0.7402)
n  54,   adj. alpha: 0.00258 (power 0.7490)
n  56,   adj. alpha: 0.00233 (power 0.7554)
n  58,   adj. alpha: 0.00215 (power 0.7641)
n  60,   adj. alpha: 0.00198 (power 0.7703)
n  62,   adj. alpha: 0.00180 (power 0.7789)
n  64,   adj. alpha: 0.00164 (power 0.7851)
n  66,   adj. alpha: 0.00152 (power 0.7909)
n  68,   adj. alpha: 0.00138 (power 0.7958)
n  70,   adj. alpha: 0.00126 (power 0.8010), TIE: 0.05000
Compared to nominal alpha's sample size increase of 150.0% (~study costs).

Runtime    : 96.4 seconds
Simulations: 120,700,000

Oops! Since the TIE depends on the sample size itself (see the plots in this post), we have to adjust more.

Dif-tor heh smusma 🖖
Helmut Schütz
wienui
★    

Germany, Oman,
2020-11-30 03:30
(367 d 21:34 ago)

@ Helmut
Posting: # 22098
Views: 3,192
 

 Inflated type I error with fixed widened limits

Dear Detlew & Helmut,

Thank you for the brilliant explanation.

» » May be Helmut has a better (paedogogical) explanation because he is an exceptionally gifted preacher :cool:.
» Only you say so. ;-) I’ll try.

No wonder that both of you deserve the title of a great preacher.

Cheers,
Osama
Helmut
★★★
Vienna, Austria,
2020-11-30 14:42
(367 d 10:21 ago)

@ wienui
Posting: # 22099
Views: 3,107
 

 Houston, we have a problem!

Dear Osama,

» Thank you for the brilliant explanation.

You are welcome! In the meantime I added more stuff to my post.
If my interpretation of the GL is correct (is it?) and applied as such by members of the GCC, we have a problem if the CVwR observed in the study is ≤30%.

What could be done?
  • Regulatory side
    • If high variability is suspected by the applicant, allow pre-specified wider limits for Cmax irrespective of the observed CV, as currently in South Africa and as accepted by the EMEA prior to 2006.¹ This does not even require a replicate design.
    • Implement ABEL instead. In line not only with the EMA but many other jurisdictions (the WHO, ASEAN States, Australia, Brazil, Canada, Chile, the East African Community, Egypt, the Eurasian Economic Union, New Zealand, the Russian Federation). A step towards global harmonization. Lower inflation of the type I error if CVwR ≤30% than with the current approach. However, inflation of the TIE also if CVwR >30% (up to ~45%), whereas there is none in the current approach. Many publications deal with the issue; iteratively adjusting α is provided by PowerTOST’s function scABEL.ad(). Sample size estimation to compensate for the potential loss of power is provided by the function sampleN.scABEL.ad().
    • Implementing RSABE (USA, China) would not be a good idea. Nasty inflation of the TIE if CVwR <30% as well…
  • Applicant’s side
    • Ask the authority whether ABEL is an acceptable alternative to the current approach. Is it already?
    • If not, adjust α with my R script. Be aware of the potential loss in power!
      Maybe (‼) I will implement it in scABEL.ad() and sampleN.scABEL.ad(). No promises.
      Edit: See this post.
  • Utopia
    • Within the last ten years many replicate studies were performed. Hence, we simply know a good number of drugs / drug products which are highly variable and pose no safety concerns. Sometimes entire classes of drugs are highly variable (e.g., proton-pump inhibitors). Agencies could simply recommend widened limits in product-specific guidelines. No clinical justification² needed by applicants, no replicate design needed, no issues with inflation of the TIE. Sigh.

  1. In the EU lots of accepted studies with 75.00–133.33%. Prior to 2001 limits of 70–143% were not uncommon for Cmax. Sometimes even for AUC…
  2. An often overlooked detail. Regularly difficult to provide for generic companies with no access to the originator’s data. Generally just a lot of blah-blah in the protocol.

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Astea
★★  

Russia,
2020-12-01 23:34
(366 d 01:30 ago)

@ Helmut
Posting: # 22100
Views: 2,886
 

 Paradox of tolerance

Dear Preachers!

You've discovered a truly interesting feature!
But I have some doubts about the logical equivalence of the inflation of the TIE and the consumer's risk. Can you please point out the faults in the following reasoning?

Suppose we expect drug A to be highly variable (in the previous study somewhere in Antarctica W. Oodendijk et al. have got CV>30% for the reference drug). Which of the following options should we prefer to write in the protocol in order to care of the customer:

a). Use pre-specified wider limits 75-133 for Cmax (no inflation?)
b). Use the GCC-GL approach (inflation up to 21%?)

Suppose that at the end of the trial we get CV≤30% and CI within 75-133, but out of 80-125.
Then for the a-approach we should conclude the drug BE, for the b-approach - fail to conclude BE.
That is the risk of the customer to get a bad product is higher in the first approach if we define "a bad product" as a non-HVD with the limits out of 80-125.
The difference is in the fact that in the first approach we proclaim the drug to be good if it is within the limits 75-133.

Until about 2013 there were a lot of studies in Russia with 75-133 limits for Cmax even for non-HVD drugs.

"Being in minority, even a minority of one, did not make you mad"
Helmut
★★★
Vienna, Austria,
2020-12-02 01:37
(365 d 23:27 ago)

@ Astea
Posting: # 22101
Views: 2,879
 

 Οὐτοπεία ∨ Εὐτοπεία

Hi Nastia

» […] I have some doubts about the logical equivalence of the inflation of the TIE and the consumer's risk. Can you please point out the faults in the following reasoning?
»
» Suppose we expect drug A to be highly variable (in the previous study somewhere in Antarctica W. Oodendijk et al. have got CV>30% for the reference drug).

The problem starts already here. How reliable is Oodendijk’s result? Is it the only one? Did the agency agree that the drug is HV and wider limits can be used?

» Which of the following options should we prefer to write in the protocol in order to care of the customer:
»
» a). Use pre-specified wider limits 75-133 for Cmax (no inflation?)
» b). Use the GCC-GL approach (inflation up to 21%?)

The crucial point is what we consider a “clinically not relevant \(\small{\Delta}\)”. Muse a bit on these goodies:
$$\small{\Delta=20\%\implies\left\{\theta_1=80.00\%,\,\theta_2=125.00\%\right\}}\tag{1}$$
$$\small{\Delta=25\%\implies\left\{\theta_1=75.00\%,\,\theta_2=133.3\dot{3}\%\right\}}\tag{a}$$
$$\small{\Delta\:\overset{{\color{Red} ?}}{\rightarrow}\,\begin{vmatrix}
\widehat{CV_\textrm{wR}}\leq30\%\rightarrow \Delta=20\%\\
\widehat{CV_\textrm{wR}}>30\%\rightarrow \Delta=25\%
\end{vmatrix}\implies\begin{Bmatrix}
\theta_1=80.00\%,\,\theta_2=125.00\%\\
\theta_1=75.00\%,\,\theta_2=133.3\dot{3}\%
\end{Bmatrix}}\tag{b}$$
\(\small{(1)}\) and \(\small{(\textrm{a})}\) are straightforward: fixed limits, type I error always ≤ the nominal \(\small{\alpha}\).
\(\small{(\textrm{b})}\) is data-driven (like ABEL and RSABE), since it depends on the estimated \(\small{CV_\textrm{wR}}\). The Null-hypothesis is like Schrödinger’s cat – or Wigner’s friend, if you prefer. The study (not the applicant and regulator on clinical grounds, as in \(\small{(1)}\) and \(\small{(\textrm{a})}\)) “decides” which \(\small{\Delta}\) is acceptable for the patient. That’s not a particularly good idea. By definition (‼) any data-driven framework (or a pre-test) might lead to a false decision and hence inflate the TIE. That’s a multiplicity problem, which – if not adjusted for – will increase the familywise error rate.
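
To spell the mechanism out (a sketch in my own notation, not taken from the guideline): let \(\small{A}\) be the event \(\small{\widehat{CV_\textrm{wR}}>30\%}\) and assume that the true \(\small{CV_\textrm{wR}\leq30\%}\), so that the Null is located at \(\small{\theta_0=125\%}\). By the law of total probability the type I error of \(\small{(\textrm{b})}\) splits into
$$\small{TIE=\underbrace{P\left(\textrm{pass}_{\,80-125}\cap\overline{A}\right)}_{\leq\,\alpha}+\underbrace{P\left(\textrm{pass}_{\,75-133}\cap A\right)}_{\textrm{not bounded by }\alpha}}$$
The first term cannot exceed the TIE of TOST with fixed conventional limits. In the second term \(\small{\theta_0=125\%}\) lies well inside the widened limits; once the CV is misclassified, passing is held back only by the requirement that the point estimate lies within 80.00–125.00%.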

» Suppose that at the end of the trial we get CV≤30% and CI within 75-133, but out of 80-125.
» Then for the a-approach we should conclude the drug BE, …

If (a) was stated in the protocol and accepted by the agency, fine. The CV is interesting though not relevant. Try the function CVCL() in PowerTOST. An observed CV ≤30% might be pure chance; its confidence interval may well include values >30%.

» … for the b-approach - fail to conclude BE.

No risk, no fun.

» That is the risk of the customer to get a bad product is higher in the first approach if we define "a bad product" as a non-HVD with the limits out of 80-125.

Nope. In (a) you accept beforehand that \(\small{\Delta=25\%}\) is not relevant for the patient. But again: You don’t assess the CV at all. Maybe it is HV indeed (like in Antarctica).

» The difference is in the fact that in the first approach we proclaim the drug to be good if it is within the limits 75-133.

Correct.

» Until about 2013 there were a lot of studies in Russia with 75-133 limits for Cmax even for non-HVD drugs.

Interesting. However:

library(PowerTOST)
CVCL(CV = 0.27, df = 3*40-4, side = "2-sided") # 4-period full replicate, n = 40

 lower CL  upper CL
0.2383666 0.3116013


Dif-tor heh smusma 🖖
Helmut Schütz
Astea
★★  

Russia,
2020-12-02 09:46
(365 d 15:18 ago)

@ Helmut
Posting: # 22102
Views: 2,833
 

 Uchronia

Dear Helmut!

Thank you for the prompt answer!

» The problem starts already here. How reliable is Oodendijk’s result? Is it the only one?

The reliability of someone else's data - that is the question (especially when one of the authors says "waouf" :-)

» The crucial point is what we consider a “clinically not relevant \(\small{\Delta}\)”

As long as we (and the Agency) declare 25% to be clinically not relevant, there is no difference in the risk of harm to the customer's health between approaches (a) and (b). With approach (b) he will simply receive a drug that is no worse, or not receive it at all.

» Try the function CVCL() in PowerTOST.

I try:

library(PowerTOST)
CVCL(CV = 0.3, df = 3*40-4, side = "2-sided")
 lower CL  upper CL
0.2646219 0.3466708


As CI is shifted to the right - does it mean that for these initial conditions the probability of the conclusion of HV is higher?
(By the way shouldn't we lower the degrees of freedom for the CV of the reference drug? 3*40-3 should correspond to the common CV of the Test and Reference, shouldn't it?)

"Being in minority, even a minority of one, did not make you mad"
Helmut
★★★
Vienna, Austria,
2020-12-02 11:34
(365 d 13:30 ago)

@ Astea
Posting: # 22103
Views: 2,816
 

 Steampunk

Hi Nastia,

» » The problem starts already here. How reliable is Oodendijk’s result? Is it the only one?
»
» The reliability of someone else's data - that is the question (especially when one of the authors says "waouf" :-)

Willard Oodendijk twittered and Nemo Macron said “waouf”.

» » The crucial point is what we consider a “clinically not relevant \(\small{\Delta}\)”
»
» As long as we (and the Agency) declare 25% to be clinically not relevant, there is no difference in the risk of harm to the customer's health between approaches (a) and (b). With approach (b) he will simply receive a drug that is no worse, or not receive it at all.

Here you err. In (a) all is good. In (b) everything is in flux; the applicant and agency agree only that the acceptable risk may be either 20% or 25%.
We are dealing with average BE. Classifying HVD(P)s based on CVwR is fine in principle. However, once we make this classification post hoc (based on \(\small{\widehat{CV_\textrm{wR}}}\)), the trouble starts. Hence, I like neither the reference-scaling methods nor (b).*

» » Try the function CVCL() in PowerTOST.
» I try:
»
» library(PowerTOST)
» CVCL(CV = 0.3, df = 3*40-4, side = "2-sided")
»  lower CL  upper CL
» 0.2646219 0.3466708


» As CI is shifted to the right …

Skewed to the right because the variance follows a \(\small{\chi^2}\)-distribution.
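
For the record, the standard normal-theory interval behind this (symbols are mine; \(\small{s^2}\) is the estimated within-subject variance on the log scale, \(\small{df}\) its degrees of freedom, and \(\small{\chi^2_{p,\,df}}\) the \(\small{p}\)-quantile):
$$\small{\frac{df\cdot s^2}{\chi^2_{1-\alpha/2,\,df}}\leq\sigma^2\leq\frac{df\cdot s^2}{\chi^2_{\alpha/2,\,df}},\qquad CV=\sqrt{e^{\sigma^2}-1}}$$
Since the \(\small{\chi^2}\)-distribution is skewed to the right, the upper limit lies farther from the point estimate than the lower one.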

» … does it mean that for these initial conditions the probability of the conclusion of HV is higher?

Yes (for any condition).

» (By the way shouldn't we lower the degrees of freedom for the CV of the reference drug? 3*40-3 should correspond to the common CV of the Test and Reference, shouldn't it?)

Oops, one more degree of freedom! In the 2-sequence 4-period replicate design we have df = 3n – 4 for the pooled CVw. Following the EMA’s model for the estimation of CVwR we have one factor (the treatment) less in the model and therefore, df = 3n – 3:

library(PowerTOST)
CVCL(CV = 0.3, df = 3*40-3, side = "2-sided")
 lower CL  upper CL
0.2647549 0.3464397



  * Not for an initiate like you, but for others:
    • Such a study is not bijective, unlike one assessed for ABE. Whereas in ABE we could reverse the procedure (if T ≈ R, then also R ≈ T), this is highly unlikely here (only if CVwR ≡ CVwT).
    • In ABE every application has to follow the same rules and \(\small{\Delta}\) is known. Here every study sets its own rule. The BE-limits and hence \(\small{\Delta}\) are random variables. Without access to the study report, patients and physicians don’t know the risk.

Dif-tor heh smusma 🖖
Helmut Schütz
Astea
★★  

Russia,
2020-12-02 15:28
(365 d 09:36 ago)

@ Helmut
Posting: # 22104
Views: 2,746
 

 Dieselpunk

Dear Helmut!

Thank you for the explanation!

» Here you err. In (a) all is good. In (b) everything is in a flux; the applicant and agency agree only that the acceptable risk may be either 20% or 25%.

I just meant to say that the acceptable risk of either 20% or 25% is in any case less than or equal to 25%.

» Oops, one more degree of freedom! In the 2-sequence 4-period replicate design we have df = 3n – 4 for the pooled CVw. Following the EMA’s model for the estimation of CVwR we have one factor (the treatment) less in the model and therefore, df = 3n – 3:

How does this df correspond to the residual df of the ANOVA for getting CVwR? I thought that there should be only 40-2=38 degrees of freedom, because from the point of view of the reference drug the full replicate turns into a standard 2-way design. Is that right?

"Being in minority, even a minority of one, did not make you mad"
Helmut
★★★
Vienna, Austria,
2020-12-02 16:18
(365 d 08:46 ago)

@ Astea
Posting: # 22105
Views: 2,739
 

 Dieselpunk

Hi Nastia,

» How does this df correspond to the residual df of the ANOVA for getting CVwR? I thought that there should be only 40-2=38 degrees of freedom, because from the point of view of the reference drug the full replicate turns into a standard 2-way design. Is that right?

I stand corrected!

library(PowerTOST)
CVCL(CV = 0.3, df = 40-2, side = "2-sided")
 lower CL  upper CL
0.2434049 0.3922851
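
To see how much the choice of df matters, here is a minimal base-R sketch. The helper cv_ci() is hypothetical (my name, not a PowerTOST function); it assumes the usual χ²-based interval for the log-scale variance, which reproduces the CVCL() results quoted in this thread for df = 38.

```r
# Hypothetical helper: chi-squared confidence interval of a CV
# (assumed to mirror what PowerTOST's CVCL() reports for side = "2-sided")
cv_ci <- function(cv, df, alpha = 0.05) {
  s2  <- log(cv^2 + 1)                                      # CV -> log-scale variance
  lim <- s2 * df / qchisq(c(1 - alpha / 2, alpha / 2), df)  # lower, upper variance
  setNames(sqrt(exp(lim) - 1), c("lower CL", "upper CL"))   # variance -> CV
}

n   <- 40
dfs <- c("3n-4" = 3 * n - 4, "3n-3" = 3 * n - 3, "n-2" = n - 2)
res <- t(sapply(dfs, function(d) cv_ci(0.30, d)))           # one row per df choice
print(round(res, 4))
```

With only n – 2 degrees of freedom the interval is considerably wider, so an observed CVwR of 30% tells us much less about the true one than the first two attempts suggested.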


Dif-tor heh smusma 🖖
Helmut Schütz
d_labes
★★★

Berlin, Germany,
2020-12-02 19:30
(365 d 05:34 ago)

@ Helmut
Posting: # 22106
Views: 2,716
 

 Steampunk? OT

Dear both,

» Willard Oodendijk twittered and Nemo Macron said “waouf”.

forgive me old fart, but who is Mr. Oodendijk? Should I know him? If yes, why? Enlighten me, please.
Maybe I could learn something new in my old age.

Regards,

Detlew
Astea
★★  

Russia,
2020-12-02 19:58
(365 d 05:06 ago)

@ d_labes
Posting: # 22107
Views: 2,706
 

 Steampunk? OT

Dear Detlew!

» forgive me old fart, but who is Mr. Oodendijk? Should I know him? If yes, why? Enlighten me, please.

That guy is from the neighbouring thread ;-)

"Being in minority, even a minority of one, did not make you mad"
Helmut
★★★
Vienna, Austria,
2020-12-23 12:18
(344 d 12:46 ago)

@ wienui
Posting: # 22157
Views: 2,438
 

 PowerTOST 1.5-2.9000 on GitHub

Dear Osama and nerds,

I updated the development version 1.5-2.9000 of PowerTOST on GitHub (it’s not on CRAN yet). If you want to give it a try:

install.packages("remotes")
remotes::install_github("Detlew/PowerTOST")


Some examples in the following. Business as usual (ABE):

sampleN.TOST(CV = 0.29, theta0 = 0.9, design = "2x2x4")

+++++++++++ Equivalence test - TOST +++++++++++
            Sample size estimation
-----------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)

alpha = 0.05, target power = 0.8
BE margins = 0.8 ... 1.25
True ratio = 0.9,  CV = 0.29

Sample size (total)
 n     power
38   0.814460


Note that we are close to the switching CVwR 30%. What about ABEL?

sampleN.scABEL(CV = 0.29, theta0 = 0.9, design = "2x2x4",
               regulator = "EMA", details = FALSE)

+++++++++++ scaled (widened) ABEL +++++++++++
            Sample size estimation
   (simulation based on ANOVA evaluation)
---------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
1e+05 studies for each step simulated.

alpha  = 0.05, target power = 0.8
CVw(T) = 0.29; CVw(R) = 0.29
True ratio = 0.9
ABE limits / PE constraint = 0.8 ... 1.25
Regulatory settings: EMA

Sample size
 n     power
34   0.8095

Lower sample size than for ABE but – as usual – we would face an inflated Type I Error of 0.07095.

Using the new argument regulator = "GCC".

sampleN.scABEL(CV = 0.29, theta0 = 0.9, design = "2x2x4",
               regulator = "GCC", details = FALSE)

+++++++++++ scaled (widened) ABEL +++++++++++
            Sample size estimation
   (simulation based on ANOVA evaluation)
---------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
1e+05 studies for each step simulated.

alpha  = 0.05, target power = 0.8
CVw(T) = 0.29; CVw(R) = 0.29
True ratio = 0.9
ABE limits / PE constraint = 0.8 ... 1.25
Regulatory settings: GCC

Sample size
 n     power
28   0.8014

Even lower sample size than for ABEL because sometimes the CV is misclassified and widened limits of 75.00–133.33% are applied. What about the Type I Error for this approach?

power.scABEL(CV = 0.29, theta0 = 1.25, design = "2x2x4",
             n = 28, regulator = "GCC")
[1] 0.14573

Nasty. Due to the misclassification a huge inflation of the TIE (patient’s risk more than twice of ABEL).
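
How often does the misclassification happen? A rough base-R sketch, under the assumption that the reference-only variance estimate follows a scaled χ²-distribution with df = n – 2 in this design. The function p_misclass() is hypothetical and not part of PowerTOST.

```r
# Hypothetical helper: P(estimated CVwR > switching CV | true CVwR),
# assuming s2 * df / sigma2 ~ chi-squared with df = n - 2 (2x2x4 design)
p_misclass <- function(cv_true, n, cv_switch = 0.30) {
  df     <- n - 2
  sigma2 <- log(cv_true^2 + 1)     # true log-scale variance
  s2_cut <- log(cv_switch^2 + 1)   # variance at the switching CV
  pchisq(df * s2_cut / sigma2, df, lower.tail = FALSE)
}

p28 <- p_misclass(0.29, n = 28)
p66 <- p_misclass(0.29, n = 66)
print(round(c(n28 = p28, n66 = p66), 3))
```

This is only the chance of ending up with the widened limits. The TIE is smaller, because such a study additionally has to pass with its CI and point estimate.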
Iteratively adjust α:

scABEL.ad(CV = 0.29, theta0 = 0.9, design = "2x2x4",
          n = 28, regulator = "GCC")

+++++++++++ scaled (widened) ABEL ++++++++++++
         iteratively adjusted alpha
   (simulations based on ANOVA evaluation)
----------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
1,000,000 studies in each iteration simulated.

CVwR 0.29, CVwT 0.29, n(i) 14|14 (N 28)
Nominal alpha                 : 0.05
True ratio                    : 0.9000
Regulatory settings           : GCC (ABE)
Switching CVwR                : 0.3
BE limits                     : 0.8000 ... 1.2500
PE constraints                : 0.8000 ... 1.2500
Empiric TIE for alpha 0.0500  : 0.14573
Power for theta0 0.9000       : 0.801
Iteratively adjusted alpha    : 0.01102
Empiric TIE for adjusted alpha: 0.05000
Power for theta0 0.9000       : 0.602

Substantial loss in power due to evaluation by the 100(1–2×0.01102)=97.796% CI.
Increase the sample size to maintain power (show progress of iterations):

sampleN.scABEL.ad(CV = 0.29, theta0 = 0.9, design = "2x2x4",
                  regulator = "GCC", progress = TRUE, details = TRUE)

+++++++++++ scaled (widened) ABEL ++++++++++++
            Sample size estimation
        for iteratively adjusted alpha
   (simulations based on ANOVA evaluation)
----------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
1,000,000 studies in each iteration simulated.

Assumed CVwR 0.29, CVwT 0.29
Nominal alpha      : 0.05
True ratio         : 0.9000
Target power       : 0.8
Regulatory settings: GCC (ABE)
Switching CVwR     : 0.3
BE limits          : 0.8000 ... 1.2500
PE constraints     : 0.8000 ... 1.2500
Progress of each iteration:

n  28, nomin. alpha: 0.05000 (power 0.8014), TIE: 0.1457

Sample size search and iteratively adjusting alpha
n  28,   adj. alpha: 0.01102 (power 0.6016), rel. impact on power: -24.94%
n  48,   adj. alpha: 0.00488 (power 0.7372)
n  46,   adj. alpha: 0.00529 (power 0.7274)
n  48,   adj. alpha: 0.00488 (power 0.7372)
n  50,   adj. alpha: 0.00452 (power 0.7461)
n  52,   adj. alpha: 0.00419 (power 0.7550)
n  54,   adj. alpha: 0.00389 (power 0.7629)
n  56,   adj. alpha: 0.00359 (power 0.7703)
n  58,   adj. alpha: 0.00333 (power 0.7787)
n  60,   adj. alpha: 0.00312 (power 0.7850)
n  62,   adj. alpha: 0.00289 (power 0.7924)
n  64,   adj. alpha: 0.00268 (power 0.7989)
n  66,   adj. alpha: 0.00251 (power 0.8058), TIE: 0.05000
Compared to nominal alpha's sample size increase of 135.7% (~study costs).

Runtime    : 79 seconds
Simulations: 97,500,000

Note that the TIE depends strongly on the sample size. Hence, in every step we have to adjust α as well. OK, with 66 subjects we achieve the target power, but it comes at a price, namely evaluation by a 99.498% CI…
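
The confidence levels and the sample size penalty follow directly from the outputs above. A few lines of base R for the arithmetic (variable names are mine):

```r
# Adjusted alphas taken from the scABEL.ad() / sampleN.scABEL.ad() outputs above
adj_alpha <- c(n28 = 0.01102, n66 = 0.00251)
ci_level  <- 100 * (1 - 2 * adj_alpha)   # TOST at level alpha <=> 100(1 - 2 alpha)% CI
print(round(ci_level, 3))                # 97.796 99.498

n_increase <- 100 * (66 / 28 - 1)        # relative increase in sample size (%)
print(round(n_increase, 1))              # 135.7
```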

Inspect the plots of this post again. If the true CVwR > 30% it might be misclassified as well, but this time towards the conventional limits, and the TIE is not inflated. Hence, we get the same sample size with

sampleN.scABEL(CV = 0.31, theta0 = 0.9, design = "2x2x4",
               regulator = "GCC", details = FALSE)

+++++++++++ scaled (widened) ABEL +++++++++++
            Sample size estimation
   (simulation based on ANOVA evaluation)
---------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
1e+05 studies for each step simulated.

alpha  = 0.05, target power = 0.8
CVw(T) = 0.31; CVw(R) = 0.31
True ratio = 0.9
ABE limits / PE constraint = 0.8 ... 1.25
Widened limits = 0.75 ... 1.333333
Regulatory settings: GCC

Sample size
 n     power
28   0.8135

and

sampleN.scABEL.ad(CV = 0.31, theta0 = 0.9, design = "2x2x4",
                  regulator = "GCC", details = FALSE)

+++++++++++ scaled (widened) ABEL ++++++++++++
            Sample size estimation
        for iteratively adjusted alpha
   (simulations based on ANOVA evaluation)
----------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
1,000,000 studies in each iteration simulated.

Assumed CVwR 0.31, CVwT 0.31
Nominal alpha      : 0.05
True ratio         : 0.9000
Target power       : 0.8
Regulatory settings: GCC (ABEL)
Switching CVwR     : 0.3
Regulatory constant: 0.97998
Widened limits     : 0.7500 ... 1.3333
PE constraints     : 0.8000 ... 1.2500

n  28, nomin. alpha: 0.05000 (power 0.8135), TIE: 0.0263
No inflation of the TIE expected; hence, no adjustment of alpha required.
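
Where does the “Regulatory constant: 0.97998” come from? One reading that reproduces the number (my back-calculation, not checked against the package sources): it is the constant which makes the scaled limits hit 75.00% and 133.33% exactly at the switching CVwR of 30%. The same recipe with 125% gives the familiar ~0.760 of ABEL.

```r
# Back-calculation of regulatory constants (an assumption, see the text above)
cv2se <- function(cv) sqrt(log(cv^2 + 1))  # CV -> standard deviation on the log scale

k_gcc <- log(1 / 0.75) / cv2se(0.30)       # widened limits 75.00-133.33% at CVwR 30%
k_ema <- log(1.25) / cv2se(0.30)           # conventional limits 80-125% at CVwR 30%
print(round(c(GCC = k_gcc, EMA = k_ema), 5))
```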


Inflation of the Type I error in different approaches (2-sequence, 4-period full replicate designs):

[image]

  1. Conventional ABE with fixed limits never exceeds nominal α. Since TOST is not a most powerful test, for high CVs combined with small sample sizes, the TIE will be below nominal α. All is good.
  2. For the EMA’s ABEL the maximum inflation occurs at CVwR 30%. If CVwR increases, the TIE decreases since the probability of a misclassification decreases as well. From the upper scaling cap at CVwR 50% onwards the limits are fixed and the TIE is driven by the conservatism of TOST – together with the PE-constraint. However, even for very high CVs (not shown) the TIE doesn’t exceed nominal α.
  3. Similar for Health Canada’s ABEL, though the minimum TIE is observed at its upper cap of ~57.38%.
  4. For the GCC’s widened limits there is a huge inflation of the TIE if CVwR ≤30% (the highest of all approaches), with a strong dependency on the sample size. With increasing CVs it behaves like TOST.
  5. Huge inflation of the TIE for the FDA’s RSABE with implied limits if CVwR <30%. Moderate to extremely conservative otherwise. That’s the model all (‼) authors (except those of the FDA¹) considered the applicable one, since products are approved according to this model and not according to item 6.
  6. Lower inflation of the TIE by the FDA’s RSABE “desired consumer risk model”.¹ No more than a mathematical prestidigitation and called even “hocus pocus” by some.² Here the maximum inflation of the TIE occurs at ~25.4%.

  1. Davit BM, Chen ML, Conner DP, Haidar SH, Kim S, Lee CH, Lionberger RA, Makhlouf FT, Nwakama PE, Patel DT, Schuirmann DJ, Yu LX. Implementation of a Reference-Scaled Average Bioequivalence Approach for Highly Variable Generic Drug Products by the US Food and Drug Administration. AAPS J. 2012; 14(4): 915–24. doi:10.1208/s12248-012-9406-x.
  2. Detlew Labes, László Endrényi, myself…
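
The upper caps mentioned in the list above can be translated into limits. A small sketch, assuming the usual ABEL expansion exp(∓k·swR) with k = 0.760; the formula is not quoted in this thread and abel_limits() is a hypothetical helper, not a PowerTOST function.

```r
# Hypothetical helper: expanded limits for a given CVwR
# (plain formula only; ignores that scaling applies above CVwR 30% and up to the cap)
abel_limits <- function(cv_wr, k = 0.760) {
  s_wr <- sqrt(log(cv_wr^2 + 1))             # CVwR -> swR on the log scale
  exp(c(lower = -1, upper = 1) * k * s_wr)
}

lim_ema <- abel_limits(0.50)     # EMA: upper cap at CVwR 50%
lim_hc  <- abel_limits(0.5738)   # Health Canada: upper cap at CVwR ~57.38%
print(round(100 * rbind(EMA = lim_ema, HC = lim_hc), 2))
```

Under these assumptions the caps correspond to maximum limits of roughly 69.84–143.19% (EMA) and 66.7–150.0% (Health Canada).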

Edit 2021-01-18: PowerTOST 1.5-3 on CRAN.

Dif-tor heh smusma 🖖
Helmut Schütz
The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz