Bioequivalence and Bioavailability Forum

Helmut
★★★

Vienna, Austria,
2020-11-25 11:50
(1680 d 13:26 ago)

Posting: # 22084
Views: 13,391

Inflated type I error with fixed widened limits? [RSABE / ABEL]

Dear all,

following this post I had a closer look at the GCC-GL (Version 2.4 of March 2016, page 26):

3.1.10 Highly variable drugs or drug products
Highly variable drug products (HVDP) are those whose intra-subject variability for a parameter is larger than 30%. If an applicant suspects that a drug product can be considered as highly variable in its rate and/or extent of absorption, a replicate cross-over design study can be carried out.
Those HVDP for which a wider difference in C_max is considered clinically irrelevant based on a sound clinical justification can be assessed with a widened acceptance range. If this is the case, a wider acceptance range (i.e. 75–133%) for C_max can be used. For the acceptance interval to be widened the bioequivalence study must be of a replicate design where it has been demonstrated that the within-subject variability for C_max of the reference compound in the study is >30%. The applicant should justify that the calculated intra-subject variability is a reliable estimate and that it is not the result of outliers. The request for widened interval must be prospectively specified in the protocol.
The geometric mean ratio (GMR) should lie within the conventional acceptance range 80.00–125.00%.
The possibility to widen the acceptance criteria based on high intra-subject variability does not apply to AUC where the acceptance range should remain at 80.00 – 125.00% regardless of variability.
It is acceptable to apply either a 3-period or a 4-period crossover scheme in the replicate design study.

What happens if we pre-specify the widened acceptance range and discover in the study that CV_wR ≤30%? Do we then assess the study with the conventional AR of 80.00–125.00%? IMHO, that means we have a data-driven decision – which might be false and result in an inflated type I error.

I performed simulations acc. to my understanding of the GL:

[image]

CV_wR = CV_wT = 30%, balanced 4-period 2-sequence full replicate design (TRTR|RTRT), n = 40 (81% power for θ₀ = 0.90), 10⁶ studies with θ₀ = 1.25: ~20.6% passed…
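
For readers who want to check the order of magnitude without simulating subject data, here is a minimal sketch in base R. It simulates only the key statistics of the study. The assumptions are mine and not taken from the code used in this thread: CV_wT = CV_wR, no subject-by-formulation interaction, df = 3n − 4 for the pooled residual variance and n − 2 for the reference variance, and independent draws for both variance estimates. All variable names are illustrative.

```r
# Sketch only: key-statistics simulation of the GCC decision rule
# (2x2x4 full replicate, balanced). Assumptions are stated in the text above.
set.seed(123456)
nsims  <- 1e5                 # fewer than the 1e6 of the thread, for speed
n      <- 40                  # total sample size
CVw    <- 0.30                # true CVwR = CVwT
theta0 <- 1.25                # true T/R-ratio at the upper conventional limit
alpha  <- 0.05
s2w    <- log(CVw^2 + 1)      # within-subject variance on the log scale
df_e   <- 3 * n - 4           # residual df of the treatment comparison
df_R   <- n - 2               # df of the reference-only variance estimate

pe   <- rnorm(nsims, mean = log(theta0), sd = sqrt(s2w / n))  # log point estimate
mse  <- s2w * rchisq(nsims, df_e) / df_e                      # pooled variance
s2wR <- s2w * rchisq(nsims, df_R) / df_R                      # reference variance
CVwR_hat <- sqrt(exp(s2wR) - 1)                               # observed CVwR

hw    <- qt(1 - alpha, df_e) * sqrt(mse / n)  # half-width of the 90% CI
lower <- pe - hw
upper <- pe + hw

# data-driven choice of the limits: widened only if the observed CVwR > 30%
U <- ifelse(CVwR_hat > 0.30, log(1 / 0.75), log(1.25))
L <- -U

# BE: CI within the chosen limits and point estimate within 80.00-125.00%
pass <- lower >= L & upper <= U & pe >= log(0.80) & pe <= log(1.25)
TIE  <- mean(pass)
TIE
```

Under these assumptions the empiric TIE should land far above the nominal 0.05; it is not expected to reproduce the subject-data result to the last digit.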

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

d_labes
★★★

Berlin, Germany,
2020-11-25 16:07
(1680 d 09:09 ago)

@ Helmut
Posting: # 22085
Views: 11,803

Inflated type I error with fixed widened limits?

Dear Helmut,

❝ What happens if we pre-specify the widened acceptance range and discover in the study that CV_wR ≤30? Assess the study for the conventional AR of 80.00–125.00%? IMHO, that means we have a data-driven decision – which might be false and result in an inflated type I error.

Correct.

❝ I performed simulations acc. to my understanding of the GL:

❝

[image]

❝ CV_wR = CV_wT = 30%, balanced 4-period 2-sequence full replicate design (TRTR|RTRT), n = 40 (81% power for θ₀ = 0.90), 10⁶ studies with θ₀ = 1.25: ~20.6% passed…

Confirmed! At least in magnitude.
My result: 20.46% passed.
Quick and dirty captured code from power.scABEL.sds.

—
Regards,

Detlew

Helmut
★★★

Vienna, Austria,
2020-11-26 01:09
(1680 d 00:08 ago)

@ d_labes
Posting: # 22086
Views: 11,814

Inflated type I error with fixed widened limits?

Dear Detlew,

❝ ❝ […] we have a data-driven decision – which might be false and result in an inflated type I error.

❝

❝ Correct.

Shit.

❝ ❝ I performed simulations acc. to my understanding of the GL: […] 10⁶ studies with θ₀ = 1.25: ~20.6% passed…

❝ Confirmed! At least in magnitude.

❝ My result: 20.46% passed.

❝ Quick and dirty captured code from power.scABEL.sds.

How did you do that? That’s beyond me.

I misused your old subject sim code and added:

library(PowerTOST) # provides CV2se() and mse2CV()
# mean_vcov() and prep_data() are helpers from Detlew's old subject
# simulation code; they are not defined in this post.
ow     <- options("digits")
nsims  <- 1e6
CVT    <- CVR <- 0.3
sWT    <- CV2se(CVT)
sWR    <- CV2se(CVR)
sBT    <- CV2se(CVT*2)
sBR    <- CV2se(CVR*2)
n      <- c(20, 20) # power ~0.81 for theta0 = 0.9
theta0 <- 1.25      # for TIE
options(digits = 12)
set.seed(123456)
mvc    <- mean_vcov(c("TRTR", "RTRT"), muR = log(100), ldiff = log(theta0),
                    sWT = sWT, sWR = sWR, sBT = sBT, sBR = sBR, rho = 1)
res    <- data.frame(CVwR = rep(NA, nsims), PE = NA, lower = NA, upper = NA,
                     L = NA, U = NA, BE = FALSE)
pb     <- txtProgressBar(0, 1, 0, char="\u2588", width=NA, style=3)
st     <- proc.time()[[3]]
for (j in 1:nsims) {
  data <- prep_data(seqs = c("TRTR", "RTRT"), n = n, metric = "PK", dec = 5,
                    mvc_list = mvc)
  data <- data[, !(names(data) %in% c("seqno", "logval"))]
  cols <- c("subject", "period", "sequence", "treatment")
  data[cols] <- lapply(data[cols], factor)
  modBE <- lm(log(PK) ~ sequence + subject + period + treatment, data = data)
  # the 90% CI belongs in columns 3:4 (lower, upper); column 2 is the PE
  res[j, 3:4] <- round(100*as.numeric(exp(confint(modBE, "treatmentT",
                                                  level = 1-2*0.05))), 2)
  res$PE[j]   <- round(100*as.numeric(exp(coef(modBE)[["treatmentT"]])), 2)
  modCVR <- lm(log(PK) ~ sequence + subject + period,
               data = data[data$treatment == "R", ])
  res$CVwR[j] <- mse2CV(anova(modCVR)["Residuals", "Mean Sq"])
  # conventional limits if the observed CVwR <= 30%, widened limits otherwise
  res$L[j] <- if (res$CVwR[j] <= 0.3) 80 else 75
  res$U[j] <- 100/(res$L[j]*0.01)
  if (res$lower[j] >= res$L[j] & res$upper[j] <= res$U[j] &
      res$PE[j] >= 80 & res$PE[j] <= 125) res$BE[j] <- TRUE
  setTxtProgressBar(pb, j/nsims)
}
et <- proc.time()[[3]]
close(pb)
options(ow) # restore options
cat(paste0(j, " subj. simulations with theta0 = ", theta0, ": ",
           signif(sum(res$BE[!is.na(res$L)])/j, 5),
           " passed BE (empiric TIE)\nRuntime: ",
           signif((et-st)/60, 4), " minutes\n"))

And got 20.568% after eleven (!) hours.
PS: Only one of the sim’s failed on the PE restriction.

res[(res$lower >= res$L & res$upper <= res$U) & (res$PE < 80 | res$PE > 125), ]
            CVwR     PE  lower upper  L        U    BE
566300 0.3000039 125.03 117.36 133.2 75 133.3333 FALSE

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


d_labes
★★★

Berlin, Germany,
2020-11-26 16:38
(1679 d 08:38 ago)

@ Helmut
Posting: # 22087
Views: 11,773

Inflated type I error with fixed widened limits

Dear Helmut,

❝ ❝ ❝ […] we have a data-driven decision – which might be false and result in an inflated type I error.

❝ ❝

❝ ❝ Correct.

❝

❝ Shit.

Why shit?

❝ ❝ ❝ I performed simulations acc. to my understanding of the GL: […] 10⁶ studies with θ₀ = 1.25: ~20.6% passed…

❝

❝ ❝ Confirmed! At least in magnitude.

❝ ❝ My result: 20.46% passed.

❝ ❝ Quick and dirty captured code from power.scABEL.sds.

❝

❝ How, did you do that? That’s beyond me.

I stole the code for the subject data simulations and the scaled ABE evaluation from the workhorse function .pwr.SABE.sds().
Implementing the GCC rules instead of the EMA’s rules or the FDA’s RSABE is simple.
Have a look at the code of the new but undocumented function power.fwl.sds() in the GitHub repository.

❝ ...And got 20.568% after eleven (!) hours.

Wow! 11 hours for one number! 10⁶ sims in ca. 4-5 sec with my implementation.
Some results with setseed=FALSE:
power.fwl.sds(CV=0.3, n=40, theta0=0.8, design="2x2x4", setseed=F)
0.2045, 0.2066, 0.2047, 0.2051, 0.2082
Seems we are simulating the same number :cool:

—
Regards,

Detlew

Helmut
★★★

Vienna, Austria,
2020-11-26 18:14
(1679 d 07:03 ago)

@ d_labes
Posting: # 22088
Views: 11,975

Inflated type I error: Not nice

Dear Detlew,

❝ ❝ Shit.

❝

❝ Why shit?

Because it escaped my attention for almost fifteen years… In South Africa fixed limits of 75.00–133.33% may be used for C_max (no replicate design needed).

❝ ❝ ❝ Quick and dirty captured code from power.scABEL.sds.

❝ ❝

❝ ❝ How, did you do that? That’s beyond me.

❝

❝ Stolen the code of subject data sims and scaled ABE evaluation in the working horse function .pwr.SABE.sds().

❝ Have a look into the code of the new but undocumented function power.fwl.sds() in the GitHub repository.

THX!

❝ ❝ ...And got 20.568% after eleven (!) hours.

❝ Wow! 11 hours for one number! 10⁶ sims in ca. 4-5 sec with my implementation.

It is slow going, or as the German proverb has it: the squirrel feeds itself laboriously.

❝ Some results with setseed=FALSE:

❝ power.fwl.sds(CV=0.3, n=40, theta0=0.8, design="2x2x4", setseed=F)

❝ 0.2045, 0.2066, 0.2047, 0.2051, 0.2082

❝ Seems we are simulating the same number :cool: .

Terrible for drugs which are not highly variable (reminds me of the FDA’s RSABE). The design is less important.

[image: 2-sequence 4-period (full) replicates]

[image: 2-sequence 3-period (full) replicates]

[image: 3-sequence 3-period (partial) replicates]

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


d_labes
★★★

Berlin, Germany,
2020-11-28 12:20
(1677 d 12:57 ago)

@ Helmut
Posting: # 22091
Views: 11,603

Power of the GCC framework and power of PowerTOST

Dear Helmut,

❝ Stolen the code of subject data sims and scaled ABE evaluation in the working horse function .pwr.SABE.sds().

❝ Have a look into the code of the new but undocumented function power.fwl.sds() in the GitHub repository.

there is no need to define and program a new function :cool: !
Try the following:

library(PowerTOST)
# define a new regulator object
regGCC <- reg_const("user", r_const = log(1/0.75)/CV2se(0.3), CVswitch = 0.3,
                    CVcap = 0.3, pe_const = TRUE)
regGCC
USER defined regulatory settings
- CVswitch            = 0.3
- cap on scABEL if CVw(R) > 0.3
- regulatory constant = 0.9799758
- pe constraint applied

# verify if the limits are what they should be
scABEL(CV=0.2, regulator = regGCC)
lower upper
 0.80  1.25
scABEL(CV=0.3, regulator = regGCC)
lower upper
 0.80  1.25
scABEL(CV=0.31, regulator = regGCC)
   lower    upper
0.750000 1.333333
scABEL(CV=0.35, regulator = regGCC)
   lower    upper
0.750000 1.333333

# use the function already in the package
power.scABEL.sds(CV=0.3, n=40, design="2x2x4", regulator = regGCC, theta0 = 1.25)
[1] 0.20463

Result matches the previous calculations.
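
Why this regulator object yields fixed widened limits can be sketched in a few lines of base R. The function below is only my reading of the switch-and-cap logic (the names `cv2se()` and `gcc_limits()` are hypothetical and not part of PowerTOST): below or at the switching CV the conventional limits apply; above it the limits are exp(∓ r_const · s) with s capped at the value belonging to CV = 0.3, which by construction of r_const gives 75.00–133.33%.

```r
# Sketch of the limit rule encoded by the user-defined regulator (base R only).
cv2se <- function(cv) sqrt(log(cv^2 + 1))  # CV -> SD on the log scale

gcc_limits <- function(cv, cv_switch = 0.30, cv_cap = 0.30,
                       r_const = log(1 / 0.75) / cv2se(0.30)) {
  # conventional limits up to (and including) the switching CV
  if (cv <= cv_switch) return(c(lower = 0.80, upper = 1.25))
  # above the switch: "scaled" limits, but s is capped at cv2se(cv_cap),
  # so r_const * s is always log(1/0.75)
  s <- cv2se(min(cv, cv_cap))
  c(lower = exp(-r_const * s), upper = exp(r_const * s))
}

gcc_limits(0.20)
gcc_limits(0.30)
gcc_limits(0.31)
gcc_limits(0.50)
```

Because switch and cap coincide, the "scaling" never has room to act; the rule degenerates to a jump from one fixed pair of limits to the other.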

—
Regards,

Detlew

Helmut
★★★

Vienna, Austria,
2020-11-29 12:21
(1676 d 12:55 ago)

@ d_labes
Posting: # 22092
Views: 11,581

Sample sizes (ignoring the inflated TIE)

Dear Detlew,

❝ there is no need to define and program a new function :cool: !

Great!

library(PowerTOST)
design <- "2x2x4"
regGCC <- reg_const("user", r_const = log(1/0.75)/CV2se(0.3),
                    CVswitch = 0.3, CVcap = 0.3, pe_const = TRUE)
CV     <- sort(c(seq(0.2, 0.6, 0.1), 0.301))
res    <- data.frame(CV = CV,
                     L.EMA = scABEL(CV, regulator = "EMA")[, 1],
                     U.EMA = scABEL(CV, regulator = "EMA")[, 2],
                     n.EMA = NA,
                     L.GCC = scABEL(CV, regulator = regGCC)[, 1],
                     U.GCC = scABEL(CV, regulator = regGCC)[, 2],
                     n.GCC = NA)
for (j in 1:nrow(res)) {
  res$n.EMA[j] <- sampleN.scABEL(CV = CV[j], design = design,
                                 regulator = "EMA", details = FALSE,
                                 print = FALSE)[["Sample size"]]
  res$n.GCC[j] <- sampleN.scABEL(CV = CV[j], design = design,
                                 regulator = regGCC, details = FALSE,
                                 print = FALSE)[["Sample size"]]
}
print(res, row.names = FALSE)

    CV     L.EMA    U.EMA n.EMA L.GCC    U.GCC n.GCC
 0.200 0.8000000 1.250000    18  0.80 1.250000    18
 0.300 0.8000000 1.250000    34  0.75 1.333333    28
 0.301 0.7994604 1.250844    34  0.75 1.333333    28
 0.400 0.7461770 1.340165    30  0.75 1.333333    30
 0.500 0.6983678 1.431910    28  0.75 1.333333    42
 0.600 0.6983678 1.431910    32  0.75 1.333333    58

PS: Do we have a bug in scABEL()?

CV  <- c(0.3, 0.3+10^(-6:-2))
lim <- scABEL(CV)
res <- data.frame(CV = CV, L = lim[, 1], U = lim[, 2])
print(res, row.names = FALSE)

       CV         L        U
 0.300000 0.8000000 1.250000
 0.300001 0.8000296 1.249954
 0.300010 0.8000244 1.249962
 0.300100 0.7999731 1.250042
 0.301000 0.7994604 1.250844
 0.310000 0.7943616 1.258873

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


d_labes
★★★

Berlin, Germany,
2020-11-29 17:22
(1676 d 07:55 ago)

(edited on 2020-11-29 17:57)
@ Helmut
Posting: # 22094
Views: 11,481

Bug?

Dear Helmut,

❝ PS: Do we have a bug in scABEL()?

❝ CV <- c(0.3, 0.3+10^(-6:-2))

❝ lim <- scABEL(CV)

❝ res <- data.frame(CV = CV, L = lim[, 1], U = lim[, 2])

❝ print(res, row.names = FALSE)

❝

❝ CV L U

❝ 0.300000 0.8000000 1.250000

❝ 0.300001 0.8000296 1.249954

❝ 0.300010 0.8000244 1.249962

❝ 0.300100 0.7999731 1.250042

❝ 0.301000 0.7994604 1.250844

❝ 0.310000 0.7943616 1.258873

Not a bug :no: , it's a feature :cool: !
Reason: r_const=0.76.
With r_const <- log(1.25)/CV2se(0.3) = 0.7601283... we get

       CV         L        U
 0.300000 0.8000000 1.250000
 0.300001 0.7999994 1.250001
 0.300010 0.7999943 1.250009
 0.300100 0.7999430 1.250089
 0.301000 0.7994302 1.250891
 0.310000 0.7943307 1.258922

Satisfied?
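
The effect of the rounded regulatory constant can be reproduced in base R without the package. This is only a sketch; `cv2se()` and `abel_lower()` are hypothetical helper names, and the formula exp(−r_const · s_wR) for the lower expanded limit is the one implied by the numbers above.

```r
# Sketch: lower expanded limit just above CV = 0.30 for the rounded
# constant 0.76 and for the exact one, log(1.25)/cv2se(0.3).
cv2se      <- function(cv) sqrt(log(cv^2 + 1))        # CV -> SD on the log scale
abel_lower <- function(cv, r_const) exp(-r_const * cv2se(cv))

k_round <- 0.76
k_exact <- log(1.25) / cv2se(0.30)

cv  <- 0.3 + 10^(-6:-2)
out <- data.frame(CV        = cv,
                  L_rounded = abel_lower(cv, k_round),
                  L_exact   = abel_lower(cv, k_exact))
print(out, row.names = FALSE, digits = 7)
```

With the rounded constant the lower limit starts slightly above 0.80 right after the switch and only later drops below it; with the exact constant it is continuous at CV = 0.30.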

—
Regards,

Detlew

Helmut
★★★

Vienna, Austria,
2020-11-30 01:16
(1676 d 00:01 ago)

@ d_labes
Posting: # 22097
Views: 11,445

“Feature”

Dear Detlew,

❝ Not a bug :no: , it's a feature :cool: !

❝ Reason: r_const=0.76.

❝ With r_const <- log(1.25)/CV2se(0.3) = 0.7601283... we get

❝ CV L U

❝ 0.300000 0.8000000 1.250000

❝ 0.300001 0.7999994 1.250001

❝ 0.300010 0.7999943 1.250009

❝ 0.300100 0.7999430 1.250089

❝ 0.301000 0.7994302 1.250891

❝ 0.310000 0.7943307 1.258922

❝ Satisfied?

Fuck! I already guessed that. I hate “nice numbers”.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


wienui
★

Germany/Oman,
2020-11-29 15:03
(1676 d 10:14 ago)

@ Helmut
Posting: # 22093
Views: 11,515

Inflated type I error with fixed widened limits?

Hi Helmut & Detlew,

very interesting. Could you kindly explain this in a bit more detail for someone like me who is not a great statistician: how could this type I error inflation happen?

Thanks in advance

—
Cheers,
Osama

d_labes
★★★

Berlin, Germany,
2020-11-29 18:51
(1676 d 06:25 ago)

@ wienui
Posting: # 22095
Views: 11,534

Inflated type I error with fixed widened limits

Dear Osama,

❝ ... how could this type I error inflation happen?

Imagine you have a true CVwR of 30%. That means your true BE limits should be 80% ... 125%.
Now you perform a study but obtain an estimated CVwR of 35%. Given this, you use the widened (fixed) BE acceptance range of 75% ... 133.33%. With that acceptance range the BE decision is easier to obtain. This also happens if the true theta0 (GMR) is 125% (or 80%), i.e., if the formulations are bioinequivalent. Thus, more than 5% of such studies result in a BE decision, which means there is an alpha inflation.
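
How often such a misclassification occurs can be sketched with the χ² distribution in base R. The assumption here (mine, not stated in the thread) is that the estimated reference variance follows σ²·χ²(df)/df with df = n − 2 as in a 2x2x4 design; `cv2se()` and `p_widen()` are hypothetical helper names.

```r
# Sketch: probability that the observed CVwR exceeds the 30% switch.
cv2se <- function(cv) sqrt(log(cv^2 + 1))  # CV -> SD on the log scale

p_widen <- function(cv_true, n, cv_switch = 0.30) {
  df    <- n - 2                                   # assumed df for s2wR
  ratio <- cv2se(cv_switch)^2 / cv2se(cv_true)^2   # switch / true variance
  # P(s2wR > switch variance) = P(chisq(df) > df * ratio)
  pchisq(df * ratio, df = df, lower.tail = FALSE)
}

p30 <- p_widen(0.30, n = 40)  # true CV exactly at the switch
p25 <- p_widen(0.25, n = 40)  # true CV clearly below the switch
c(p30 = p30, p25 = p25)
```

At a true CV of 30% roughly every second study ends up with the widened limits; the further the true CV lies below 30%, the rarer this becomes.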

Hope this makes sense to you.
Maybe Helmut has a better (pedagogical) explanation because he is an exceptionally gifted preacher :cool:

—
Regards,

Detlew

Helmut
★★★

Vienna, Austria,
2020-11-30 01:14
(1676 d 00:02 ago)

@ d_labes
Posting: # 22096
Views: 11,628

Inflated type I error with fixed widened limits

Dear Detlew & Osama,

❝ ❝ ... how could this type I error inflation happen?

❝ Maybe Helmut has a better (pedagogical) explanation because he is an exceptionally gifted preacher :cool: .

Only you say so. ;-)

I’ll try.

Let’s start with two power curves (first R script at the end). The 2-sequence 4-period replicate study is designed for ABE with conventional limits, 80% power. Since we are wary, we assume a $\small{\theta_0}$ (T/R-ratio) of 0.90 and estimate a sample size of 40. One power curve is for the conventional limits as planned (blue) and one for the widened limits (red).

[image]

The probability of a type I error (i.e., the consumer risk) to be treated with a formulation which is not BE (true $\small{\theta_0}$ outside the limits) is ≤5%. You see that the blue power curve intersects 0.05 at $\small{\theta_0=0.80}$ and $\small{\theta_0=1.25}$. That means also the chance of falsely passing BE is 5%. The same is applicable to South Africa’s approach where we could use pre-specified widened limits independent from the observed CV. Then we have a TIE of 0.05 as well (the red power curve intersects 0.05 at $\small{\theta_0=0.75}$ and $\small{\theta_0=1.\dot{3}}$).

Imagine an even more extreme case than the one Detlew mentioned. We observe in the study a CV_wR of 30.01% although the true one is 30%. That means we will use the widened limits of 75.00–133.33% instead of the correct 80.00–125.00%. With the same $\small{\theta_0=0.80}$ and $\small{\theta_0=1.25}$ we jump to the red power curve and therefore, have a much higher chance of passing (~40% instead of 5%).

Now for the tricky part (sorry): In the previous posts we estimated an inflated TIE of ~20%. Why not 40%? Say, the true CV_wR is 30% and, according to the convention, the drug is considered not highly variable. Hence, we would have to apply the conventional limits. There is a ~50% chance that in the actual study we will observe a CV_wR >30%, misclassify the drug/formulation as highly variable, and apply the widened limits. But there is also a ~50% chance that we observe a CV_wR ≤30% and use the conventional limits. Both together give the TIE of ~20% (actually it is more complicated*).
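
As a back-of-the-envelope check of these numbers (a sketch that treats the classification and the BE test as independent, which they are not exactly), the overall TIE at $\small{\theta_0=1.25}$ is roughly a mixture of the two power curves, with $\small{p}$ the chance of observing a CV_wR >30%:
$$\small{\textrm{TIE}\approx p\cdot 0.40+(1-p)\cdot 0.05,\quad p\approx 0.5\implies\textrm{TIE}\approx 0.20+0.025=0.225}$$
That is in the same ballpark as the simulated ~20%.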

What if we estimate the sample size for the GCC’s approach? With 28 it will be lower (second R script).

[image]

With a CV of 30% we have to use the conventional limits. However, power at $\small{\theta_0=0.80}$ and $\small{\theta_0=1.25}$ will be ~0.16, i.e., we fell into the trap of an inflated type I error. Note that the TIE also depends on the sample size. Hence, it is smaller than with 40 subjects.

If you think about an iteratively adjusted α like for the reference-scaling methods (third R script) – examples for the 2x2x4 design (sample sizes estimated for the GCC’s approach):

CV = 0.150, n = 12 (power = 0.8496)
TIE = 0.050148: inflated (α-adjustment necessary).
 2 iterations: adjusted α = 0.049854 (90.029% CI), TIE = 0.05 (power = 0.8493).

CV = 0.200, n = 18 (power = 0.8015)
TIE = 0.050910: inflated (α-adjustment necessary).
 7 iterations: adjusted α = 0.049082 (90.184% CI), TIE = 0.05 (power = 0.7988).

CV = 0.250, n = 28 (power = 0.8220)
TIE = 0.072558: inflated (α-adjustment necessary).
 5 iterations: adjusted α = 0.032075 (93.585% CI), TIE = 0.05 (power = 0.7644).

CV = 0.300, n = 28 (power = 0.8079)
TIE = 0.164829: inflated (α-adjustment necessary).
 8 iterations: adjusted α = 0.009071 (98.186% CI), TIE = 0.05 (power = 0.5859).

CV = 0.301, n = 28 (power = 0.8085)
TIE = 0.022390: not inflated (no α-adjustment necessary).

However, we could face a substantial loss in power (for CV 30% and the adjusted α of ~0.0091 it would drop from 81% to 59%).
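
The idea behind the iterative adjustment can be sketched in base R: search for the α at which the empiric TIE equals the nominal 0.05, then report the matching 100(1 − 2α)% confidence level. This is a simplified key-statistics simulation under my own assumptions (CV_wT = CV_wR, df = 3n − 4 and n − 2, independent variance draws, 10⁵ instead of 10⁶ studies) and not the PowerTOST implementation; all names are illustrative.

```r
# Sketch: find the adjusted alpha for the GCC rule by root finding.
set.seed(123456)
nsims <- 1e5
n     <- 28                   # sample size of the CV = 0.30 example
CVw   <- 0.30
s2w   <- log(CVw^2 + 1)
df_e  <- 3 * n - 4
df_R  <- n - 2

# draw the random parts once, so that tie() is deterministic in alpha
pe   <- rnorm(nsims, mean = log(1.25), sd = sqrt(s2w / n))  # theta0 = 1.25
se   <- sqrt(s2w * rchisq(nsims, df_e) / df_e / n)
CVwR <- sqrt(exp(s2w * rchisq(nsims, df_R) / df_R) - 1)
U    <- ifelse(CVwR > 0.30, log(1 / 0.75), log(1.25))       # upper limit used

tie <- function(alpha) {
  hw <- qt(1 - alpha, df_e) * se        # half-width of the 100(1-2alpha)% CI
  mean(pe - hw >= -U & pe + hw <= U & abs(pe) <= log(1.25))
}

# alpha at which the empiric TIE hits the nominal level of 0.05
adj       <- uniroot(function(a) tie(a) - 0.05,
                     interval = c(1e-4, 0.05), tol = 1e-8)
alpha_adj <- adj$root
ci_level  <- 100 * (1 - 2 * alpha_adj)  # confidence level to be used instead of 90%
cat(sprintf("TIE at alpha 0.05: %.4f\nadjusted alpha: %.5f (%.3f%% CI)\n",
            tie(0.05), alpha_adj, ci_level))
```

The adjusted α should come out far below 0.05, i.e., the study would have to be assessed with a much wider confidence interval than the usual 90% one.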

[image]

1. Power-curves for fixed limits

library(PowerTOST)
design <- "2x2x4" # any replicate - see known.designs()
CV     <- 0.30
target <- 0.80
theta0 <- 0.90
n      <- sampleN.TOST(CV = CV, theta0 = theta0, targetpower = target,
                       design = design, details = FALSE,
                       print = FALSE)[["Sample size"]]
theta0 <- c(theta0, seq(0.75*0.96, 1, length.out = 60))
theta0 <- sort(unique(c(theta0, 0.8, 1, 1.25, 1/theta0)))
# columns are initialised; assigning res$pwr1[j] to a missing column fails
res    <- data.frame(theta0 = theta0, pwr1 = NA, pwr2 = NA)
for (j in seq_along(theta0)) {
  # conventional limits (80.00-125.00%)
  res$pwr1[j] <- power.TOST(CV = CV, n = n, theta0 = theta0[j],
                            theta1 = 0.80, design = design)
  # widened limits (75.00-133.33%)
  res$pwr2[j] <- power.TOST(CV = CV, n = n, theta0 = theta0[j],
                            theta1 = 0.75, design = design)
}
if (is.null(attr(dev.list(), "names")))
  windows(width = 4.5, height = 3.3, record = TRUE) # Windows only
op <- par(no.readonly = TRUE)
par(mar = c(4, 4, 0, 0) + 0.1, cex.axis = 0.8)
plot(log(theta0), res$pwr1, type = "n", axes = FALSE, yaxs = "i",
     xlim = log(c(0.75, 1/0.75)), ylim = c(0, 1.04),
     xlab = expression(theta[0]), ylab = "power")
axis(2, las = 1)
axis(2, at = 0.05, las = 1)
axis(1, at = log(c(0.75, seq(0.8, 1.2, 0.1), 1.25, 1/0.75)),
     labels = sprintf("%.2f", c(0.75, seq(0.8, 1.2, 0.1), 1.25, 1/0.75)))
abline(h = pretty(c(0, 1)), lty = 3, col = "lightgrey")
abline(h = 0.05, lty = 2)
abline(v = log(c(0.9, 1, 1.1, 1.2)), lty = 3, col = "lightgrey")
abline(v = log(c(0.75, 0.8, 1.25, 1/0.75)), lty = 2)
box()
lines(log(theta0), res$pwr1, lwd = 3, col = "blue")
lines(log(theta0), res$pwr2, lwd = 3, col = "red")
par(op)

2. Power curves for sample sizes acc. to the GCC’s approach

library(PowerTOST)
design <- "2x2x4" # only "2x2x4", "2x2x3", "2x3x3" implemented!
CV     <- 0.30
target <- 0.80
theta0 <- 0.90
reg    <- reg_const("user", r_const = log(1/0.75)/CV2se(0.3),
                    CVswitch = 0.3, CVcap = 0.3, pe_const = TRUE)
n      <- sampleN.scABEL(CV = CV, theta0 = theta0, targetpower = target,
                         design = design, regulator = reg, details = FALSE,
                         print = FALSE)[["Sample size"]]
theta0 <- c(theta0, seq(0.75*0.96, 1, length.out = 60))
theta0 <- sort(unique(c(theta0, 0.8, 1, 1.25, 1/theta0)))
# column is initialised; assigning res$pwr[j] to a missing column fails
res    <- data.frame(theta0 = theta0, pwr = NA)
for (j in seq_along(theta0)) {
  res$pwr[j] <- power.scABEL(CV = CV, n = n, theta0 = theta0[j],
                             theta1 = 0.80, design = design,
                             regulator = reg)
}
if (is.null(attr(dev.list(), "names")))
  windows(width = 4.5, height = 3.3, record = TRUE) # Windows only
op <- par(no.readonly = TRUE)
par(mar = c(4, 4, 0, 0) + 0.1, cex.axis = 0.8)
plot(log(theta0), res$pwr, type = "n", axes = FALSE, yaxs = "i",
     xlim = log(c(0.75, 1/0.75)), ylim = c(0, 1.04),
     xlab = expression(theta[0]), ylab = "power")
axis(2, las = 1)
axis(2, at = 0.05, las = 1)
axis(1, at = log(c(0.75, seq(0.8, 1.2, 0.1), 1.25, 1/0.75)),
     labels = sprintf("%.2f", c(0.75, seq(0.8, 1.2, 0.1), 1.25, 1/0.75)))
abline(h = pretty(c(0, 1)), lty = 3, col = "lightgrey")
abline(h = 0.05, lty = 2)
abline(v = log(c(0.9, 1, 1.1, 1.2)), lty = 3, col = "lightgrey")
abline(v = log(c(0.8, 1.25)), lty = 2)
if (CV > 0.3) abline(v = log(c(0.75, 1/0.75)), lty = 2)
box()
lines(log(theta0), res$pwr, lwd = 3, col = "blue")
par(op)

3. Iteratively adjusted α for the GCC’s approach

library(PowerTOST)
# objective function for uniroot(): empiric TIE minus nominal alpha
opt <- function(x) {
  if (sdsims) {
    power.scABEL.sds(alpha = x, CV = CV, n = n, theta0 = U, design = design,
                     regulator = reg, nsims = 1e6, setseed = setseed,
                     progress = FALSE) - alpha
  } else {
    power.scABEL(alpha = x, CV = CV, n = n, theta0 = U, design = design,
                 regulator = reg, nsims = 1e6, setseed = setseed) - alpha
  }
}
design  <- "2x2x4" # only "2x2x4", "2x2x3", "2x3x3" implemented!
alpha   <- 0.05    # nominal level of the test
CV      <- 0.30    # can be a 2-element vector: CV[1] for T, CV[2] for R
theta0  <- 0.90    # assumed T/R-ratio
target  <- 0.80    # target (desired) power
setseed <- TRUE    # for reproducibility
sdsims  <- FALSE   # set to TRUE for partial replicate (much slower)
reg     <- reg_const("user", r_const = log(1/0.75)/CV2se(0.3),
                     CVswitch = 0.3, CVcap = 0.3, pe_const = TRUE)
# power and sample size for the GCC’s approach
x       <- sampleN.scABEL(CV = CV, theta0 = theta0, design = design,
                          targetpower = target, regulator = reg,
                          details = FALSE, print = FALSE)
# keep the power of the GCC’s approach; overwriting it with power.TOST()
# would report the power of plain ABE and not match the results shown above
power   <- x[["Achieved power"]]
n       <- x[["Sample size"]]
U       <- scABEL(CV = CV, regulator = reg)[["upper"]]
if (!sdsims) { # simulate underlying statistics
  TIE <- power.scABEL(CV = CV, n = n, theta0 = U, design = design,
                      regulator = reg, nsims = 1e6)
} else {       # simulate subject data
  TIE <- power.scABEL.sds(CV = CV, n = n, theta0 = U, design = design,
                          regulator = reg, nsims = 1e6, progress = FALSE)
}
txt <- paste0("CV = ", sprintf("%.3f", CV), ", n = ", n,
              " (power = ", sprintf("%.4f)", power))
if (TIE <= alpha) {
  txt <- paste0(txt, "\nTIE = ", sprintf("%.6f", TIE), ": not inflated ",
                "(no \u03B1-adjustment necessary).\n")
} else {
  txt <- paste0(txt, "\nTIE = ", sprintf("%.6f", TIE), ": inflated ",
                "(\u03B1-adjustment necessary).")
  x         <- uniroot(opt, interval = c(0, alpha), tol = 1e-8)
  alpha.adj <- x$root
  TIE.adj   <- alpha + x$f.root
  # power of the GCC’s approach with the adjusted alpha
  if (!sdsims) {
    power.adj <- power.scABEL(alpha = alpha.adj, CV = CV, n = n,
                              theta0 = theta0, design = design,
                              regulator = reg, nsims = 1e6)
  } else {
    power.adj <- power.scABEL.sds(alpha = alpha.adj, CV = CV, n = n,
                                  theta0 = theta0, design = design,
                                  regulator = reg, nsims = 1e6,
                                  progress = FALSE)
  }
  txt <- paste0(txt, "\n", sprintf("%2i", x$iter), " iterations: ",
                "adjusted \u03B1 = ", sprintf("%.6f", alpha.adj), " (",
                sprintf("%.3f%%", 100*(1-2*alpha.adj)), " CI), TIE = ",
                TIE.adj, " (power = ", sprintf("%.4f)", power.adj), ".\n")
}
cat(txt)

* The estimated $\small{\theta_0}$ follows a lognormal distribution and the estimated within-subject variance (and hence $\small{CV_\textrm{wR}}$) a scaled $\small{\chi^2}$ distribution. Both distributions are not symmetric but skewed to the right.

Hence, at a true $\small{\theta_0}$ of 1.25 and a true $\small{CV_\textrm{wR}}$ of 30%, the chance to falsely classify the drug as highly variable in a particular study (based on the observed $\small{CV_\textrm{wR}}$) and to apply the widened limits is not exactly 50%. Since the median of a right-skewed $\small{\chi^2}$ distribution lies below its mean, it is slightly lower than 50%.

Edit: We implemented regulator = "GCC" in version 1.5-3 (2021-01-18) of PowerTOST. Example:

library(PowerTOST)
design <- "2x2x4"
CV     <- 0.30
theta0 <- 0.90
target <- 0.80
n      <- sampleN.scABEL(CV = CV, theta0 = theta0, design = design,
                         targetpower = target, regulator = "GCC",
                         details = FALSE, print = FALSE)[["Sample size"]]
scABEL.ad(CV = CV, theta0 = theta0, design = design, n = n,
          regulator = "GCC", details = TRUE)

+++++++++++ scaled (widened) ABEL ++++++++++++
         iteratively adjusted alpha
   (simulations based on ANOVA evaluation)
----------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
1,000,000 studies in each iteration simulated.

CVwR 0.3, CVwT 0.3, n(i) 14|14 (N 28)
Nominal alpha                 : 0.05
True ratio                    : 0.9000
Regulatory settings           : GCC (ABE)
Switching CVwR                : 0.3
BE limits                     : 0.8000 ... 1.2500
PE constraints                : 0.8000 ... 1.2500
Empiric TIE for alpha 0.0500  : 0.16483 (rel. change of risk: +230%)
Power for theta0 0.9000       : 0.808
Iteratively adjusted alpha    : 0.00907
Empiric TIE for adjusted alpha: 0.05000
Power for theta0 0.9000       : 0.586 (rel. impact: -27.5%)

Runtime    : 6.6 seconds
Simulations: 8,100,000 (7 iterations)

The impact on power is massive. Which sample size would we need to maintain the target power?

sampleN.scABEL.ad(CV = CV, theta0 = theta0, design = design,
                  targetpower = target, regulator = "GCC", details = TRUE)

+++++++++++ scaled (widened) ABEL ++++++++++++
            Sample size estimation
        for iteratively adjusted alpha
   (simulations based on ANOVA evaluation)
----------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
1,000,000 studies in each iteration simulated.

Assumed CVwR 0.3, CVwT 0.3
Nominal alpha      : 0.05
True ratio         : 0.9000
Target power       : 0.8
Regulatory settings: GCC (ABE)
Switching CVwR     : 0.3
BE limits          : 0.8000 ... 1.2500
PE constraints     : 0.8000 ... 1.2500

n 28, nomin. alpha: 0.05000 (power 0.8079), TIE: 0.1648

Sample size search and iteratively adjusting alpha
n 28,   adj. alpha: 0.00907 (power 0.5859), rel. impact on power: -27.48%
n 48,   adj. alpha: 0.00343 (power 0.7237)
n 46,   adj. alpha: 0.00376 (power 0.7136)
n 48,   adj. alpha: 0.00343 (power 0.7237)
n 50,   adj. alpha: 0.00313 (power 0.7330)
n 52,   adj. alpha: 0.00283 (power 0.7402)
n 54,   adj. alpha: 0.00258 (power 0.7490)
n 56,   adj. alpha: 0.00233 (power 0.7554)
n 58,   adj. alpha: 0.00215 (power 0.7641)
n 60,   adj. alpha: 0.00198 (power 0.7703)
n 62,   adj. alpha: 0.00180 (power 0.7789)
n 64,   adj. alpha: 0.00164 (power 0.7851)
n 66,   adj. alpha: 0.00152 (power 0.7909)
n 68,   adj. alpha: 0.00138 (power 0.7958)
n 70,   adj. alpha: 0.00126 (power 0.8010), TIE: 0.05000
Compared to nominal alpha's sample size increase of 150.0% (~study costs).

Runtime    : 96.4 seconds
Simulations: 120,700,000

Oops! Since the TIE depends on the sample size itself (see the plots in this post), we have to adjust more.
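
That the TIE grows with the sample size can be illustrated with a small base-R sketch. It is again a key-statistics simulation under my own simplifying assumptions (CV_wT = CV_wR = 30%, df = 3n − 4 and n − 2, independent variance draws), and `tie_gcc()` is a hypothetical helper, not a PowerTOST function.

```r
# Sketch: empiric TIE of the GCC rule at theta0 = 1.25 for several sample sizes.
tie_gcc <- function(n, CVw = 0.30, theta0 = 1.25, alpha = 0.05, nsims = 1e5) {
  s2w  <- log(CVw^2 + 1)                 # within-subject variance (log scale)
  df_e <- 3 * n - 4                      # residual df, 2x2x4 design
  pe   <- rnorm(nsims, log(theta0), sqrt(s2w / n))
  hw   <- qt(1 - alpha, df_e) * sqrt(s2w * rchisq(nsims, df_e) / df_e / n)
  CVwR <- sqrt(exp(s2w * rchisq(nsims, n - 2) / (n - 2)) - 1)
  U    <- ifelse(CVwR > 0.30, log(1 / 0.75), log(1.25))  # data-driven limit
  # CI within the chosen limits and point estimate within 80.00-125.00%
  mean(pe - hw >= -U & pe + hw <= U & abs(pe) <= log(1.25))
}

set.seed(123456)
n   <- c(28, 40, 70)
TIE <- sapply(n, tie_gcc)
print(data.frame(n = n, TIE = TIE), row.names = FALSE)
```

With larger studies the widened limits are passed more easily at θ₀ = 1.25, whereas the chance of misclassification stays close to one half. Hence, the inflation gets worse and the adjusted α has to shrink further.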

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


wienui
★

Germany/Oman,
2020-11-30 04:30
(1675 d 20:46 ago)

@ Helmut
Posting: # 22098
Views: 11,471

Inflated type I error with fixed widened limits

Dear Detlew & Helmut,

Thank you for the brilliant explanation.

❝ ❝ Maybe Helmut has a better (pedagogical) explanation because he is an exceptionally gifted preacher.

❝ Only you say so. I’ll try.

No wonder that both of you deserve the title of a great preacher.

—
Cheers,
Osama

Helmut
★★★

Vienna, Austria,
2020-11-30 15:42
(1675 d 09:34 ago)

@ wienui
Posting: # 22099
Views: 11,411

Houston, we have a problem!

Dear Osama,

❝ Thank you for the brilliant explanation.

You are welcome! In the meantime I added more stuff to my post.
If my interpretation of the GL is correct (is it?) and applied as such by members of the GCC, we have a problem if the CV_wR observed in the study is ≤30%.

What could be done?

Regulatory side
- If high variability is suspected by the applicant, allow pre-specified wider limits for C_max irrespective of the observed CV – like currently in South Africa and acceptable by the EMEA prior to 2006.¹ Does not even require a replicate design.
- Implement ABEL instead. In line not only with the EMA but many other jurisdictions (the WHO, ASEAN States, Australia, Brazil, Canada, Chile, the East African Community, Egypt, the Eurasian Economic Union, New Zealand, the Russian Federation). A step towards global harmonization. Lower inflation of the type I error if CV_wR ≤30% than with the current approach. However, inflation of the TIE also if CV_wR >30% (up to ~45%), whereas there is none in the current approach. Many publications dealing with the issue; iteratively adjusting α is provided by PowerTOST’s function scABEL.ad(). Sample size estimation to compensate for the potential loss of power is provided by the function sampleN.scABEL.ad().
- Implementing RSABE (USA, China) would not be a good idea. Nasty inflation of the TIE if CV_wR <30% as well…
Applicant’s side
- Ask the authority whether ABEL is an acceptable alternative to the current approach. Is it already?
- If not, adjust α with my R script. Be aware of the potential loss in power!
  ~~Maybe (‼) I will implement it in scABEL.ad() and sampleN.scABEL.ad(). No promises.~~
  Edit: See this post.
Utopia
- Within the last ten years many replicate studies were performed. Hence, we simply know a good number of drugs / drug products which are highly variable and pose no safety concerns. Sometimes entire classes of drugs are highly variable (e.g., proton-pump inhibitors). Agencies could simply recommend widened limits in product-specific guidelines. No clinical justification² needed by applicants, no replicate design needed, no issues with inflation of the TIE. Sigh.

¹ In the EU lots of accepted studies with 75.00–133.33%. Prior to 2001 limits of 70–143% were not uncommon for C_max. Sometimes even for AUC…
² An often overlooked detail. Regularly difficult to provide for generic companies with no access to the originator’s data. Generally just a lot of in the protocol.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Astea
★★

Russia,
2020-12-02 00:34
(1674 d 00:43 ago)

@ Helmut
Posting: # 22100
Views: 11,447

Paradox of tolerance

Dear Preachers!

You've discovered truly a very interesting feature!
But I have some doubts in logical equality of the inflation of TIE and consumer's risk. Can you please explain my faults in the following reasoning?

Suppose we expect drug A to be highly variable (in the previous study somewhere in Antarctica W. Oodendijk et al. have got CV>30% for the reference drug). Which of the following options should we prefer to write in the protocol in order to care of the customer:

a). Use pre-specified wider limits 75-133 for C_max (no inflation?)
b). Use the GCC-GL approach (inflation up to 21%?)

Suppose that at the end of the trial we get a CV ≤30% and a CI within 75–133% but outside 80–125%.
Then with approach a) we would conclude BE, whereas with approach b) we would fail to conclude BE.
That is, the consumer's risk of getting a bad product is higher with the first approach if we define "a bad product" as a non-HVD with a CI outside 80–125%.
The difference is that in the first approach we declare the drug to be good if it is within the limits of 75–133%.

Until about 2013 there were a lot of studies in Russia with 75-133 limits for C_max even for non-HVD drugs.

—
"Being in minority, even a minority of one, did not make you mad"

Helmut
★★★

Vienna, Austria,
2020-12-02 02:37
(1673 d 22:40 ago)

@ Astea
Posting: # 22101
Views: 11,176

Οὐτοπεία ∨ Εὐτοπεία

Hi Nastia

❝ […] I have some doubts about the logical equivalence of the inflation of the TIE and the consumer's risk. Can you please point out the faults in the following reasoning?

❝

❝ Suppose we expect drug A to be highly variable (in a previous study somewhere in Antarctica W. Oodendijk et al. obtained a CV >30% for the reference drug).

The problem starts already here. How reliable is Oodendijk’s result? Is it the only one? Did the agency agree that the drug is HV and wider limits can be used?

❝ Which of the following options should we prefer to write in the protocol in order to take care of the customer:

❝

❝ a). Use pre-specified wider limits 75-133 for C_max (no inflation?)

❝ b). Use the GCC-GL approach (inflation up to 21%?)

The crucial point is what we consider a “clinically not relevant $\small{\Delta}$”. Muse a bit on these goodies:
$$\small{\Delta=20\%\implies\left\{\theta_1=80.00\%,\,\theta_2=125.00\%\right\}}\tag{1}$$
$$\small{\Delta=25\%\implies\left\{\theta_1=75.00\%,\,\theta_2=133.3\dot{3}\%\right\}}\tag{a}$$
$$\small{\Delta\: \overset{{\color{Red} ?}}{\rightarrow}\,\begin{vmatrix}\widehat{CV_\textrm{wR}}\leq30\%\rightarrow \widehat{\Delta}=20\%\\ \widehat{CV_\textrm{wR}}>30\%\rightarrow \widehat{\Delta}=25\%\end{vmatrix}\implies\begin{Bmatrix}\theta_1=80.00\%,\,\theta_2=125.00\%\\ \theta_1=75.00\%,\,\theta_2=133.3\dot{3}\%\end{Bmatrix}}\tag{b}$$
$\small{(1)}$ and $\small{(\textrm{a})}$ are straightforward. Fixed limits, type I error always ≤ the nominal $\small{\alpha}$.
$\small{(\textrm{b})}$ is data-driven (like ABEL and RSABE), since it depends on the estimated $\small{CV_\textrm{wR}}$. The Null-hypothesis is like Schrödinger’s cat – or Wigner’s friend, if you prefer. The study (not based on clinical grounds by the applicant and regulator like in $\small{(1)}$ and $\small{(\textrm{a})}$) “decides” which $\small{\widehat{\Delta}}$ is acceptable for the patient. That’s not a particularly good idea. By definition (‼) any framework (or a pre-test) might lead to a false decision and hence, inflates the TIE. That’s a multiplicity problem, which – if not adjusted – will increase the familywise error rate.
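A minimal base R sketch of how $\small{\Delta}$ maps to the limits in (1), (a), and (b). The helper names `be_limits()` and `limits_rule_b()` are made up for illustration and are not part of PowerTOST. The only point is that in (b) the limits become a function of the estimate, whereas in (1) and (a) they are constants fixed before the study.

```r
# Acceptance limits implied by a clinically not relevant difference Delta
# (multiplicative model): theta1 = 1 - Delta, theta2 = 1 / theta1.
be_limits <- function(Delta) {
  theta1 <- 1 - Delta
  c(lower = theta1, upper = 1 / theta1)
}

# Hypothetical helper for rule (b): the limits depend on the *estimated* CVwR.
limits_rule_b <- function(CVwR_hat) {
  if (CVwR_hat > 0.30) be_limits(0.25) else be_limits(0.20)
}

round(100 * be_limits(0.20), 2)      # (1): fixed conventional limits
round(100 * be_limits(0.25), 2)      # (a): fixed widened limits
round(100 * limits_rule_b(0.29), 2)  # (b): estimate just below the switch
round(100 * limits_rule_b(0.31), 2)  # (b): estimate just above the switch
```

Two studies of the same product with estimates of 29% and 31% would be judged against different limits under (b), which is exactly the problem.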

❝ Suppose that at the end of the trial we get CV≤30% and CI within 75-133, but out of 80-125.

❝ Then for the a-approach we should conclude the drug BE, …

If (a) was stated in the protocol and accepted by the agency, fine. The CV is interesting though not relevant. Try the function CVCL() in PowerTOST: an observed CV ≤30% might be pure chance, since its confidence interval may well include values >30%.

❝ … for the b-approach - fail to conclude BE.

No risk, no fun.

❝ That is, the customer's risk of getting a bad product is higher in the first approach if we define "a bad product" as a non-HVD with the limits out of 80-125.

Nope. In (a) you accept beforehand that $\small{\Delta=25\%}$ is not relevant for the patient. But again: You don’t assess the CV at all. Maybe it is HV indeed (like in Antarctica).

❝ The difference is in the fact that in the first approach we proclaim the drug to be good if it is within the limits 75-133.

Correct.

❝ Until about 2013 there were a lot of studies in Russia with 75-133 limits for C_max even for non-HVD drugs.

Interesting. However:

    library(PowerTOST)
    CVCL(CV = 0.27, df = 3*40-4, side = "2-sided") # 4-period full replicate, n = 40
     lower CL  upper CL
    0.2383666 0.3116013

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Astea
★★

Russia,
2020-12-02 10:46
(1673 d 14:31 ago)

@ Helmut
Posting: # 22102
Views: 11,106

Uchronia

Dear Helmut!

Thank you for the prompt answer!

❝ The problem starts already here. How reliable is Oodendijk’s result? Is it the only one?

The reliability of someone else's data - that is the question (especially when one of the authors says "waouf" :-)

❝ The crucial point is what we consider a “clinically not relevant $\small{\Delta}$”

As long as we (and the Agency) proclaim 25% to be clinically not relevant, there is no difference in the harm to the customer's health, whether we use the a- or the b-approach. With the b-approach he'll either receive a drug that is no worse or not receive it at all.

❝ Try the function CVCL() in PowerTOST.

I try:

    library(PowerTOST)
    CVCL(CV = 0.3, df = 3*40-4, side = "2-sided")
     lower CL  upper CL
    0.2646219 0.3466708

As CI is shifted to the right - does it mean that for these initial conditions the probability of the conclusion of HV is higher?
(By the way shouldn't we lower the degrees of freedom for the CV of the reference drug? 3*40-3 should correspond to the common CV of the Test and Reference, shouldn't it?)

—
"Being in minority, even a minority of one, did not make you mad"

Helmut
★★★

Vienna, Austria,
2020-12-02 12:34
(1673 d 12:42 ago)

@ Astea
Posting: # 22103
Views: 11,097

Steampunk

Hi Nastia,

❝ ❝ The problem starts already here. How reliable is Oodendijk’s result? Is it the only one?

❝

❝ The reliability of someone else's data - that is the question (especially when one of the authors says "waouf" :-)

Willard Oodendijk twittered and Nemo Macron said “waouf”.

❝ ❝ The crucial point is what we consider a “clinically not relevant $\small{\Delta}$”

❝

❝ As long as we (and the Agency) proclaim 25% to be clinically not relevant, there is no difference in the harm to the customer's health, whether we use the a- or the b-approach. With the b-approach he'll either receive a drug that is no worse or not receive it at all.

Here you err. In (a) all is good. In (b) everything is in a flux; the applicant and agency agree only that the acceptable risk may be either 20% or 25%.
We are dealing with average BE. Classifying HVD(P)s based on CV_wR is fine in principle. However, once we make this classification post hoc (based on $\small{\widehat{CV_\textrm{wR}}}$), troubles start. Hence, I don’t like* the reference-scaling methods and (b) as well.

❝ ❝ Try the function CVCL() in PowerTOST.

❝ I try:

❝

library(PowerTOST)

❝ CVCL(CV = 0.3, df = 3*40-4, side ="2-sided")

❝ lower CL upper CL

❝ 0.2646219 0.3466708

❝ As CI is shifted to the right …

Skewed to the right because the variance follows a $\small{\chi^2}$-distribution.
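To see where the asymmetry comes from, here is a sketch in base R of the usual $\small{\chi^2}$-based interval for a CV of log-normal data. The function name `cv_ci()` is hypothetical, and I am assuming the standard textbook calculation; with the same inputs it should give the limits quoted above.

```r
# Confidence limits of a CV (log-normal data) from the chi-squared
# distribution of the variance estimate; base R only.
cv_ci <- function(CV, df, alpha = 0.05) {
  s2    <- log(CV^2 + 1)                        # variance on the log scale
  lower <- s2 * df / qchisq(1 - alpha / 2, df)  # divide by the larger quantile
  upper <- s2 * df / qchisq(alpha / 2, df)      # divide by the smaller quantile
  sqrt(exp(c(lower = lower, upper = upper)) - 1)  # back-transform to CVs
}

ci <- cv_ci(CV = 0.30, df = 3 * 40 - 4)
ci
# Distances of the limits from the point estimate; the upper one is larger.
c(below = 0.30 - ci[["lower"]], above = ci[["upper"]] - 0.30)
```

The quantiles of the $\small{\chi^2}$-distribution are not symmetric around its mean, so the interval of the variance (and therefore of the CV) reaches further above the estimate than below it.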

❝ … does it mean that for these initial conditions the probability of the conclusion of HV is higher?

Yes (for any condition).

❝ (By the way shouldn't we lower the degrees of freedom for the CV of the reference drug? 3*40-3 should correspond to the common CV of the Test and Reference, shouldn't it?)

Oops, one more degree of freedom! In the 2-sequence 4-period replicate design we have df = 3n – 4 for the pooled CV_w. Following the EMA’s model for the estimation of CV_wR we have one factor (the treatment) less in the model and therefore, df = 3n – 3:

    library(PowerTOST)
    CVCL(CV = 0.3, df = 3*40-3, side = "2-sided")
     lower CL  upper CL
    0.2647549 0.3464397

Not for an initiate like you but others:
- Such a study is not bijective like when assessed for ABE. Whereas in ABE we could reverse the procedure (if T ≈ R also R ≈ T), this is highly unlikely here (only if CV_wR ≡ CV_wT).
- In ABE every application has to follow the same rules and $\small{\Delta}$ is known. Here every study sets its own rule. The BE-limits and hence, $\small{\widehat{\Delta}}$ are random variables. Without access to the study report patients and physicians don’t know the risk.
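To put a rough number on "every study sets its own rule", the following base R sketch gives the probability that a study ends up with the widened limits for a given true CV_wR. It rests on assumptions that are mine: the estimated variance times df divided by the true variance follows a $\small{\chi^2}$-distribution, and the df of the CV_wR estimate are taken as n – 2. The function name `p_widened()` is hypothetical.

```r
# Probability that a study's estimated CVwR exceeds the 30% switch, given the
# true CVwR. Assumption: s2wR * df / sigma2wR follows chi-squared with df.
p_widened <- function(CVwR, df, CVswitch = 0.30) {
  sigma2 <- log(CVwR^2 + 1)      # true within-subject variance (log scale)
  s2_sw  <- log(CVswitch^2 + 1)  # variance at the switching CV
  pchisq(df * s2_sw / sigma2, df, lower.tail = FALSE)
}

n   <- 40      # subjects, as in the example above
df  <- n - 2   # assumed df of the CVwR estimate
CVs <- c(0.25, 0.27, 0.29, 0.30, 0.31, 0.35)
data.frame(CVwR = CVs, p.widened = round(p_widened(CVs, df), 4))
```

Neither column of the result is ever exactly 0 or 1 in the neighbourhood of 30%, so the limits applied to a given product are a matter of chance.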

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Astea
★★

Russia,
2020-12-02 16:28
(1673 d 08:48 ago)

@ Helmut
Posting: # 22104
Views: 10,996

Dieselpunk

Dear Helmut!

Thank you for the explanation!

❝ Here you err. In (a) all is good. In (b) everything is in a flux; the applicant and agency agree only that the acceptable risk may be either 20% or 25%.

I just meant to say that the acceptable risk of either 20% or 25% is in any case less than or equal to 25%.

❝ Oops, one more degree of freedom! In the 2-sequence 4-period replicate design we have df = 3n – 4 for the pooled CV_w. Following the EMA’s model for the estimation of CV_wR we have one factor (the treatment) less in the model and therefore, df = 3n – 3:

How does this df correspond to the residual df of ANOVA for getting CV_wR? I thought that there should be only 40-2=38 degrees of freedom, because from the point of view of the reference drug the full replicate turns into a standard 2-way crossover. Is that right?

—
"Being in minority, even a minority of one, did not make you mad"

Helmut
★★★

Vienna, Austria,
2020-12-02 17:18
(1673 d 07:58 ago)

@ Astea
Posting: # 22105
Views: 10,958

Dieselpunk

Hi Nastia,

❝ How does this df correspond to the residual df of ANOVA for getting CV_wR? I thought that there should be only 40-2=38 degrees of freedom, because from the point of view of the reference drug the full replicate turns into a standard 2-way crossover. Is that right?

I stand corrected!

    library(PowerTOST)
    CVCL(CV = 0.3, df = 40-2, side = "2-sided")
     lower CL  upper CL
    0.2434049 0.3922851

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


d_labes
★★★

Berlin, Germany,
2020-12-02 20:30
(1673 d 04:46 ago)

@ Helmut
Posting: # 22106
Views: 10,950

Steampunk? OT

Dear Both,

❝ Willard Oodendijk twittered and Nemo Macron said “waouf”.

forgive me old fart, but who is Mr. Oodendijk? Should I know him? If yes, why? Enlighten me, please.
Maybe I could learn something new in my old age.

—
Regards,
Detlew

Astea
★★

Russia,
2020-12-02 20:58
(1673 d 04:18 ago)

@ d_labes
Posting: # 22107
Views: 10,971

Steampunk? OT

Dear Detlew!

❝ forgive me old fart, but who is Mr. Oodendijk? Should I know him? If yes, why? Enlighten me, please.

That guy is from the neighbouring thread

—
"Being in minority, even a minority of one, did not make you mad"

Helmut
★★★

Vienna, Austria,
2020-12-23 13:18
(1652 d 11:59 ago)

@ wienui
Posting: # 22157
Views: 10,931

PowerTOST 1.5-2.9000 on GitHub

Dear Osama and neds,

I updated the development version 1.5.2.9000 of PowerTOST on GitHub (it’s not on CRAN yet). If you want to give it a try:

    install.packages("remotes")
    remotes::install_github("Detlew/PowerTOST")

Some examples in the following. Business as usual (ABE):

    sampleN.TOST(CV = 0.29, theta0 = 0.9, design = "2x2x4")

    +++++++++++ Equivalence test - TOST +++++++++++
                Sample size estimation
    -----------------------------------------------
    Study design: 2x2x4 (4 period full replicate)
    log-transformed data (multiplicative model)

    alpha = 0.05, target power = 0.8
    BE margins = 0.8 ... 1.25
    True ratio = 0.9, CV = 0.29

    Sample size (total)
     n     power
    38   0.814460

Note that we are close to the switching CV_wR 30%. What about ABEL?

    sampleN.scABEL(CV = 0.29, theta0 = 0.9, design = "2x2x4",
                   regulator = "EMA", details = FALSE)

    +++++++++++ scaled (widened) ABEL +++++++++++
                Sample size estimation
       (simulation based on ANOVA evaluation)
    ---------------------------------------------
    Study design: 2x2x4 (4 period full replicate)
    log-transformed data (multiplicative model)
    1e+05 studies for each step simulated.

    alpha = 0.05, target power = 0.8
    CVw(T) = 0.29; CVw(R) = 0.29
    True ratio = 0.9
    ABE limits / PE constraint = 0.8 ... 1.25
    Regulatory settings: EMA

    Sample size
     n     power
    34   0.8095

Lower sample size than for ABE but – as usual – we would face an inflated Type I Error of 0.07095.

Using the new argument regulator = "GCC".

    sampleN.scABEL(CV = 0.29, theta0 = 0.9, design = "2x2x4",
                   regulator = "GCC", details = FALSE)

    +++++++++++ scaled (widened) ABEL +++++++++++
                Sample size estimation
       (simulation based on ANOVA evaluation)
    ---------------------------------------------
    Study design: 2x2x4 (4 period full replicate)
    log-transformed data (multiplicative model)
    1e+05 studies for each step simulated.

    alpha = 0.05, target power = 0.8
    CVw(T) = 0.29; CVw(R) = 0.29
    True ratio = 0.9
    ABE limits / PE constraint = 0.8 ... 1.25
    Regulatory settings: GCC

    Sample size
     n     power
    28   0.8014

Even lower sample size than for ABEL because sometimes the CV is misclassified and widened limits of 75.00–133.33% are applied. What about the Type I Error for this approach?

    power.scABEL(CV = 0.29, theta0 = 1.25, design = "2x2x4",
                 n = 28, regulator = "GCC")
    [1] 0.14573

Nasty. Due to the misclassification a huge inflation of the TIE (patient’s risk more than twice that of ABEL).
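For those without the development version at hand, a deliberately crude Monte Carlo sketch in base R shows the mechanism. It is not what power.scABEL() does internally. My simplifying assumptions: balanced 2×2×4 design with CV_wT = CV_wR, the point estimate is normal with variance σ²/n, the pooled variance has 3n – 4 df, the CV_wR estimate has n – 2 df, and the two variance estimates are treated as independent (they are not in a real study). The function name `sim_tie_gcc()` is hypothetical.

```r
set.seed(123456)
sim_tie_gcc <- function(CV, n, theta0 = 1.25, nsims = 1e5, alpha = 0.05) {
  sigma2 <- log(CV^2 + 1)
  df     <- 3 * n - 4   # df of the pooled within-subject variance
  dfRR   <- n - 2       # assumed df of the CVwR estimate
  # point estimates on the log scale (variance sigma2 / n, see assumptions)
  pe  <- rnorm(nsims, mean = log(theta0), sd = sqrt(sigma2 / n))
  s2  <- sigma2 * rchisq(nsims, df) / df       # pooled variance estimates
  s2R <- sigma2 * rchisq(nsims, dfRR) / dfRR   # estimates of the variance of R
  hw  <- qt(1 - alpha, df) * sqrt(s2 / n)      # half-width of the 90% CI
  widened <- s2R > log(0.30^2 + 1)             # estimated CVwR > 30%?
  upper <- ifelse(widened, log(1 / 0.75), log(1.25))
  lower <- -upper
  pass  <- (pe - hw >= lower) & (pe + hw <= upper) &
           (pe >= log(0.80)) & (pe <= log(1.25))  # GMR constraint
  mean(pass)  # fraction of studies passing although theta0 sits at the limit
}

tie <- sim_tie_gcc(CV = 0.29, n = 28)
tie
```

Under these assumptions the result should be of the same order as the simulated value above, not identical to it. The inflation comes entirely from the studies that are misclassified as highly variable and are then judged against 75.00–133.33% although the true ratio is 125%.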
Iteratively adjust α:

    scABEL.ad(CV = 0.29, theta0 = 0.9, design = "2x2x4",
              n = 28, regulator = "GCC")

    +++++++++++ scaled (widened) ABEL ++++++++++++
             iteratively adjusted alpha
       (simulations based on ANOVA evaluation)
    ----------------------------------------------
    Study design: 2x2x4 (4 period full replicate)
    log-transformed data (multiplicative model)
    1,000,000 studies in each iteration simulated.

    CVwR 0.29, CVwT 0.29, n(i) 14|14 (N 28)
    Nominal alpha                 : 0.05
    True ratio                    : 0.9000
    Regulatory settings           : GCC (ABE)
    Switching CVwR                : 0.3
    BE limits                     : 0.8000 ... 1.2500
    PE constraints                : 0.8000 ... 1.2500
    Empiric TIE for alpha 0.0500  : 0.14573
    Power for theta0 0.9000       : 0.801
    Iteratively adjusted alpha    : 0.01102
    Empiric TIE for adjusted alpha: 0.05000
    Power for theta0 0.9000       : 0.602

Substantial loss in power due to evaluation by the 100(1–2×0.01102)=97.796% CI.
Increase the sample size to maintain power (show progress of iterations):

    sampleN.scABEL.ad(CV = 0.29, theta0 = 0.9, design = "2x2x4",
                      regulator = "GCC", progress = TRUE, details = TRUE)

    +++++++++++ scaled (widened) ABEL ++++++++++++
                Sample size estimation
            for iteratively adjusted alpha
       (simulations based on ANOVA evaluation)
    ----------------------------------------------
    Study design: 2x2x4 (4 period full replicate)
    log-transformed data (multiplicative model)
    1,000,000 studies in each iteration simulated.

    Assumed CVwR 0.29, CVwT 0.29
    Nominal alpha      : 0.05
    True ratio         : 0.9000
    Target power       : 0.8
    Regulatory settings: GCC (ABE)
    Switching CVwR     : 0.3
    BE limits          : 0.8000 ... 1.2500
    PE constraints     : 0.8000 ... 1.2500
    Progress of each iteration:
    n 28, nomin. alpha: 0.05000 (power 0.8014), TIE: 0.1457

    Sample size search and iteratively adjusting alpha
    n 28, adj. alpha: 0.01102 (power 0.6016), rel. impact on power: -24.94%
    n 48, adj. alpha: 0.00488 (power 0.7372)
    n 46, adj. alpha: 0.00529 (power 0.7274)
    n 48, adj. alpha: 0.00488 (power 0.7372)
    n 50, adj. alpha: 0.00452 (power 0.7461)
    n 52, adj. alpha: 0.00419 (power 0.7550)
    n 54, adj. alpha: 0.00389 (power 0.7629)
    n 56, adj. alpha: 0.00359 (power 0.7703)
    n 58, adj. alpha: 0.00333 (power 0.7787)
    n 60, adj. alpha: 0.00312 (power 0.7850)
    n 62, adj. alpha: 0.00289 (power 0.7924)
    n 64, adj. alpha: 0.00268 (power 0.7989)
    n 66, adj. alpha: 0.00251 (power 0.8058), TIE: 0.05000
    Compared to nominal alpha's sample size increase of 135.7% (~study costs).

    Runtime    : 79 seconds
    Simulations: 97,500,000

Note that the TIE depends strongly on the sample size. Hence, in every step we have to adjust α as well. OK, with 66 subjects we achieve the target power but it comes with a price, namely evaluation by a 99.498% CI…
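The confidence levels and the cost figure quoted above follow from simple arithmetic; a few lines of base R to check them (the function name `ci_level()` is only for illustration):

```r
# Two one-sided tests at level alpha correspond to a 100(1 - 2 alpha)% CI.
ci_level <- function(alpha) 100 * (1 - 2 * alpha)

ci_level(0.05)     # nominal alpha: the conventional 90% CI
ci_level(0.01102)  # adjusted alpha for n = 28
ci_level(0.00251)  # adjusted alpha for n = 66

# Relative increase in sample size when going from 28 to 66 subjects (in %).
100 * (66 - 28) / 28
```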

Inspect the plots of this post again. If the true CV_wR > 30%, it might be misclassified as well, but this time towards the conventional limits, and the TIE is not inflated. Hence, we get the same sample size by

    sampleN.scABEL(CV = 0.31, theta0 = 0.9, design = "2x2x4",
                   regulator = "GCC", details = FALSE)

    +++++++++++ scaled (widened) ABEL +++++++++++
                Sample size estimation
       (simulation based on ANOVA evaluation)
    ---------------------------------------------
    Study design: 2x2x4 (4 period full replicate)
    log-transformed data (multiplicative model)
    1e+05 studies for each step simulated.

    alpha = 0.05, target power = 0.8
    CVw(T) = 0.31; CVw(R) = 0.31
    True ratio = 0.9
    ABE limits / PE constraint = 0.8 ... 1.25
    Widened limits = 0.75 ... 1.333333
    Regulatory settings: GCC

    Sample size
     n     power
    28   0.8135

and

    sampleN.scABEL.ad(CV = 0.31, theta0 = 0.9, design = "2x2x4",
                      regulator = "GCC", details = FALSE)

    +++++++++++ scaled (widened) ABEL ++++++++++++
                Sample size estimation
            for iteratively adjusted alpha
       (simulations based on ANOVA evaluation)
    ----------------------------------------------
    Study design: 2x2x4 (4 period full replicate)
    log-transformed data (multiplicative model)
    1,000,000 studies in each iteration simulated.

    Assumed CVwR 0.31, CVwT 0.31
    Nominal alpha      : 0.05
    True ratio         : 0.9000
    Target power       : 0.8
    Regulatory settings: GCC (ABEL)
    Switching CVwR     : 0.3
    Regulatory constant: 0.97998
    Widened limits     : 0.7500 ... 1.3333
    PE constraints     : 0.8000 ... 1.2500
    n 28, nomin. alpha: 0.05000 (power 0.8135), TIE: 0.0263
    No inflation of the TIE expected; hence, no adjustment of alpha required.

Inflation of the Type I error in different approaches (2-sequence, 4-period full replicate designs):

[image]

a. Conventional ABE with fixed limits never exceeds nominal α. Since TOST is not a most powerful test, for high CVs combined with small sample sizes, the TIE will be below nominal α. All is good.
b. For the EMA’s ABEL the maximum inflation occurs at CV_wR 30%. If CV_wR increases, the TIE decreases since the probability of a misclassification decreases as well. Starting with the upper scaling cap at CV_wR 50%, limits are fixed and the TIE is driven by the conservatism of TOST – together with the PE-constraint. However, even for very high CVs (not shown) the TIE doesn’t exceed nominal α.
c. Similar for Health Canada’s ABEL though the minimum TIE is observed at its upper cap of ~57.38%.
d. For the GCC’s widened limits huge inflation of the TIE if CV_wR ≤30% (the highest of all approaches). Strong dependency on the sample size. Behaves with increasing CVs like TOST.
e. Huge inflation of the TIE for the FDA’s RSABE with implied limits if CV_wR <30%. Moderate to extremely conservative otherwise. That’s the model all (‼) authors (except ones of the FDA¹) considered the applicable one since products are approved according to this model and not according to f.
f. Lower inflation of the TIE by the FDA’s RSABE “desired consumer risk model”.¹ No more than a mathematical prestidigitation and called even “hocus pocus” by some.² Here the maximum inflation of the TIE occurs at CV_wR ~25.4%.
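For readers who want to see where the caps mentioned for the two ABEL variants enter, here is a base R sketch of the expanded limits. The function name `abel_limits()` is hypothetical. I am assuming the EMA's regulatory constant k = 0.760, no scaling up to CV_wR 30%, and limits frozen above the cap; the PE-constraint is not part of this sketch.

```r
# Expanded limits for ABEL: exp(-/+ k * swR) above the switching CV,
# conventional limits otherwise, frozen once the cap is reached.
abel_limits <- function(CVwR, k = 0.760, CVswitch = 0.30, CVcap = 0.50) {
  if (CVwR <= CVswitch) return(c(lower = 0.80, upper = 1.25))
  swR <- sqrt(log(min(CVwR, CVcap)^2 + 1))  # within-subject SD of R (log scale)
  c(lower = exp(-k * swR), upper = exp(k * swR))
}

round(100 * abel_limits(0.29), 2)                   # below the switch
round(100 * abel_limits(0.40), 2)                   # scaled
round(100 * abel_limits(0.50), 2)                   # at the EMA's cap
round(100 * abel_limits(0.60), 2)                   # above the cap: unchanged
round(100 * abel_limits(0.60, CVcap = 0.57382), 2)  # Health Canada's cap
```

Beyond the cap the limits no longer move with the estimate, which is why the TIE then behaves like that of TOST with fixed limits.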

1. Davit BM, Chen ML, Conner DP, Haidar SH, Kim S, Lee CH, Lionberger RA, Makhlouf FT, Nwakama PE, Patel DT, Schuirmann DJ, Yu LX. Implementation of a Reference-Scaled Average Bioequivalence Approach for Highly Variable Generic Drug Products by the US Food and Drug Administration. AAPS J. 2012; 14(4): 915–24. doi:10.1208/s12248-012-9406-x.
2. Detlew Labes, László Endrényi, myself…

Edit 2021-01-18: PowerTOST 1.5-3 on CRAN.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

