Inflated type I error with fixed widened limits [RSABE / ABEL]

posted by Helmut Homepage – Vienna, Austria, 2020-11-30 00:14 (142 d 11:32 ago) – Posting: # 22096
Views: 1,815

Dear Detlew & Osama,

» » ... how could this type I error inflation happen?

» May be Helmut has a better (paedogogical) explanation because he is an exceptionally gifted preacher :cool:.

Only you say so. ;-) I’ll try.

Let’s start with two power curves (first [image]-script at the end). The 2-sequence 4-period replicate study is designed for ABE with conventional limits, 80% power. Since we are wary, we assume a \(\small{\theta_0}\) (T/R-ratio) of 0.90 and estimate a sample size of 40. One power curve for the conventional limits as planned (──) and one for the widened limits (──).

[image]

The probability of a type I error (i.e., the consumer risk) to be treated with a formulation which is not BE (true \(\small{\theta_0}\) outside the limits) is ≤5%. You see that the blue power curve intersects 0.05 at \(\small{\theta_0=0.80}\) and \(\small{\theta_0=1.25}\). That means also the chance of falsely passing BE is 5%. The same is applicable to South Africa’s approach where we could use pre-specified widened limits independent from the observed CV. Then we have a TIE of 0.05 as well (the red power curve intersects 0.05 at \(\small{\theta_0=0.75}\) and \(\small{\theta_0=1.\dot{3}}\)).

Imagine an even more extreme case than the one Detlew mentioned. We observe in the study a CVwR of 30.01% although the true one is 30%. That means we will use the widened limits of 75.00–133.33% instead of the correct 80.00–125.00%. With the same \(\small{\theta_0=0.80}\) and \(\small{\theta_0=1.25}\) we jump to the red power curve and therefore, have a much higher chance of passing (~40% instead of 5%).

Now for the tricky part (sorry): In the previous posts we estimated an inflation of the TIE of ~20%. Why not 40%? Say, the true CVwR is 30% and acc. to the convention the drug is considered to be not highly variable. Hence, we would have to apply the conventional limits. There is a ~50% chance that in the actual study we will observe a CVwR >30% and misclassify the drug/formulation as highly variable and apply the widened limits. But there is also a ~50% chance that we observe a CVwR ≤30% and use the conventional limits. Both together gives the ~20% inflation (actually it is more complicated*).

What if we estimate the sample size for the GCC’s approach? With 28 it will be lower (second [image]-script).

[image]

Since with a CV of 30% we have to use the conventional limits (power at \(\small{\theta_0=0.80}\) and \(\small{\theta_0=1.25}\) will be ~0.16) and we felt into the trap of an inflated type I error. Note that the TIE depends also on the sample size. Hence, it will be smaller than with 40 subjects.

If you think about iteratively adjusted α like for the reference-scaling methods (third [image]-script) – examples for the 2x2x4 design (sample sizes estimated for the GCC’s approach):

CV = 0.150, n = 12 (power = 0.8496)
TIE = 0.050148: inflated (α-adjustment necessary).
 2 iterations: adjusted α = 0.049854 (90.029% CI), TIE = 0.05 (power = 0.8493).
CV = 0.200, n = 18 (power = 0.8015)
TIE = 0.050910: inflated (α-adjustment necessary).
 7 iterations: adjusted α = 0.049082 (90.184% CI), TIE = 0.05 (power = 0.7988).
CV = 0.250, n = 28 (power = 0.8220)
TIE = 0.072558: inflated (α-adjustment necessary).
 5 iterations: adjusted α = 0.032075 (93.585% CI), TIE = 0.05 (power = 0.7644).
CV = 0.300, n = 28 (power = 0.8079)
TIE = 0.164829: inflated (α-adjustment necessary).
 8 iterations: adjusted α = 0.009071 (98.186% CI), TIE = 0.05 (power = 0.5859).
CV = 0.301, n = 28 (power = 0.8085)
TIE = 0.022390: not inflated (no α-adjustment necessary).


However, we could face a substantial loss in power (for CV 30% and the adjusted α of ~0.0091 it would drop from 81% to 59%).

[image]



1. Power-curves for fixed limits

library(PowerTOST)
design <- "2x2x4" # any replicate - see known.designs()
CV     <- 0.30
target <- 0.80
theta0 <- 0.90
n      <- sampleN.TOST(CV = CV, theta0 = theta0, targetpower = target,
                       design = design, details = FALSE,
                       print = FALSE)[["Sample size"]]
theta0 <- c(theta0, seq(0.75*0.96, 1, length.out = 60))
theta0 <- sort(unique(c(theta0, 0.8, 1, 1.25, 1/theta0)))
res    <- data.frame(theta0 = theta0)
for (j in seq_along(theta0)) {
  res$pwr1[j] <- power.TOST(CV = CV, n = n, theta0 = theta0[j],
                            theta1 = 0.80, design = design)
  res$pwr2[j] <- power.TOST(CV = CV, n = n, theta0 = theta0[j],
                            theta1 = 0.75, design = design)
}
if (is.null(attr(dev.list(), "names")))
  windows(width = 4.5, height = 3.3, record = TRUE)
op  <- par(no.readonly = TRUE)
par(mar = c(4, 4, 0, 0) + 0.1, cex.axis = 0.8)
plot(log(theta0), res$pwr1, type = "n", axes = FALSE, yaxs = "i",
     xlim = log(c(0.75, 1/0.75)), ylim = c(0, 1.04),
     xlab = expression(theta[0]), ylab = "power")
axis(2, las = 1)
axis(2, at = 0.05, las = 1)
axis(1, at = log(c(0.75, seq(0.8, 1.2, 0.1), 1.25, 1/0.75)),
     labels = sprintf("%.2f", c(0.75, seq(0.8, 1.2, 0.1), 1.25, 1/0.75)))
abline(h = pretty(c(0, 1)), lty = 3, col = "lightgrey")
abline(h = 0.05, lty = 2)
abline(v = log(c(0.9, 1, 1.1, 1.2)), lty = 3, col = "lightgrey")
abline(v = log(c(0.75, 0.8, 1.25, 1/0.75)), lty = 2)
box()
lines(log(theta0), res$pwr1, lwd = 3, col = "blue")
lines(log(theta0), res$pwr2, lwd = 3, col = "red")
par(op)


2. Power curves for sample sizes acc. to the GCC’s approach

library(PowerTOST)
design <- "2x2x4" # only "2x2x4", "2x2x3", "2x3x3" implemented!
CV     <- 0.30
target <- 0.80
theta0 <- 0.90
reg    <- reg_const("user", r_const = log(1/0.75)/CV2se(0.3),
                    CVswitch = 0.3, CVcap = 0.3, pe_const = TRUE)
n      <- sampleN.scABEL(CV = CV, theta0 = theta0, targetpower = target,
                         design = design, regulator = reg, details = FALSE,
                         print = FALSE)[["Sample size"]]
theta0 <- c(theta0, seq(0.75*0.96, 1, length.out = 60))
theta0 <- sort(unique(c(theta0, 0.8, 1, 1.25, 1/theta0)))
res    <- data.frame(theta0 = theta0)
for (j in seq_along(theta0)) {
  res$pwr[j] <- power.scABEL(CV = CV, n = n, theta0 = theta0[j],
                             theta1 = 0.80, design = design,
                             regulator = reg)
}
if (is.null(attr(dev.list(), "names")))
  windows(width = 4.5, height = 3.3, record = TRUE)
op  <- par(no.readonly = TRUE)
par(mar = c(4, 4, 0, 0) + 0.1, cex.axis = 0.8)
plot(log(theta0), res$pwr, type = "n", axes = FALSE, yaxs = "i",
     xlim = log(c(0.75, 1/0.75)), ylim = c(0, 1.04),
     xlab = expression(theta[0]), ylab = "power")
axis(2, las = 1)
axis(2, at = 0.05, las = 1)
axis(1, at = log(c(0.75, seq(0.8, 1.2, 0.1), 1.25, 1/0.75)),
     labels = sprintf("%.2f", c(0.75, seq(0.8, 1.2, 0.1), 1.25, 1/0.75)))
abline(h = pretty(c(0, 1)), lty = 3, col = "lightgrey")
abline(h = 0.05, lty = 2)
abline(v = log(c(0.9, 1, 1.1, 1.2)), lty = 3, col = "lightgrey")
abline(v = log(c(0.8, 1.25)), lty = 2)
if (CV > 0.3) abline(v = log(c(0.75, 1/0.75)), lty = 2)
box()
lines(log(theta0), res$pwr, lwd = 3, col = "blue")
par(op)


3. Iteratively adjusted α for the GCC’s approach

library(PowerTOST)
opt <- function(x) {
  if (sdsims) {
    power.scABEL.sds(alpha = x, CV = CV, n = n, theta0 = U, design = design,
                     regulator = reg, nsims = 1e6, setseed = setseed,
                     progress = FALSE) - alpha
  } else {
    power.scABEL(alpha = x, CV = CV, n = n, theta0 = U, design = design,
                 regulator = reg, nsims = 1e6, setseed = setseed) - alpha
  }
}
design  <- "2x2x4" # only "2x2x4", "2x2x3", "2x3x3" implemented!
alpha   <- 0.05    # nominal level of the test
CV      <- 0.30    # can be a 2-element vector: CV[1] for T, CV[2] for R
theta0  <- 0.90    # assumed T/R-ratio
target  <- 0.80    # target (desired) power
setseed <- TRUE    # for reproducibility
sdsims  <- FALSE   # set to TRUE for partial replicate (much slower)
reg     <- reg_const("user", r_const = log(1/0.75)/CV2se(0.3),
                     CVswitch = 0.3, CVcap = 0.3, pe_const = TRUE)
# power and sample size for the GCC’s approach
x       <- sampleN.scABEL(CV = CV, theta0 = theta0, design = design,
                          targetpower = target, regulator = reg,
                          details = FALSE, print = FALSE)
power   <- x[["Achieved power"]]
n       <- x[["Sample size"]]
U       <- scABEL(CV = CV, regulator = reg)[["upper"]]
power   <- power.TOST(CV = CV, n = n, theta0 = theta0, design = design)
if (!sdsims) {     # simulate underlying statistics
  TIE <- power.scABEL(CV = CV, n = n, theta0 = U, design = design,
                      regulator = reg, nsims = 1e6)
} else {           # simulate subject data
  TIE <- power.scABEL.sds(CV = CV, n = n, theta0 = U, design = design,
                          regulator = reg, nsims = 1e6, progress = FALSE)
}
txt <- paste0("CV = ", sprintf("%.3f", CV), ", n = ", n,
              " (power = ", sprintf("%.4f)", power))
if (TIE <= alpha) {
  txt <- paste0(txt, "\nTIE = ", sprintf("%.6f", TIE), ": not inflated ",
                "(no \u03B1-adjustment necessary).\n")
} else {
  txt <- paste0(txt, "\nTIE = ", sprintf("%.6f", TIE), ": inflated ",
                "(\u03B1-adjustment necessary).")
  x   <- uniroot(opt, interval = c(0, alpha), tol = 1e-8)
  alpha.adj <- x$root
  TIE.adj   <- alpha + x$f.root
  power.adj <- power.TOST(alpha = alpha.adj, CV = CV, n = n, theta0 = theta0,
                          design = design)
  txt <- paste0(txt, "\n", sprintf("%2i", x$iter), " iterations: ",
                "adjusted \u03B1 = ", sprintf("%.6f", alpha.adj), " (",
                sprintf("%.3f%%", 100*(1-2*alpha.adj)), " CI), TIE = ",
                TIE.adj, " (power = ", sprintf("%.4f)", power.adj), ".\n")
}
cat(txt)




Edit: We implemented regulator = "GCC" in version 1.5-3 (2021-01-18) of PowerTOST. Example:

library(PowerTOST)
design  <- "2x2x4
CV      <- 0.30
theta0  <- 0.90
target  <- 0.80
n       <- sampleN.scABEL(CV = CV, theta0 = theta0, design = design,
                          targetpower = target, regulator = "GCC",
                          details = FALSE, print = FALSE)[["Sample size"]]
scABEL.ad(CV = CV, theta0 = theta0, design = design, n = n,
          regulator = "GCC", details = TRUE)

+++++++++++ scaled (widened) ABEL ++++++++++++
         iteratively adjusted alpha
   (simulations based on ANOVA evaluation)
----------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
1,000,000 studies in each iteration simulated.

CVwR 0.3, CVwT 0.3, n(i) 14|14 (N 28)
Nominal alpha                 : 0.05
True ratio                    : 0.9000
Regulatory settings           : GCC (ABE)
Switching CVwR                : 0.3
BE limits                     : 0.8000 ... 1.2500
PE constraints                : 0.8000 ... 1.2500
Empiric TIE for alpha 0.0500  : 0.16483 (rel. change of risk: +230%)
Power for theta0 0.9000       : 0.808
Iteratively adjusted alpha    : 0.00907
Empiric TIE for adjusted alpha: 0.05000
Power for theta0 0.9000       : 0.586 (rel. impact: -27.5%)

Runtime    : 6.6 seconds
Simulations: 8,100,000 (7 iterations)

The impact on power is massive. Which sample size would we need to maintain the target power?

sampleN.scABEL.ad(CV = CV, theta0 = theta0, design = design,
                  targetpower = target, regulator = "GCC",
                  details = TRUE)

+++++++++++ scaled (widened) ABEL ++++++++++++
            Sample size estimation
        for iteratively adjusted alpha
   (simulations based on ANOVA evaluation)
----------------------------------------------
Study design: 2x2x4 (4 period full replicate)
log-transformed data (multiplicative model)
1,000,000 studies in each iteration simulated.

Assumed CVwR 0.3, CVwT 0.3
Nominal alpha      : 0.05
True ratio         : 0.9000
Target power       : 0.8
Regulatory settings: GCC (ABE)
Switching CVwR     : 0.3
BE limits          : 0.8000 ... 1.2500
PE constraints     : 0.8000 ... 1.2500

n  28, nomin. alpha: 0.05000 (power 0.8079), TIE: 0.1648

Sample size search and iteratively adjusting alpha
n  28,   adj. alpha: 0.00907 (power 0.5859), rel. impact on power: -27.48%
n  48,   adj. alpha: 0.00343 (power 0.7237)
n  46,   adj. alpha: 0.00376 (power 0.7136)
n  48,   adj. alpha: 0.00343 (power 0.7237)
n  50,   adj. alpha: 0.00313 (power 0.7330)
n  52,   adj. alpha: 0.00283 (power 0.7402)
n  54,   adj. alpha: 0.00258 (power 0.7490)
n  56,   adj. alpha: 0.00233 (power 0.7554)
n  58,   adj. alpha: 0.00215 (power 0.7641)
n  60,   adj. alpha: 0.00198 (power 0.7703)
n  62,   adj. alpha: 0.00180 (power 0.7789)
n  64,   adj. alpha: 0.00164 (power 0.7851)
n  66,   adj. alpha: 0.00152 (power 0.7909)
n  68,   adj. alpha: 0.00138 (power 0.7958)
n  70,   adj. alpha: 0.00126 (power 0.8010), TIE: 0.05000
Compared to nominal alpha's sample size increase of 150.0% (~study costs).

Runtime    : 96.4 seconds
Simulations: 120,700,000

Oops! Since the TIE depends on the sample size itself (see the plots in this post), we have to adjust more.

Dif-tor heh smusma 🖖
Helmut Schütz
[image]

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Complete thread:

Activity
 Admin contact
21,419 posts in 4,475 threads, 1,510 registered users;
online 11 (0 registered, 11 guests [including 4 identified bots]).
Forum time: Wednesday 12:47 CEST (Europe/Vienna)

In the Middles Ages the lingua franca of science was Latin.
Nowadays the language of science is bad English.    Anonymous

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz
HTML5