Likely it does not work (potentially inflated Type I Error) [Two-Stage / GS Designs]

posted by Helmut – Vienna, Austria, 2023-12-19 12:10 – Posting: # 23796

Dear all,

I could not resist and had a closer look (a lengthy R script at the end).

Say, we have a 2-sequence 4-period (full) replicate design and start the study in 16 subjects (n1).
We observe a CVwR of 0.30. Since swR < 0.294, we have to go with ABE (no scaling). Power based on a fixed GMR 0.95 is below the target of 0.80. Hence, we initiate a second stage. With Pocock’s adjusted α 0.0294 we recruit 6 subjects (n2). We observe a CVwR of 0.30 again and GMR 0.92 in the final analysis of pooled data.
Power will be only 0.7093 since the GMR is worse than assumed. However, the Type I Error will be significantly inflated (0.0861 > α). To control the Type I Error, we would have needed an adjusted α of 0.0149 or less, which is substantially lower than the one we used.
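
The branch decision of the interim can be checked in base R. This is only a minimal sketch: the helper names cv2swr() and branch() are made up for illustration, and the switching value of 0.294 is the one quoted above.

```r
# Hypothetical helpers, not part of the script at the end of this post
cv2swr <- function(CV) sqrt(log(CV^2 + 1))  # within-subject SD from the CV

branch <- function(CVwR, switch = 0.294) {  # switching condition on swR
  ifelse(cv2swr(CVwR) < switch, "ABE", "RSABE")
}

round(cv2swr(0.30), 4)  # 0.2936, i.e., just below 0.294
branch(c(0.30, 0.35))   # "ABE" "RSABE"
```

With a CVwR of exactly 0.30 we end up in the ABE-branch by a hair, which is why the example is so sensitive to tiny changes of the CVwR.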

Call the script with the example’s data:

RSABE.TSD(adj = 0.0294, design = "2x2x4", n1 = 16, CVwR = 0.3, GMR = 0.95,
          target = 0.8, CVwR.2 = 0.3, GMR.2 = 0.92)
adjusted alpha : 0.0294 (Pocock’s for superiority)
design         : 2x2x4
n1             :  16
futility on N  : none
CVwR           : 0.3000 (observed)
theta1         : 0.8000 (lower implied limit)
theta2         : 1.2500 (upper implied limit)
power          : 0.6812 (estimated)
Stage 2 initiated (insufficient power in stage 1)
GMR            : 0.9500 (fixed)
target power   : 0.8000 (fixed)
n2             :   6
N              :  22; less than the FDA’s minimum of 24 subjects!
CVwR           : 0.3000 (observed)
GMR            : 0.9200 (observed)
theta1         : 0.8000 (lower implied limit)
theta2         : 1.2500 (upper implied limit)
power          : 0.7093 (estimated, may pass RSABE)
empirical TIE  : 0.0861 (all publications; significantly inflated)
An adjusted alpha of 0.0149 (or less)
would be needed to control the Type I Error.
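
Whether an empirical Type I Error is flagged as ‘significantly inflated’ depends on the number of simulations. The script compares it with the upper limit of a one-sided 95% binomial confidence interval around the nominal 0.05 for 10^6 simulations. A stand-alone sketch in base R (the function name sig.limit() is hypothetical):

```r
# Hypothetical helper: significance limit of a simulated Type I Error
sig.limit <- function(alpha = 0.05, nsims = 1e6) {
  binom.test(x = round(alpha * nsims), n = nsims,
             alternative = "less", conf.level = 0.95)$conf.int[2]
}

sig <- sig.limit()
sig           # slightly above 0.05
0.0861 > sig  # TRUE: the TIE reported above is significantly inflated
```

With fewer simulations the limit moves further away from 0.05, so a smaller inflation would go unnoticed.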

Note that the Type I Error in RSABE depends strongly on the sample size. Hence, even if we re-estimate the sample size with an adjusted α of 0.0149 we would see an inflated Type I Error of 0.0546 due to the larger stage 2 sample size of 10 subjects. For a total sample size (N) of 26 subjects we would need an even smaller adjusted α of 0.01333…
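
To see this dependence for yourself, the empirical Type I Error can be simulated at the upper implied limit for a range of total sample sizes. The following is a rough sketch, assuming the PowerTOST package is installed; tie.by.n() is a hypothetical helper, and it uses only 10^5 simulations (the script at the end uses 10^6) to keep the runtime short, so the results will differ slightly from the values quoted in this post.

```r
# Hypothetical helper: empirical Type I Error for several total sample sizes
tie.by.n <- function(alpha, CVwR = 0.30, n = c(22, 26, 30),
                     design = "2x2x4", nsims = 1e5) {
  stopifnot(requireNamespace("PowerTOST", quietly = TRUE))
  # upper implied limit = true GMR under the null hypothesis
  theta2 <- PowerTOST::scABEL(CV = CVwR, regulator = "FDA")[["upper"]]
  tie    <- vapply(n, function(N) {
    PowerTOST::power.RSABE(alpha = alpha, CV = CVwR, theta0 = theta2,
                           n = N, design = design, nsims = nsims)
  }, numeric(1))
  data.frame(alpha = alpha, CVwR = CVwR, N = n, TIE = tie)
}

if (requireNamespace("PowerTOST", quietly = TRUE)) {
  print(tie.by.n(alpha = 0.0149))
}
```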
Try a fixed GMR of 0.90 – which is more realistic for HVD(P)s – and you will be surprised.
Note also that in the RSABE-branch (swR ≥ 0.294) the empirical Type I Error drops to the adjusted α for a CVwR infinitesimally greater than 30%. Example above changed to:

CVwR.2 = 0.30 + 1e-9

Part of the output:

theta1         : 0.7695 (lower implied limit)
theta2         : 1.2996 (upper implied limit)
empirical TIE  : 0.0294 (all publications)

For any larger CVwR the empirical Type I Error will be lower than the adjusted α.
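
The jump of the implied limits at the switching CVwR of 30% is behind this behaviour. A small base-R sketch (implied.limits() is a hypothetical helper following the ‘implied limits’ convention of footnote 2) reproduces the theta1 / theta2 rows of the outputs in this post:

```r
# Hypothetical helper: 'implied limits' with the switch at CVwR 30%
implied.limits <- function(CVwR) {
  theta.s <- log(1.25) / 0.25   # regulatory constant
  swR     <- sqrt(log(CVwR^2 + 1))
  if (CVwR <= 0.30) {           # ABE-branch: conventional limits
    c(lower = 0.80, upper = 1.25)
  } else {                      # RSABE-branch: scaled limits
    exp(c(lower = -1, upper = +1) * theta.s * swR)
  }
}

CVs <- c(0.30, 0.30 + 1e-9, 0.32, 0.35)
res <- round(t(sapply(CVs, implied.limits)), 4)
res
# lower: 0.8000 0.7695 0.7568 0.7383
# upper: 1.2500 1.2996 1.3214 1.3545
```

An increase of the CVwR by 1e-9 moves the limits from 0.8000 – 1.2500 to 0.7695 – 1.2996, i.e., the true GMR at which the Type I Error is simulated changes abruptly.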

Of course, increasing the sample size in stage 1 does not help in the ABE-branch (swR < 0.294).

RSABE.TSD(adj = 0.0304, design = "2x2x4", n1 = 38, ...)
adjusted alpha : 0.0304 (Pocock’s for equivalence)
design         : 2x2x4
n1             :  38
futility on N  : none
CVwR           : 0.3000 (observed)
theta1         : 0.8000 (lower implied limit)
theta2         : 1.2500 (upper implied limit)
power          : 0.9708 (estimated)
Study stopped in stage 1 (sufficient power)
empirical TIE  : 0.1122 (all publications; significantly inflated)
An adjusted alpha of 0.0100 (or less)
would be needed to control the Type I Error.


Let’s go fully adaptive, i.e., use the observed stage 1 GMR rather than a fixed one. In the final analysis the GMR is worse than in the first stage and the CVwR lower. We use some of the defaults. We have to increase our target power. That’s a guessing game because in the interim we don’t know what will happen in the second stage.

RSABE.TSD(adj = 0.0304, design = "2x2x4", n1 = 24, CVwR = 0.35, GMR = 0.9,
          usePE = TRUE, target = 0.9, CVwR.2 = 0.32, GMR.2 = 0.88)
adjusted alpha : 0.0304 (Pocock’s for equivalence)
design         : 2x2x4
n1             :  24
futility on N  : none
CVwR           : 0.3500 (observed)
theta1         : 0.7383 (lower implied limit)
theta2         : 1.3545 (upper implied limit)
power          : 0.6906 (estimated)
Stage 2 initiated (insufficient power in stage 1)
GMR            : 0.9000 (observed)
target power   : 0.9000 (fixed)
n2             :  20
N              :  44
CVwR           : 0.3200 (observed)
GMR            : 0.8800 (observed)
theta1         : 0.7568 (lower implied limit)
theta2         : 1.3214 (upper implied limit)
power          : 0.7762 (estimated, may pass RSABE)
empirical TIE  : 0.0271 (all publications)


Only if you are a devout follower of the FDA church and believe in the ‘desired consumer risk model’,¹ run the first example with

RSABE.TSD(adj = 0.0294, design = "2x2x4", n1 = 16, CVwR = 0.3, GMR = 0.95,
          target = 0.8, CVwR.2 = 0.3, GMR.2 = 0.92, risk = TRUE)
adjusted alpha : 0.0294 (Pocock’s for superiority)
design         : 2x2x4
n1             :  16
futility on N  : none
CVwR           : 0.3000 (observed)
theta1         : 0.8000 (lower implied limit)
                 0.7695 (lower limit of the ‘desired consumer risk model’)
theta2         : 1.2500 (upper implied limit)
                 1.2996 (upper limit of the ‘desired consumer risk model’)
power          : 0.6812 (estimated)
Stage 2 initiated (insufficient power in stage 1)
GMR            : 0.9500 (fixed)
target power   : 0.8000 (fixed)
n2             :   6
N              :  22; less than the FDA’s minimum of 24 subjects!
CVwR           : 0.3000 (observed)
GMR            : 0.9200 (observed)
theta1         : 0.8000 (lower implied limit)
                 0.7695 (lower limit of the ‘desired consumer risk model’)
theta2         : 1.2500 (upper implied limit)
                 1.2996 (upper limit of the ‘desired consumer risk model’)
power          : 0.8354 (estimated, may pass RSABE)
empirical TIE  : 0.0861 (all publications; significantly inflated)
                 0.0294 (‘desired consumer risk model’)
An adjusted alpha of 0.0149 (or less)
would be needed to control the Type I Error.

By means of Harry Potter’s magic wand, the inflation of the Type I Error apparently disappears because the null hypothesis is assessed at wider limits.² If you don’t want to chew reference #4 in the post above, maybe the first four slides of a contribution to the discussion (5th GBHI International Workshop. Amsterdam, 28 September 2022) will help.


  1. Davit BM, Chen ML, Conner DP, Haidar SH, Kim S, Lee CH, Lionberger RA, Makhlouf FT, Nwakama PE, Patel DT, Schuirmann DJ, Yu LX. Implementation of a Reference-Scaled Average Bioequivalence Approach for Highly Variable Generic Drug Products by the US Food and Drug Administration. AAPS J. 2012; 14(4): 915–24. doi:10.1208/s12248-012-9406-x. Free full text.
  2. The FDA’s regulatory constant \(\small{\theta_\text{s}=\log_{e}(1.25)/0.25\cong0.8925742\ldots}\)
    The null hypothesis on inequivalence \(\small{\theta_0}\) is assessed in the ‘desired consumer risk model’:
    • If \(\small{s_\text{wR}\leq0.25}\) with \(\small{\theta_0=\left\{0.8000,1.2500\right\}}\)
    • If \(\small{s_\text{wR}>0.25}\) with \(\small{\theta_0=\exp(\mp\theta_\text{s}\times s_\text{wR})}\)
    That’s different to the ‘implied limits’ assessed in all (‼) other publications:
    • If \(\small{CV_\text{wR}\leq30\%}\) with \(\small{\theta_0=\left\{0.8000,1.2500\right\}}\)
    • If \(\small{CV_\text{wR}>30\%}\) with \(\small{\theta_0=\exp(\mp\theta_\text{s}\times s_\text{wR})}\)
    A simple script to calculate the null hypotheses:
      nulls <- function(CVwR, risk = FALSE) { # null hypotheses in RSABE
        theta.s <- log(1.25) / 0.25           # regulatory constant
        swR     <- sqrt(log(CVwR^2 + 1))      # within-subject standard deviation of R
        if (risk) {                           # ‘desired consumer risk model’
          if (swR <= 0.25) {
            thetas <- c(0.8, 1.25)
          } else {
            thetas <- exp(c(-1, +1) *  theta.s * swR)
          }
        } else {                              # ‘implied limits’
          if (CVwR <= 0.3) {
            thetas <- c(0.8, 1.25)
          } else {
            thetas <- exp(c(-1, +1) *  theta.s * swR)
          }
        }
        names(thetas) <- c("H0.1", "H0.2")
        return(thetas)
      }

      Examples (CVwR = 0.27, swR ≈ 0.2652645…):
      nulls(CVwR = 0.27, risk = TRUE)
           H0.1      H0.2
      0.7891741 1.2671474

      nulls(CVwR = 0.27, risk = FALSE)
      H0.1 H0.2
      0.80 1.25


RSABE.TSD <- function(adj = 0.0294, design = "2x2x4", n1, CVwR, GMR = 0.9, target = 0.8,
                      usePE = FALSE, nmax = Inf, final = TRUE, CVwR.2, GMR.2 = 0.9,
                      risk = FALSE, details = TRUE) {
  # adj    : adjusted alpha (stage 1 and final analysis) like Potvin ‘Method B’
  # design : "2x2x4": 2-sequence 4-period full replicate
  #          "2x2x3": 2-sequence 3-period full replicate
  #          "2x3x3": 3-sequence 3-period partial replicate
  # n1     : stage 1 sample size
  # CVwR   : within-subject CV of R in stage 1
  # GMR    : T/R-ratio used in stage 1 (fixed or observed, see usePE)
  # target : target power for the interim decision and sample size re-estimation
  # usePE  : FALSE: use the fixed GMR, TRUE: use the observed GMR
  # nmax   : futility on total sample size
  # final  : TRUE : final analysis (requires CVwR.2 and GMR.2)
  #          FALSE: interim analysis only
  # CVwR.2 : within-subject CV of R in the final analysis (pooled data)
  # GMR.2  : T/R-ratio in the final analysis (pooled data)
  # risk   : FALSE: TIE acc. to all publications based on the ‘implied limits’
  #          TRUE : additionally TIE acc. to the ‘desired consumer risk model’
  # details: TRUE : output to the console
  #          FALSE: data.frame of results

  if (!design %in% c("2x2x4", "2x2x3", "2x3x3"))
    stop ("design ", design, " not supported.")
  if (missing(n1)) stop ("n1 must be given.")
  if (missing(CVwR)) stop ("CVwR must be given.")
  if (GMR <= 0.8 | GMR >= 1.25) stop ("GMR must be within 0.8 – 1.25.")
  if (nmax <= n1) stop ("nmax <= n1 does not make sense.")
  if (target <= 0.5 | target >= 1)
    stop ("target ", target, " does not make sense.")
  if (final) {
    if (missing(CVwR.2)) stop ("CVwR.2 must be given.")
    if (missing(GMR.2)) stop ("GMR.2 must be given.")
  }
  suppressMessages(require(PowerTOST))             # ≥1.5-4 (2022-02-21)
  limits <- function(CVwR, risk = FALSE) {         # limits
    thetas <- scABEL(CV = CVwR, regulator = "FDA") # implied
    if (risk) {                                    # ‘desired consumer risk model’
      swR <- CV2se(CVwR)
      if (swR > 0.25) {
        thetas <- setNames(exp(c(-1, +1) * log(1.25) / 0.25 * swR),
                           c("lower", "upper"))
      } else {
        thetas <- setNames(c(0.8, 1.25), c("lower", "upper"))
      }
    }
    return(thetas)
  }
  power <- function(alpha = 0.05, CVwR, GMR, n, design) {
    return(power.RSABE(alpha = alpha, CV = CVwR, theta0 = GMR,
                       n = n, design = design))
  }
  TIE <- function(alpha = 0.05, CVwR, n, design, risk) {
    return(power.RSABE(alpha = alpha, CV = CVwR, n = n,
                       theta0 = limits(CVwR, risk)[["upper"]],
                       design = design, nsims = 1e6))
  }
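  TIE.1.1 <- TIE.1.2 <- TIE.2.1 <- TIE.2.2 <- NA
  N <- pwr.2 <- req <- NA            # defaults; otherwise the data.frame fails if not computed
  if (!final) CVwR.2 <- GMR.2 <- NA  # no final analysis requested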
  pwr.1  <- power(adj, CVwR, GMR, n = n1, design)
  sig    <- binom.test(0.05 * 1e6, 1e6, alternative = "less",
                       conf.level = 0.95)$conf.int[2]
  txt    <- paste("adjusted alpha :", sprintf("%.4f", adj))
  if (adj == 0.0294) {
    txt <- paste(txt, "(Pocock’s for superiority)")
  } else {
    if (adj == 0.0304) {
      txt <- paste(txt, "(Pocock’s for equivalence)")
    } else {
      if (adj == 0.0250) {
        txt <- paste(txt, "(Bonferroni’s for two tests)")
      } else {
        txt <- paste(txt, "(custom)")
      }
    }
  }
  txt    <- paste(txt, "\ndesign         :", design,
                  "\nn1             :", sprintf("%3.0f", n1))
  if (nmax < Inf) {
    txt <- paste(txt, "\nfutility on N  :", sprintf("%3.0f", nmax))
  } else {
    txt <- paste(txt, "\nfutility on N  : none")
  }
  txt    <- paste(txt, "\nCVwR           :", sprintf("%.4f (observed)", CVwR),
                  "\ntheta1         :", sprintf("%.4f (lower implied limit)",
                                                limits(CVwR)[["lower"]]))
  if (risk) {
    txt <- paste(txt, "\n                ",
                 sprintf("%.4f (lower limit of the ‘desired consumer risk model’)",
                         limits(CVwR, risk)[["lower"]]))
  }
  txt    <- paste(txt, "\ntheta2         :", sprintf("%.4f (upper implied limit)",
                                                     limits(CVwR)[["upper"]]))
  if (risk) {
    txt <- paste(txt, "\n                ",
                 sprintf("%.4f (upper limit of the ‘desired consumer risk model’)",
                         limits(CVwR, risk)[["upper"]]))
  }
  txt    <- paste(txt, "\npower          :", sprintf("%.4f (estimated)", pwr.1))
  if (pwr.1 >= target) { # stop in the interim
    TIE.1.1 <- TIE(adj, CVwR, n1, design, risk = FALSE)
    txt     <- paste0(txt, "\nStudy stopped in stage 1 (sufficient power)",
                      "\nempirical TIE  :", sprintf(" %.4f", TIE.1.1),
                      " (all publications")
    if (TIE.1.1 > sig) {
      txt <- paste0(txt, "; significantly inflated)")
    } else {
      txt <- paste0(txt, ")")
    }
    if (risk) {
      TIE.1.2 <- TIE(adj, CVwR, n1, design, risk = TRUE)
      txt     <- paste(txt, "\n                ", sprintf("%.4f", TIE.1.2),
                       "(‘desired consumer risk model’")
      if (TIE.1.2 > sig) {
        txt <- paste0(txt, "; significantly inflated)")
      } else {
        txt <- paste0(txt, ")")
      }
    }
    if (TIE.1.1 > sig) {
      req <- scABEL.ad(alpha.pre = adj, theta0 = GMR, CV = CVwR,
                       design = design, regulator = "FDA", n = n1,
                       print = FALSE, details = FALSE)[["alpha.adj"]]
      txt <- paste(txt, "\nAn adjusted alpha of", sprintf("%.4f", req),
                   "(or less)\nwould be needed to control the Type I Error.")
    }
  } else {               # initiate stage 2
    N       <- sampleN.RSABE(alpha = adj, CV = CVwR, theta0 = GMR,
                             targetpower = target, design = design,
                             print = FALSE, details = FALSE)[["Sample size"]]
    if (N > nmax) {
      txt <- paste(txt, "\nStage 2 not initiated",
                        "(insufficient power in stage 1\nbut total sample size",
                        N, "above futility limit)")
    } else {
      if (final) {
        pwr.2 <- power(adj, CVwR.2, GMR.2, n = N, design)
        if (GMR.2 <= 0.8 | GMR.2 >= 1.25) {
          final.est <- FALSE
        } else {
          final.est <- TRUE
          TIE.2.1 <- TIE(adj, CVwR.2, N, design, risk = FALSE)
          if (risk) TIE.2.2 <- TIE(adj, CVwR.2, N, design, risk = TRUE)
        }
      } else {
        CVwR.2 <- GMR.2 <- pwr.2 <- TIE.2.1 <- TIE.2.2 <- NA
        theta1.1 <- theta1.2 <- req <- NA
      }
      txt <- paste(txt, "\nStage 2 initiated (insufficient power in stage 1)",
                   "\nGMR            :", sprintf("%.4f", GMR))
      if (usePE) {
        txt <- paste(txt, "(observed)")
      } else {
        txt <- paste(txt, "(fixed)")
      }
      txt <- paste(txt, "\ntarget power   :", sprintf("%.4f (fixed)", target),
                   "\nn2             :", sprintf("%3.0f", N - n1),
                   "\nN              :", sprintf("%3.0f", N))
      if (N < 24) txt <- paste0(txt, "; less than the FDA’s minimum of 24 subjects!")
      if (final) {
        txt   <- paste(txt, "\nCVwR           :", sprintf("%.4f (observed)", CVwR.2),
                       "\nGMR            :", sprintf("%.4f (observed)", GMR.2),
                       "\ntheta1         :", sprintf("%.4f (lower implied limit)",
                                                     limits(CVwR.2)[["lower"]]))
        if (risk) {
          txt <- paste(txt, "\n                ",
                       sprintf("%.4f (lower limit of the ‘desired consumer risk model’)",
                               limits(CVwR.2, risk)[["lower"]]))
        }
        txt   <- paste(txt, "\ntheta2         :",
                       sprintf("%.4f (upper implied limit)", limits(CVwR.2)[["upper"]]))
        if (risk) {
          txt <- paste(txt, "\n                ",
                       sprintf("%.4f (upper limit of the ‘desired consumer risk model’)",
                               limits(CVwR.2, risk)[["upper"]]))
        }
        txt <- paste(txt, "\npower          :", sprintf("%.4f (estimated,", pwr.2))
        if (pwr.2 < 0.5) {
          txt <- paste(txt, "fails RSABE)")
        } else {
          txt <- paste(txt, "may pass RSABE)")
        }
        if (final.est) {
          txt <- paste0(txt, "\nempirical TIE  :", sprintf(" %.4f", TIE.2.1),
                        " (all publications")
          if (TIE.2.1 > sig) {
            txt <- paste0(txt, "; significantly inflated)")
          } else {
            txt <- paste0(txt, ")")
          }
          if (risk) {
            txt <- paste(txt, "\n                ", sprintf("%.4f", TIE.2.2),
                                 "(‘desired consumer risk model’")
            if (TIE.2.2 > sig) {
              txt <- paste0(txt, "; significantly inflated)")
            } else {
              txt <- paste0(txt, ")")
            }
          }
          if (TIE.2.1 > sig) {
            req <- scABEL.ad(alpha.pre = adj, theta0 = GMR.2, CV = CVwR.2,
                             design = design, regulator = "FDA", n = N,
                             print = FALSE, details = FALSE)[["alpha.adj"]]
            txt <- paste(txt, "\nAn adjusted alpha of", sprintf("%.4f", req),
                         "(or less)\nwould be needed to control the Type I Error.")
          }
        }
      }
    }
  }
  if (details) { # output to the console
    cat(txt, "\n")
  } else {       # data.frame of results
    # limits in stage 1
    L.1.1 <- limits(CVwR, FALSE)[["lower"]]
    U.1.1 <- limits(CVwR, FALSE)[["upper"]]
    L.1.2 <- U.1.2 <- NA
    if (risk) {
      L.1.2 <- limits(CVwR, TRUE)[["lower"]]
      U.1.2 <- limits(CVwR, TRUE)[["upper"]]
    }
    if (final) {
      # limits in the final analysis
      L.2.1 <- limits(CVwR.2, FALSE)[["lower"]]
      U.2.1 <- limits(CVwR.2, FALSE)[["upper"]]
      L.2.2 <- U.2.2 <- NA
      if (risk) {
        L.2.2 <- limits(CVwR.2, TRUE)[["lower"]]
        U.2.2 <- limits(CVwR.2, TRUE)[["upper"]]
      }
    } else {
      L.2.1 <- U.2.1 <- L.2.2 <- U.2.2 <- NA
    }
    result <- data.frame(alpha.adj = adj, design = design, n1 = n1, CVwR = CVwR,
                         GMR = GMR, usePE = usePE, nmax = nmax, risk.model = risk,
                         L.1.1 = L.1.1, U.1.1 = U.1.1,
                         L.1.2 = L.1.2, U.1.2 = U.1.2, power.1 = pwr.1,
                         TIE.1.1 = TIE.1.1, TIE.1.2 = TIE.1.2,
                         n2 = N - n1, N = N, CVwR.2 = CVwR.2, GMR.2 = GMR.2,
                         L.2.1 = L.2.1, U.2.1 = U.2.1,
                         L.2.2 = L.2.2, U.2.2 = U.2.2,
                         power.2 = pwr.2, TIE.2.1 = TIE.2.1, TIE.2.2 = TIE.2.2,
                         alpha.req = req)
    result <- result[, colSums(is.na(result)) < nrow(result)]
    return(result)
  }
}


Dif-tor heh smusma 🖖🏼 Long live Ukraine!
Helmut Schütz