Adaptive TSD vs. “classical” GSD [Two-Stage / GS Designs]

posted by Helmut Schütz – Vienna, Austria, 2015-11-27 20:05 – Posting: # 15680

Dear all,

I received a question and suggested that the sender register at the forum, which he didn’t do. However, I think the question is interesting and would like to get your opinions. The study is planned for a USFDA submission.

“BE study will be initiated with dosing for 50% subjects of protocol and samples will be analysed; if results with 50% subjects show bioequivalence, data will be submitted to regulatory. If results are not bioequivalent, study will be continued with dosing for remaining 50% subjects and samples will be analysed; The results with all subjects (100%) will be evaluated for BE and [if] results show bioequivalence, data will be submitted to regulatory.”

OK, smells of a “classical” Group-Sequential Design with one interim at N/2. The best-guess CV is around 40% and the expected GMR 0.95¹:

library(PowerTOST)
# fixed-sample 2×2 crossover: n for CV 40%, GMR 0.95, target power 80%
sampleN.TOST(CV=0.4, theta0=0.95)

+++++++++++ Equivalence test - TOST +++++++++++
            Sample size estimation
-----------------------------------------------
Study design:  2x2 crossover
log-transformed data (multiplicative model)

alpha = 0.05, target power = 0.8
BE margins        = 0.8 ... 1.25
Null (true) ratio = 0.95,  CV = 0.4

Sample size (total)
 n     power
66   0.805252

In a TSD one would opt for a stage 1 sample size of ~75% of the fixed-sample design’s n. Hence, we would also start with 50 – the same number chosen for the GSD. Below are the results of my simulations for CVs of 30–50% (R-code²). In both tables alpha1/pwr1% refer to the interim analysis (stage 1) and alpha2/pwrN% to the final analysis; the ‘best guess’ CV is 40%.
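A quick sketch of that rule of thumb (my own illustration; assuming we round up to the next even number to keep the 2×2 crossover balanced):

library(PowerTOST)
# fixed-sample n for the ‘best guess’ CV (print=FALSE returns a data.frame)
n.fixed <- sampleN.TOST(CV=0.4, theta0=0.95, print=FALSE)[["Sample size"]]
2*ceiling(0.75*n.fixed/2)  # 75% of 66, rounded up to an even number: 50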

1. GSD
  GMR CV%  n alpha1 pwr1%  2nd%   N alpha2 pwrN%     TIE   
 0.95  30 50 0.0310 84.05 15.95 100 0.0277 98.72 0.04839 ns
 0.95  31 50 0.0310 81.66 18.34 100 0.0277 98.28 0.04839 ns
 0.95  32 50 0.0310 79.19 20.82 100 0.0277 97.73 0.04839 ns
 0.95  33 50 0.0310 76.59 23.41 100 0.0277 97.09 0.04847 ns
 0.95  34 50 0.0310 73.98 26.02 100 0.0277 96.27 0.04856 ns
 0.95  35 50 0.0310 71.32 28.68 100 0.0277 95.45 0.04856 ns
 0.95  36 50 0.0310 68.67 31.33 100 0.0277 94.43 0.04848 ns
 0.95  37 50 0.0310 65.89 34.11 100 0.0277 93.48 0.04855 ns
 0.95  38 50 0.0310 63.05 36.95 100 0.0277 92.32 0.04831 ns
 0.95  39 50 0.0310 60.27 39.73 100 0.0277 91.18 0.04839 ns
 0.95  40 50 0.0310 57.48 42.52 100 0.0277 89.95 0.04848 ns
 0.95  41 50 0.0310 54.76 45.24 100 0.0277 88.64 0.04825 ns
 0.95  42 50 0.0310 52.03 47.97 100 0.0277 87.29 0.04825 ns
 0.95  43 50 0.0310 49.28 50.72 100 0.0277 85.94 0.04849 ns
 0.95  44 50 0.0310 46.51 53.49 100 0.0277 84.48 0.04826 ns
 0.95  45 50 0.0310 43.72 56.28 100 0.0277 83.08 0.04799 ns
 0.95  46 50 0.0310 41.00 59.00 100 0.0277 81.50 0.04813 ns
 0.95  47 50 0.0310 38.44 61.56 100 0.0277 79.92 0.04777 ns
 0.95  48 50 0.0310 35.86 64.14 100 0.0277 78.32 0.04766 ns
 0.95  49 50 0.0310 33.34 66.66 100 0.0277 76.69 0.04741 ns
 0.95  50 50 0.0310 30.77 69.23 100 0.0277 75.02 0.04712 ns


[Figure: empiric Type I Error (left) and power at the interim/final analysis (right) vs. CV; Pocock–Lan/DeMets GSD with one interim]


No inflation of the Type I Error (TIE) if we use Pocock’s approach with Lan/DeMets α-spending. Power is pretty high and drops below 80% only for CVs > 46%.
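For reference, the nominal levels given in the table (0.0310 at the interim, 0.0277 at the final analysis) follow from the α-spending function; a minimal excerpt of the appendix code below:

library(ldbounds)
# Lan/DeMets spending approximating Pocock (iuse=2), one interim at N/2
bnds <- bounds(t=c(0.5, 1), iuse=c(2, 2), alpha=c(0.025, 0.025))
round(2*(1-pnorm(bnds$upper.bounds)), 4)  # 0.0310 0.0277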

2. ‘Type 1’ TSD
  GMR CV% n1 alpha1 pwr1%  2nd% E[N] alpha2 pwrN%     TIE   
 0.95  30 50 0.0302 83.74  6.62   51 0.0302 86.14 0.03192 ns
 0.95  31 50 0.0302 81.33  9.85   51 0.0302 85.20 0.03298 ns
 0.95  32 50 0.0302 78.79 13.58   52 0.0302 84.42 0.03437 ns
 0.95  33 50 0.0302 76.18 17.59   52 0.0302 83.90 0.03594 ns
 0.95  34 50 0.0302 73.56 21.53   53 0.0302 83.55 0.03756 ns
 0.95  35 50 0.0302 70.87 25.45   54 0.0302 83.31 0.03924 ns
 0.95  36 50 0.0302 68.18 29.13   56 0.0302 83.13 0.04093 ns
 0.95  37 50 0.0302 65.36 32.70   57 0.0302 83.07 0.04238 ns
 0.95  38 50 0.0302 62.51 36.19   59 0.0302 82.98 0.04353 ns
 0.95  39 50 0.0302 59.69 39.44   61 0.0302 82.87 0.04469 ns
 0.95  40 50 0.0302 56.90 42.55   64 0.0302 82.89 0.04578 ns
 0.95  41 50 0.0302 54.13 45.52   66 0.0302 82.83 0.04632 ns
 0.95  42 50 0.0302 51.37 48.42   69 0.0302 82.68 0.04717 ns
 0.95  43 50 0.0302 48.60 51.26   72 0.0302 82.64 0.04797 ns
 0.95  44 50 0.0302 45.82 54.10   75 0.0302 82.55 0.04843 ns
 0.95  45 50 0.0302 43.01 56.94   79 0.0302 82.48 0.04893 ns
 0.95  46 50 0.0302 40.31 59.65   82 0.0302 82.45 0.04909 ns
 0.95  47 50 0.0302 37.73 62.25   86 0.0302 82.33 0.04977 ns
 0.95  48 50 0.0302 35.16 64.83   90 0.0302 82.23 0.04949 ns
 0.95  49 50 0.0302 32.58 67.41   95 0.0302 82.13 0.04975 ns
 0.95  50 50 0.0302 30.02 69.98   99 0.0302 82.03 0.04963 ns


[Figure: empiric Type I Error (left) and power at the interim/final analysis (right) vs. CV; ‘Type 1’ adaptive TSD]


No inflation of the TIE. Power in the first stage is similar to the GSD’s (since the alphas are similar). Overall power is more consistent and doesn’t drop below the target of 80%.
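To reproduce a single row of the table without running the entire loop – here the ‘best guess’ CV of 40%, which should give ≈83% overall power and E[N] ≈ 64 – one call suffices (same settings as in the appendix code):

library(Power2Stage)
# ‘Type 1’ TSD: the same adjusted alpha (0.0302) in both stages, n1 = 50
power.2stage(alpha=rep(0.0302, 2), n1=50, CV=0.4, theta0=0.95)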

Now my questions (especially @Ben). Even if the CV turns out lower than the ‘best guess’, in the GSD we have to go full throttle with another 50 subjects whenever we proceed. Compare the column “2nd%”, which gives the chance of proceeding to the second part: not only is that chance higher in the GSD, we are also punished with another fixed 50 subjects. Have a look at the TSD’s column “E[N]”, giving the expected average total sample size – much lower, because often we need just a few more subjects and not another 50. Only for high CVs do the TSD’s sample sizes approach the GSD’s. A nice side effect: if we start the TSD with 75% of the fixed-sample design’s n, on average the total sample size will even be (slightly) lower (64 < 66).
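A back-of-the-envelope check at the ‘best guess’ CV of 40%, based on the “2nd%” column of table 1 (my own arithmetic, not part of the simulations):

# expected total sample size of the GSD: n1 + P(second part) * n2
50 + 0.4252*50  # ~71 subjects on average, vs E[N] = 64 of the TSD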
Given all that: Why should one use a GSD instead of a TSD?


  1. Edit: I misinterpreted the question. He was talking about 50% (regardless of the sample size) – not n = 50.
  2. R-code
    2.1. GSD (14 seconds on my machine)
    library(ldbounds)
    library(Power2Stage)
    ## More than one interim is possible in GSDs. However, since
    ## that is not acceptable according to the BE guidelines, only
    ## one interim (at an arbitrary time) is implemented in Power2Stage.
    GMR    <- 0.95
    n      <- c(50, 50)
    CV     <- seq(0.3, 0.5, 0.01)
    cum    <- vector("numeric", length=length(n))
    for (j in seq_along(n)) cum[j] <- max(cum) + n[j]
    t      <- cum/max(cum) ## information fraction of the interim
    alpha  <- rep(0.05/2, 2)
    iuse   <- rep(2, 2)
    bnds   <- bounds(t=t, iuse=iuse, alpha=alpha)
    alpha  <- round(2*(1-pnorm(bnds$upper.bounds)), 4)
    ## significance limit: empiric TIE significantly >0.05 (1e6 sims)
    sig    <- binom.test(x=0.05*1e6, n=1e6, alternative='less',
                conf.level=1-0.05)$conf.int[2]
    res    <- matrix(nrow=length(CV), ncol=11, byrow=TRUE,
                dimnames=list(NULL, c("GMR", "CV%",
                                      "n", "alpha1", "pwr1%", "2nd%",
                                      "N", "alpha2", "pwrN%", "TIE", " ")))
    ptm    <- proc.time()
    for (j in seq_along(CV)) {
      tmp1 <- power.2stage.GS(alpha=alpha, n=n, CV=CV[j], theta0=GMR)
      tmp2 <- power.2stage.GS(alpha=alpha, n=n, CV=CV[j], theta0=1.25, nsims=1e6)
      res[j,  1] <- sprintf("%.2f", GMR)
      res[j,  2] <- sprintf("%.0f", 100*CV[j])
      res[j,  3] <- sprintf("%.0f", n[1])
      res[j,  4] <- sprintf("%.4f", alpha[1])
      res[j,  5] <- sprintf("%.2f", 100*tmp1$pBE_s1)
      res[j,  6] <- sprintf("%.2f", tmp1$pct_s2)
      res[j,  7] <- sprintf("%.0f", max(cum))
      res[j,  8] <- sprintf("%.4f", alpha[2])
      res[j,  9] <- sprintf("%.2f", 100*tmp1$pBE)
      res[j, 10] <- sprintf("%.5f", tmp2$pBE)
      res[j, 11] <- "ns"
      if (tmp2$pBE > sig) res[j, 11] <- "*"
    }
    run.time <- proc.time()-ptm
    cat("Runtime:",signif(run.time[3], 3),
      " seconds\n");print(as.data.frame(res), row.names=F)
    op       <- par(no.readonly=TRUE)
    par(mfrow = c(1, 2),
        oma=c(0, 0, 5, 0),
        mar=c(4, 4, 0, 1))
    plot(res[, 2], res[, 10], xlab="CV (%)", ylab="empiric Type I Error",
      pch=16, col="#0000FF")
    abline(h=c(0.05, sig), lty=c(1, 3), col=c("#0000FF", "#FF0000"))
    plot(res[, 2], res[, 9], xlab="CV (%)", ylab="power (%)",
      ylim=c(70, 100), pch=16, col="#008000")
    points(res[, 2], res[, 5], pch=16, col="#0000FF")
    abline(h=80, col="#008000")
    legend("topright", legend=c("at interim", "final"), pch=rep(16, 2),
      col=c("#0000FF", "#008000"), bty="n")
    main.txt <- paste0("Pocock-Lan/DeMets Group Sequential Design\n",
                       "with one interim analysis, expected GMR: ", GMR, ".\n",
                       "Cumulative sample sizes: ")
    n.txt <- ""
    for (j in seq_along(cum)) {
      if (j < length(cum)) {
        n.txt <- paste0(n.txt, cum[j], " (\u03b1 ",
                   sprintf("%.4f", alpha[j]), "), ")
      } else {
        n.txt <- paste0(n.txt, cum[j], " (\u03b1 ",
                   sprintf("%.4f", alpha[j]), ").\n")
      }
    }
    main.txt <- paste0(main.txt, n.txt)
    title(main.txt, outer=TRUE)
    par(op)


    2.2. TSD (be patient; 7 minutes on my machine)
    library(Power2Stage)
    GMR    <- 0.95
    n1     <- 50
    CV     <- seq(0.3, 0.5, 0.01)
    alpha  <- rep(0.0302, 2) ## same adjusted alpha in both stages
    ## significance limit: empiric TIE significantly >0.05 (1e6 sims)
    sig    <- binom.test(x=0.05*1e6, n=1e6, alternative='less',
                conf.level=1-0.05)$conf.int[2]
    res    <- matrix(nrow=length(CV), ncol=11, byrow=TRUE,
                dimnames=list(NULL, c("GMR", "CV%",
                                      "n1", "alpha1", "pwr1%", "2nd%",
                                      "E[N]", "alpha2", "pwrN%", "TIE", " ")))
    ptm    <- proc.time()
    for (j in seq_along(CV)) {
      tmp1 <- power.2stage(alpha=alpha, n1=n1, CV=CV[j], theta0=GMR)
      tmp2 <- power.2stage(alpha=alpha, n1=n1, CV=CV[j], theta0=1.25, nsims=1e6)
      res[j,  1] <- sprintf("%.2f", GMR)
      res[j,  2] <- sprintf("%.0f", 100*CV[j])
      res[j,  3] <- sprintf("%.0f", n1)
      res[j,  4] <- sprintf("%.4f", alpha[1])
      res[j,  5] <- sprintf("%.2f", 100*tmp1$pBE_s1)
      res[j,  6] <- sprintf("%.2f", tmp1$pct_s2)
      res[j,  7] <- sprintf("%.0f", tmp1$nmean)
      res[j,  8] <- sprintf("%.4f", alpha[2])
      res[j,  9] <- sprintf("%.2f", 100*tmp1$pBE)
      res[j, 10] <- sprintf("%.5f", tmp2$pBE)
      res[j, 11] <- "ns"
      if (tmp2$pBE > sig) res[j, 11] <- "*"
    }
    run.time <- proc.time()-ptm
    cat("Runtime:",signif(run.time[3]/60, 3),
      " minutes\n");print(as.data.frame(res), row.names=F)
    op       <- par(no.readonly=TRUE)
    par(mfrow = c(1, 2),
        oma=c(0, 0, 5, 0),
        mar=c(4, 4, 0, 1))
    plot(res[, 2], res[, 10], xlab="CV (%)", ylab="empiric Type I Error",
      pch=16, col="#0000FF")
    abline(h=c(0.05, sig), lty=c(1, 3), col=c("#0000FF", "#FF0000"))
    plot(res[, 2], res[, 9], xlab="CV (%)", ylab="power (%)",
      ylim=c(70, 100), pch=16, col="#008000")
    points(res[, 2], res[, 5], pch=16, col="#0000FF")
    abline(h=80, col="#008000")
    legend("topright", legend=c("at interim", "final"), pch=rep(16, 2),
      col=c("#0000FF", "#008000"), bty="n")
    main.txt <- paste0("\u2018Type 1\u2019 Adaptive Two-Stage Sequential Design,",
                       "\nexpected GMR: ", GMR, ".\n",
                       "Stage 1 sample size: ", n1, ", \u03b1 in both stages: ",
                       sprintf("%.4f", alpha[1]), ".")
    title(main.txt, outer=TRUE)
    par(op)


Helmut Schütz
