Exploring package adaptIVPT, function rss() [Two-Stage / GS Designs]
I was wrong in this post; the adjusted α can be specified indeed.
By trial and error (the documentation is not particularly helpful) I discovered that only a 2×2×4 full replicate design is implemented. Contrary to the functions of
PowerTOST
unequal variances of T and R are not supported. The total sample size (N) can be an odd number and has to be rounded up in order to obtain balanced sequences.I could reproduce the examples of my yesterday’s post. -script at the end.
The first example:
RSS(n1 = 16, CVwR = 0.3, GMR = 0.95)
Sample size re-estimation by package ‘adaptIVPT’;
only the 2x2x4 full replicate design is supported!
adjusted alpha : 0.0294
n1 : 16
futility on N : 100
CVwR : 0.3000 (observed)
swR : 0.2936 (observed)
GMR : 0.9500 (fixed)
target power : 0.8000 (fixed)
Stage 2 initiated (insufficent power in stage 1)
n2 : 6
N : 22; less than the FDA’s minimum of 24 subjects!
However, it is not possible to estimate power and the empirical Type I Error.
By ‘hand’ with yesterday’s GMR 0.92 in the final analysis:
library(PowerTOST)
x <- setNames(c(power.RSABE(alpha = 0.0294, CV = 0.3, n = 22, design = "2x2x4", theta0 = 0.92),
power.RSABE(alpha = 0.0294, CV = 0.3, n = 22, design = "2x2x4",
theta0 = scABEL(CV = 0.3, regulator = "FDA")[["upper"]],
nsims = 1e6)), c("power", "TIE"))
print(round(x, 4))
power TIE
0.7093 0.0861
Verdict:
- Only the 2-sequence 4-period full replicate design is implemented. That’s a substantial limitation.
- Equal variances of T and R have to be assumed.
- Faster than
sampleN.RSABE()
. However, its measured runtime is about two magnitudes longer than the one reported in the list of results… Not all that glitters is gold.
- The method has a futility criterion
nmax
(default 100) on the total sample size, i.e., a re-estimatedN
>nmax
is considered a failure in stage 1. If you want no futility like in the original Potvin methods, you can’t setnmax = Inf
as in the packagePower2Stage
. Set it to a large value instead.
- It’s impossible to estimate power (in stage 1 and the final analysis) and the empirical Type I Error directly. Therefore,
power.RSABE()
is still required.
- Results are not reproducible (see the comment below).
I discovered that some tweaks are needed. It’s bad coding practice in simulations that the seed of the PRNG is not fixed (obviously it’s random). The user should have the option to use a random seed to assess reproducibility (hence, in the functions of PowerTOST
and Power2Stage
the argument setseed
). Working with seeds in parallel computing requires special attention. However, I saw no improvement with only one core. Still some irreproducibility with 105 simulations (the function’s default are only 1,000). With 106 results are stable but that’s brute force.
In the script I opted for multiple calls of rss()
and use the maximum of the re-estimated stage 2 sample sizes.
Why the heck do we have to debug code – for free! – which was developed in a project sponsored by the FDA?
If debugging is the process of removing bugs,
then programming must be the process of putting them in. Edsger W. Dijkstra
RSS <- function(adj = 0.0294, n1, CVwR, GMR = 0.95, nmax = 100, target = 0.8,
calls = 10, details = TRUE, inter = FALSE) {
suppressMessages(require(adaptIVPT))
up2even <- function(x) { 2 * (x %/% 2 + as.logical(x %% 2)) }
if (missing(n1)) stop ("n1 must be given.")
if (missing(CVwR)) stop ("CVwR mst be given.")
if (GMR <= 0.8 | GMR >= 1.25) stop ("GMR must be within 0.8-1.25.")
if (nmax <= n1) stop ("nmax <= n1 does not make sense.")
if (target <= 0.5 | target >= 1) stop ("target ", target, " does not make sense.")
if (calls <= 1) stop ("calls must be at > 1 (at least 5 recommended).")
swR <- sqrt(log(CVwR^2 + 1))
n2 <- integer(calls)
params <- list(GMR = GMR, sig_level = adj, nmax = nmax, target_power = target)
ncores <- NULL
if (!nchar(system.file(package = "parallel")) == 0) {
cores <- parallel::detectCores()
ncores <- cores - 1
}
txt <- paste("Sample size re-estimation by package ‘adaptIVPT’;",
"\nonly the 2x2x4 full replicate design is supported!",
sprintf("\nadjusted alpha : %.4f", adj),
sprintf("\nn1 : %2.0f", n1),
"\nfutility on N : nmax),
sprintf("\nCVwR : %.4f (observed)", CVwR),
sprintf("\nswR : %.4f (observed)", swR),
sprintf("\nGMR : %.4f (fixed)", GMR),
sprintf("\ntarget power : %.4f (fixed)", target))
# workaround because of random seeds (irreproducible results)
for (j in seq_along(n2)) {
res <- unlist(rss(n = n1, r = 2, S_WR = swR, params = params,
ncores = ncores)) # r must be 2
N <- up2even(res[["rss"]]) # might be odd
n2[j] <- N - n1
}
if (inter) { # show intermediate n2 estimates
ns <- data.frame(n2 = sort(unique(n2)), times = NA_integer_)
for (j in 1:nrow(ns)) {
ns$times[j] <- length(which(n2 == ns$n2[j]))
}
print(ns, row.names = FALSE); cat("\n")
}
n2 <- max(n2)
N <- n1 + n2
if (N == nmax) {
txt <- paste(txt, sprintf("\nN : %2.0f >= nmax (%.0f)",
N, nmax), "\nStudy stopped in stage 1 for futility.\n")
} else {
if (N > n1) {
txt <- paste(txt, "\nStage 2 initiated (insufficent power in stage 1)",
sprintf("\nn2 : %2.0f", n2),
sprintf("\nN : %2.0f", N))
if (N < 24) {
txt <- paste0(txt, "; less than the FDA’s minimum of 24 subjects!\n")
} else {
txt <- paste0(txt, "\n")
}
} else {
txt <- paste(txt, "\nStudy stopped in stage 1 (sufficient power)\n")
n2 <- N <- NA
}
}
if (details) { # output to the console
cat(txt)
} else { # data.frame of results
result <- data.frame(alpha.adj = adj, n1 = n1, CVwR = CVwR, GMR = GMR,
target = target, n2 = n2, N = N, futile = FALSE)
if (N == nmax) result$futile <- TRUE
return(result)
}
}
Test of reproducibility
suppressMessages(library(PowerTOST))
suppressMessages(library(adaptIVPT))
swR <- CV2se(0.3)
adj <- 0.0294
calls <- 2500L # time consuming
params <- list(sig_level = adj)
ncores <- NULL
if (!nchar(system.file(package = "parallel")) == 0) {
cores <- parallel::detectCores()
ncores <- cores - 1
}
comp <- data.frame(n2.rss = rep(NA_integer_, calls), rt.rss.1 = NA_real_,
rt.rss.2 = NA_real_, n2.PT = NA_integer_, rt.PT = NA_real_)
# target_power = 0.8 is the default in both functions
for (j in 1L:calls) {
pt <- proc.time()[[3]]
res <- unlist(rss(n = 16, r = 2, S_WR = swR,
params = params, ncores = ncores))
comp$rt.rss.2[j] <- proc.time()[[3]] - pt # measured runtime
comp$rt.rss.1[j] <- res[8] # reported runtime
N <- res[["rss"]] # as reported, no rounding
comp$n2.rss[j] <- N - 16
# We use random seeds in the function as well (not the default);
# however, it will always return balanced sequences (even numbers)
pt <- proc.time()[[3]]
comp$n2.PT[j] <- sampleN.RSABE(alpha = adj, CV = 0.3, theta0 = 0.95,
design = "2x2x4", print = FALSE, details = FALSE,
setseed = FALSE)[["Sample size"]] - 16
comp$rt.PT[j] <- proc.time()[[3]] - pt
}
names(comp)[2:3] <- c("rt.rss.reported", "rt.rss.measured")
ns <- sort(unique(c(as.integer(unlist(comp[1])),
as.integer(unlist(comp[4])))))
vals <- data.frame(n2 = ns, rss = NA_character_, PT = NA_character_)
for (j in seq_along(ns)) {
vals$rss[j] <- sprintf("%5.2f%%", 100 * sum(comp$n2.rss == ns[j]) / calls)
vals$PT[j] <- sprintf("%5.2f%%", 100 * sum(comp$n2.PT == ns[j]) / calls)
}
txt <- paste("\nMean runtimes",
sprintf("%-34s %6.2f", "\n rss() measured / reported",
mean(comp[[3]]) / mean(comp[[2]])),
sprintf("%-34s %6.2f", "\n sampleN.RSABE() / rss() reported",
mean(comp[[5]]) / mean(comp[[2]])),
sprintf("%-34s %6.2f", "\n sampleN.RSABE() / rss() measured",
mean(comp[[5]]) / mean(comp[[3]])), "\n")
summary(comp, digits = 5); print(vals, row.names = FALSE); cat(txt)
Result (if you call the script, yours will be different):
n2.rss rt.rss.reported rt.rss.measured n2.PT rt.PT
Min. :3.000 Min. :0.00000 Min. :0.030000 Min. :4.0000 Min. :0.14000
1st Qu.:4.000 1st Qu.:0.00000 1st Qu.:0.040000 1st Qu.:6.0000 1st Qu.:0.19000
Median :5.000 Median :0.00000 Median :0.050000 Median :6.0000 Median :0.19000
Mean :4.552 Mean :0.00204 Mean :0.044624 Mean :5.9968 Mean :0.19408
3rd Qu.:5.000 3rd Qu.:0.00000 3rd Qu.:0.050000 3rd Qu.:6.0000 3rd Qu.:0.20000
Max. :6.000 Max. :0.02000 Max. :0.070000 Max. :6.0000 Max. :0.24000
n2 rss PT
3 3.32% 0.00%
4 43.68% 0.16%
5 47.48% 0.00%
6 5.52% 99.84%
Mean runtimes
rss() measured / reported 21.87
sampleN.RSABE() / rss() reported 95.14
sampleN.RSABE() / rss() measured 4.35
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz
The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Complete thread:
- Adaptive Design for the FDA’s RSABE? Helmut 2023-12-18 11:20 [Two-Stage / GS Designs]
- Likely it does not work (potentially inflated Type I Error) Helmut 2023-12-19 11:10
- Exploring package adaptIVPT, function rss()Helmut 2023-12-20 13:27
- Extreme test case Helmut 2023-12-24 13:01
- Extreme GMR Naksh 2023-12-25 04:16
- PE outside {0.80, 1.25} not possible Helmut 2023-12-25 10:54
- PE outside {0.80, 1.25} not possible Naksh 2023-12-25 11:42
- Forget rss() Helmut 2023-12-25 13:15
- Forget rss() Naksh 2023-12-26 04:49
- TSD useful at all? Helmut 2023-12-26 12:50
- Forget rss() Naksh 2023-12-26 04:49
- Forget rss() Helmut 2023-12-25 13:15
- PE outside {0.80, 1.25} not possible Naksh 2023-12-25 11:42
- PE outside {0.80, 1.25} not possible Helmut 2023-12-25 10:54
- Extreme GMR Naksh 2023-12-25 04:16
- Extreme test case Helmut 2023-12-24 13:01
- Exploring package adaptIVPT, function rss()Helmut 2023-12-20 13:27
- Likely it does not work (potentially inflated Type I Error) Helmut 2023-12-19 11:10