## Parallel (replicate?) [Two-Stage / GS Designs]

❝ I feel your pain…

❝ Any example how two stage adaptive design is applied/validated for parallel study or replicate design?

Fuglsang validated the parallel-group analogues of Potvin's methods only for the published conditions (*n*~1~ 48–120; GMR 0.95 and 80% power).

However, if you want something else, you need your own simulations to find a suitable adjusted *α*. Optionally you can also explore a futility criterion for the maximum total sample size. Not complicated in the R package `Power2Stage`.

Hint: In all TSDs the maximum inflation of the Type I Error occurs at a combination of low CV and small *n*~1~. Therefore, explore this area first. Once you have found a suitable adjusted *α*, simulate power and the empiric Type I Error for the entire grid. Regulators will ask you for that.

For parallel TSDs there are two functions in `Power2Stage`, namely `power.tsd.pAF()` and `power.tsd.p()`:

- `power.tsd.pAF()` performs exactly as described in Fuglsang's paper: the power-monitoring steps and the sample-size estimation are always based on the pooled *t*-test.
- `power.tsd.p()` with the argument `test = "welch"` on the other hand uses the genuine power of Welch's test. Moreover, it accepts unequal treatment groups in stage 1.
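Side by side, the two calls might look like this (a sketch only: `n1 = 48`, `GMR = 0.95`, and `CV = 0.3` are illustrative values within Fuglsang's grid, and `power.tsd.pAF()` is assumed to default to method B with α 0.0294 — check `?power.tsd.pAF` before relying on the defaults):

```r
library(Power2Stage)

# Fuglsang's original approach: power monitoring and sample-size
# estimation based on the pooled t-test (illustrative scenario)
pAF <- power.tsd.pAF(n1 = 48, GMR = 0.95, CV = 0.3)

# Analogue using the genuine power of Welch's test; n1 may also be
# given as a vector of two unequal group sizes, e.g., c(25, 23)
pW  <- power.tsd.p(method = "B", alpha = rep(0.0294, 2), n1 = 48,
                   GMR = 0.95, CV = 0.3, test = "welch")

c(pooled = pAF$pBE, welch = pW$pBE) # compare the empiric power
```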

❝ In case of parallel design what kind of statistical factor we need to use Potvin …

Anders used *α* = 0.0294 for the analogues of Potvin's methods B and C. As usual, there is a slight inflation of the Type I Error in method C with CV ≤ 20% – which is unlikely in parallel designs anyway. Evaluation is by the Welch–Satterthwaite test (for unequal variances and group sizes).

If someone knows what might be meant in the ICH M13A’s Section 2.2.3.4 …

*The use of stratification in the randomisation procedure based on a limited number of known relevant factors is therefore recommended. Those factors are also recommended to be accounted for* […]

❝ … or Bonferroni …

I think (‼) that it will be an acceptable alternative because it is the most conservative one (strictly speaking, it is not correct in a TSD because the hypotheses are not independent).

Assessors love *Signore Bonferroni*.
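For two stages, Bonferroni simply splits the nominal α, i.e., each stage is evaluated with a wider confidence interval. A quick sanity check (plain arithmetic, nothing package-specific):

```r
alpha  <- 0.05                # nominal (one-sided) alpha of TOST
stages <- 2
adj    <- alpha / stages      # Bonferroni-adjusted alpha per stage: 0.025
CL     <- 100 * (1 - 2 * adj) # CI level in each stage: 95 (instead of 90)
adj; CL
```

Note that 0.025 is more conservative than Fuglsang's 0.0294 – hence the popularity with assessors.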

❝ … and same for 4-period or 3-period replicate design?

If you mean reference-scaling, no idea. You can try Bonferroni as well. Recently something was published by the FDA, but it is wacky (see this post for why I think so). I'm not convinced that it is worth the effort.

Plan the study for the assumed *CV*~wR~ (and the *CV*~wT~ if you have the information). In reference-scaling the observed *CV*~wR~ is taken into account anyway. If the variability is higher than assumed, you can scale more and will gain power. If it is lower than assumed, bad luck. However, the crucial point is – as always – the GMR…

If you mean by ‘3-period replicate design’ the *partial* replicate (TRR|RTR|RRT) and want to use the FDA's RSABE, please don't (see this article for why). It is fine for the EMA's ABEL. If you want a 3-period replicate for the FDA, please opt for one of the *full* replicates (TRT|RTR, TTR|RRT, or TRR|RTT). Otherwise, you might be in deep shit.
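Planning such a study for an assumed *CV*~wR~ can be sketched with `PowerTOST` (a sketch under assumptions: `sampleN.scABEL()` for the EMA's ABEL, a TRT|RTR full replicate, and illustrative values of the CV and GMR):

```r
library(PowerTOST)

# ABEL (EMA): sample size for a 3-period full replicate (TRT|RTR),
# assuming CVwT = CVwR = 0.45 and a GMR of 0.90 (illustrative values)
sampleN.scABEL(CV = 0.45, theta0 = 0.90, targetpower = 0.80,
               design = "2x2x3", details = FALSE)
```

Since a CV of 45% is above the 30% scaling threshold, the BE limits are widened accordingly.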

```r
library(PowerTOST)
library(Power2Stage)
# start with a fixed sample design
CV     <- 0.2 # max. inflation of the Type I Error with small CV
GMR    <- 0.9 # realistic with large CV in parallel designs
target <- 0.9 # well…
n  <- sampleN.TOST(CV = CV, theta0 = GMR, targetpower = target,
                   design = "parallel", print = FALSE)[["Sample size"]]
# first stage n1
n1 <- n / 2
# assess the empiric Type I Error at one of the BE-limits (under the Null)
# always one million simulations (very time consuming…)
# try a range of adjusted alphas
alpha <- seq(0.0292, 0.0306, 0.0001)
# upper significance limit of the binomial test for one million simulations
sig   <- binom.test(0.05 * 1e6, 1e6, alternative = "less",
                    conf.level = 0.95)$conf.int[2]
res   <- data.frame(alpha = alpha, TIE = NA_real_, TIE.05 = FALSE,
                    signif = FALSE, TIE.052 = FALSE)
# TIE.05  checks whether the TIE > 0.05
# signif  checks whether the TIE > the limit of the binomial test
# TIE.052 checks whether the TIE > 0.052 (Potvin’s acceptable inflation)
pb <- txtProgressBar(style = 3)
for (j in seq_along(alpha)) {
  res$TIE[j] <- power.tsd.p(method = "B", alpha = rep(alpha[j], 2), n1 = n1,
                            GMR = GMR, CV = CV, targetpower = target,
                            test = "welch", theta0 = 1.25, nsims = 1e6)$pBE
  if (res$TIE[j] > 0.05)  res$TIE.05[j]  <- TRUE
  if (res$TIE[j] > sig)   res$signif[j]  <- TRUE
  if (res$TIE[j] > 0.052) res$TIE.052[j] <- TRUE
  setTxtProgressBar(pb, j / length(alpha))
}
close(pb)
wary <- which(res$TIE.05 == TRUE & res$TIE.052 == FALSE) # belt plus suspenders (EMA?)
res  <- res[(head(wary, 1) - 1):(tail(wary, 1) + 1), ]   # drop some alphas
names(res)[3:5] <- c(">0.05", "* >0.05", ">0.052")       # cosmetics
print(res, row.names = FALSE)
```

```
  alpha      TIE >0.05 * >0.05 >0.052
 0.0293 0.049518 FALSE  FALSE  FALSE
 0.0294 0.050004  TRUE  FALSE  FALSE
 0.0295 0.050178  TRUE  FALSE  FALSE
 0.0296 0.050182  TRUE  FALSE  FALSE
 0.0297 0.050486  TRUE   TRUE  FALSE
 0.0298 0.050777  TRUE   TRUE  FALSE
 0.0299 0.050772  TRUE   TRUE  FALSE
 0.0300 0.050806  TRUE   TRUE  FALSE
 0.0301 0.050974  TRUE   TRUE  FALSE
 0.0302 0.050890  TRUE   TRUE  FALSE
 0.0303 0.051308  TRUE   TRUE  FALSE
 0.0304 0.051535  TRUE   TRUE  FALSE
 0.0305 0.051616  TRUE   TRUE  FALSE
 0.0306 0.052007  TRUE   TRUE   TRUE
```

Hence, you could use an adjusted *α* of `0.0296` (Type I Error `0.050182 < 0.050360`, the significance limit of the binomial test). If you are a disciple of Madame Potvin, even `0.0305` would be OK (`0.051616 < 0.052`). Say, you opted for belt plus suspenders with `0.0293` (`0.049518 < 0.05`), planned the first stage with 300 subjects, and observed a CV of 40%. You had some dropouts (15 in one group and 20 in the other). Therefore, instead of `n1 = 300`, specify `n1 = c(135, 130)`. What can you expect?

```r
power.tsd.p(method = "B", alpha = rep(0.0293, 2), n1 = c(135, 130),
            GMR = 0.9, CV = 0.4, targetpower = 0.9,
            npct = c(0.05, 0.25, 0.5, 0.75, 0.95))
```

```
TSD with 2 parallel groups
Method B: alpha (s1/s2) = 0.0293 0.0293
CIs based on Welch's t-test
Target power in power monitoring and sample size est. = 0.9
Power calculation via non-central t approx.
CV1 and GMR = 0.9 in sample size est. used
No futility criterion
BE acceptance range = 0.8 ... 1.25

CV = 0.4; ntot(stage 1) = 265 (nT, nR = 135, 130); GMR = 0.9
1e+05 sims at theta0 = 0.9 (p(BE) = 'power').
p(BE)    = 0.91405
p(BE) s1 = 0.72275
Studies in stage 2 = 27.73%

Distribution of n(total)
- mean (range) = 312.8 (265 ... 628)
- percentiles
 5% 25% 50% 75% 95%
265 265 265 390 472
```

The output states ‘No futility criterion’. However, in this method you can specify one. Say, you don’t want more than 450 subjects:

```r
power.tsd.p(method = "B", alpha = rep(0.0293, 2), n1 = c(135, 130),
            GMR = 0.9, CV = 0.4, targetpower = 0.9,
            npct = c(0.05, 0.25, 0.5, 0.75, 0.95), Nmax = 450)
```

```
TSD with 2 parallel groups
Method B: alpha (s1/s2) = 0.0293 0.0293
CIs based on Welch's t-test
Target power in power monitoring and sample size est. = 0.9
Power calculation via non-central t approx.
CV1 and GMR = 0.9 in sample size est. used
Futility criterion Nmax = 450
BE acceptance range = 0.8 ... 1.25

CV = 0.4; ntot(stage 1) = 265 (nT, nR = 135, 130); GMR = 0.9
1e+05 sims at theta0 = 0.9 (p(BE) = 'power').
p(BE)    = 0.83875
p(BE) s1 = 0.72275
Studies in stage 2 = 17.91%

Distribution of n(total)
- mean (range) = 292 (265 ... 450)
- percentiles
 5% 25% 50% 75% 95%
265 265 265 265 434
```

Let’s now compare the empiric Type I Errors of both setups.

```r
sig  <- binom.test(0.05 * 1e6, 1e6, alternative = "less",
                   conf.level = 0.95)$conf.int[2]
comp <- data.frame(study = c("no futility", "with futility"),
                   TIE = NA_real_, TIE.05 = FALSE,
                   signif = FALSE, TIE.052 = FALSE)
for (j in 1:2) {
  if (comp$study[j] == "no futility") {
    comp$TIE[j] <- power.tsd.p(method = "B", alpha = rep(0.0293, 2),
                               n1 = c(135, 130), GMR = 0.9, CV = 0.4,
                               targetpower = 0.9, test = "welch",
                               theta0 = 1.25, nsims = 1e6)$pBE
  } else {
    comp$TIE[j] <- power.tsd.p(method = "B", alpha = rep(0.0293, 2),
                               n1 = c(135, 130), GMR = 0.9, CV = 0.4,
                               targetpower = 0.9, test = "welch",
                               theta0 = 1.25, nsims = 1e6, Nmax = 450)$pBE
  }
  if (comp$TIE[j] > 0.05)  comp$TIE.05[j]  <- TRUE
  if (comp$TIE[j] > sig)   comp$signif[j]  <- TRUE
  if (comp$TIE[j] > 0.052) comp$TIE.052[j] <- TRUE
}
names(comp)[3:5] <- c(">0.05", "* >0.05", ">0.052")
print(comp, row.names = FALSE)
```

```
        study      TIE >0.05 * >0.05 >0.052
  no futility 0.045936 FALSE  FALSE  FALSE
with futility 0.040638 FALSE  FALSE  FALSE
```

A caveat: Actually it is not *that* simple. In practice you have to repeat this exercise for a range of unequal variances and group sizes in the first stage. It might be that you have to adjust more, based on the worst-case combination. I did that some time ago. Took me a week, four simultaneous R sessions, CPU load close to 90%…

*Dif-tor heh smusma* 🖖🏼 Long live Ukraine!

Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮

Science Quotes

### Complete thread:

- A rant Helmut 2024-08-26 12:52 [Two-Stage / GS Designs]
  - A rant Achievwin 2024-08-28 02:16
    - Parallel (replicate?) Helmut 2024-08-28 10:54