- First of all, make sure that you use not the simple t-test but the Welch-test. The former is sensitive (specifically: anticonservative) to unequal variances and/or group sizes. Hence, the Welch-test is important even for equal group sizes.
- FDA Section VI.B.d:
‘[…] equal variances should not be assumed.’
- EMA Section 4.1.8:
‘The statistical analysis should take into account sources of variation that can be reasonably assumed to have an effect on the response variable.’
and ‘The precise model to be used for the analysis should be pre-specified in the protocol.’
For the setup in Phoenix/WinNonlin see the online User’s Guide.
If this was not the case, essentially you have two options.
Regulatory acceptance not guaranteed. At the 2nd GBHI conference (September 2016, Rockville) there was a discussion about adding body weight as a covariate in crossover studies in patients because it may change with time. Response of regulators: No (though apparently the FDA was more open to the idea).
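Coming back to the first point: in R the choice between the two tests is a single argument of t.test(). A minimal sketch on simulated (not real) log-transformed data; group sizes, means and standard deviations are assumptions chosen only to get unequal variances and unequal group sizes.

```r
# Simulated log-transformed responses of a parallel design (hypothetical values)
set.seed(123456)
logT <- rnorm(n = 14, mean = log(95),  sd = 0.40) # smaller group, larger variance
logR <- rnorm(n = 22, mean = log(100), sd = 0.20) # larger group, smaller variance

# Simple t-test: pools the variances (assumes they are equal)
student <- t.test(logT, logR, var.equal = TRUE,  conf.level = 0.90)
# Welch-test: no assumption of equal variances, Satterthwaite's degrees of freedom
welch   <- t.test(logT, logR, var.equal = FALSE, conf.level = 0.90)

# Back-transformed 90% confidence intervals of T/R in percent
res <- data.frame(test  = c("Student", "Welch"),
                  df    = unname(c(student$parameter, welch$parameter)),
                  lower = 100 * exp(c(student$conf.int[1], welch$conf.int[1])),
                  upper = 100 * exp(c(student$conf.int[2], welch$conf.int[2])))
print(res, row.names = FALSE)
```

The point estimate is the same in both cases; only the standard error and the degrees of freedom differ. Note that var.equal = FALSE is the default of t.test(), i.e., you get the Welch-test unless you ask for the simple one.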
❝ In general, it is recommended to have balanced between treatment arms.
Correct – even in a crossover. It is a common misconception that period effects mean out because T and R are affected to the same degree. That’s not correct for unbalanced sequences. However, unless the degree of imbalance is extreme, the bias is small.
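A noise-free sketch of why the period effect does not mean out. It looks at the naive mean of the within-subject differences T – R, i.e., an estimator which ignores the sequences; this estimator, the function name and all numbers are assumptions for illustration only.

```r
# Bias of the naive mean of within-subject differences (log scale).
# tau   : true treatment effect T - R (hypothetical)
# p.eff : period effect, period 2 minus period 1 (hypothetical)
naive.bias <- function(n.RT, n.TR, tau = log(0.95), p.eff = 0.10) {
  d.RT <- rep(tau + p.eff, n.RT) # sequence RT: T in period 2, R in period 1
  d.TR <- rep(tau - p.eff, n.TR) # sequence TR: T in period 1, R in period 2
  mean(c(d.RT, d.TR)) - tau      # what is left of the period effect
}

bias <- c(balanced = naive.bias(11, 11), # period effect cancels
          mild     = naive.bias(12, 10), # 2 / 22 of the period effect remains
          extreme  = naive.bias(18,  4)) # 14 / 22 of the period effect remains
print(signif(bias, 4))
```

In this setup the remaining bias is p.eff * (n.RT - n.TR) / (n.RT + n.TR): zero for balanced sequences and small for mild imbalance.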
Edit: The published Two-Stage-Design methods are also correct in the strict sense for balanced sequences only. At the end an R-script where you can try to counteract imbalance by intentionally allocating subjects in the second stage in such a way that in the pooled analysis sequences are as balanced as possible.
Example: Potvin ‘Method B’ (default), 24 subjects dosed in the first stage, 12 eligible in sequence RT and 10 in sequence TR (dropout-rate ≈8.3%), CV 25%, exact sample size re-estimation (default) taking the stage-term in the pooled analysis into account.
TSD(n1 = 24, n1.1 = 12, n1.2 = 10, CV = 25)
TSD-method: 1 (α = 0.0294, GMR = 95%, power = 80%)
Sample size re-estimation: exact
Stage 1
Randomized/dosed subjects : 24
Eligible subjects (drop-outs) : 22 (2)
Eligible subjects in sequences RT|TR : 12|10
Allocation ratio RT/TR : 1:0.8333
Drop-out rate : 8.333%
Interim analysis
Relevant PK metrics’ maximum CV : 25%
Estimated total sample size : 34
Stage 2
Preliminary sample size : 12
Expected drop-out rate : 8.333%
Final sample size (adj. for drop-outs): 14
Randomized subjects in sequences RT|TR: 6|8
Expected eligible subj. in seq. RT|TR : 6|7
Allocation ratio RT/TR : 1:1.167
Pooled data set
Expected eligible subjects : 35
Expected eligible subj. in seq. RT|TR : 18|17
Allocation ratio RT/TR : 1:0.9444 (imbalanced)
The estimated n2 is 12. Assuming that we will see the same dropout-rate as in the first stage, the adjusted n2 is 14. Instead of dosing seven subjects / sequence, we dose six in sequence RT and eight in sequence TR. If the dropout-rate is realized, we get an allocation-ratio of 1:0.9444, which is not that bad.
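The arithmetic of this example can be retraced in base R without any package. A minimal sketch: the estimated n2 of 12 is simply taken from the output above (not re-estimated here), and the helper ceil.even() is a hypothetical name introduced only for this illustration.

```r
n1    <- 24                      # dosed in stage 1
n1.RT <- 12                      # eligible in sequence RT
n1.TR <- 10                      # eligible in sequence TR
n2.p  <- 12                      # estimated n2, taken from the output above

ceil.even <- function(x) 2 * ceiling(x / 2) # round up to the next even number

n1.e  <- n1.RT + n1.TR           # 22 eligible subjects in stage 1
do.r  <- abs((n1.e - n1) / n1)   # dropout-rate of stage 1 (2 / 24)
nt    <- n1.e + n2.p             # estimated total sample size (34)
# eligible subjects still needed per sequence for a balanced pooled data set
need  <- c(RT = nt / 2 - n1.RT, TR = nt / 2 - n1.TR)
# to be dosed in stage 2: adjusted for the expected dropout-rate
dosed <- ceil.even(need / (1 - do.r))
# expected eligible subjects in stage 2 and in the pooled data set
elig2  <- round(dosed * (1 - do.r), 0)
pooled <- c(RT = n1.RT, TR = n1.TR) + elig2
print(rbind(need, dosed, elig2, pooled))
cat("Pooled allocation ratio RT/TR: 1:",
    signif(pooled[["TR"]] / pooled[["RT"]], 4), "\n", sep = "")
```

This gives six subjects to be dosed in sequence RT and eight in sequence TR, and 18|17 expected eligible subjects in the pooled data set, in agreement with the output above.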
library(Power2Stage) # provides sampleN2.TOST()
TSD <- function(method1 = 1, method2 = 1, n1, n1.1, n1.2, CV, do.2) {
  up2even <- function(n) { # get balanced sequences (round up to next even)
    return(as.integer(2 * (n %/% 2 + as.logical(n %% 2))))
  }
  nadj <- function(n, do.r) { # adjust for dropout-rate
    return(as.integer(up2even(n / (1 - do.r))))
  }
  n1.e <- n1.1 + n1.2                   # stage 1: eligible subjects
  ar1  <- n1.2 / n1.1                   # stage 1: sequence allocation ratio
  do.r <- abs((n1.e - n1) / n1)         # stage 1: drop-out rate
  # anticipated drop-out rate of stage 2 (in percent); apply 1st if not given
  do.2 <- if (missing(do.2)) do.r else do.2 / 100
  CV   <- CV / 100
  if (method1 == 1) {adj <- 0.0294; GMR <- 0.95; pwr <- 0.8}
  if (method1 == 2) {adj <- 0.0280; GMR <- 0.90; pwr <- 0.8}
  if (method1 == 3) {adj <- 0.0284; GMR <- 0.95; pwr <- 0.9}
  if (method1 == 4) {adj <- 0.0274; GMR <- 0.95; pwr <- 0.9}
  if (method1 == 5) {adj <- 0.0269; GMR <- 0.90; pwr <- 0.9}
  if (method2 == 1) me <- "exact"
  if (method2 == 2) me <- "nct"
  if (method2 == 3) me <- "shifted"
  n2.p  <- sampleN2.TOST(alpha = adj, CV = CV, n1 = n1.e, theta0 = GMR,
                         targetpower = pwr, method = me)[["Sample size"]]
  nt    <- n1.e + n2.p                  # preliminary total sample size
  n2.1  <- nadj(nt/2-n1.1, do.2)        # adjust for drop-outs
  n2.2  <- nadj(nt/2-n1.2, do.2)        # adjust for drop-outs
  n2    <- n2.1+n2.2                    # dosed in stage 2
  n2.1e <- round(n2.1*(1-do.2), 0)      # stage 2: expected elig. subjects in seq. 1
  n2.2e <- round(n2.2*(1-do.2), 0)      # stage 2: expected elig. subjects in seq. 2
  n2.e  <- n2.1e+n2.2e                  # stage 2: expected elig. subjects
  ar2   <- n2.2e/n2.1e                  # stage 2: sequence allocation ratio
  ar    <- (n1.2+n2.2e)/(n1.1+n2.1e)    # pooled data’s allocation ratio
  bal   <- if (ar == 1) "(balanced)" else "(imbalanced)"
  sep   <- paste(paste0(rep("\u2500", 50), collapse=""), "\n")
  if(method2 > 1) me <- c(me, "t-distribution")
  cat("\n TSD-method:", method1,
      paste0("(\u03b1 = ", adj, ", GMR = ", 100*GMR, "%, power = ", 100*pwr, "%)\n"),
      "Sample size re-estimation:", me, "\n", sep,
      "Stage 1\n", sep,
      "Randomized/dosed subjects :", n1, "\n",
      "Eligible subjects (drop-outs) :", n1.e, paste0("(", n1-n1.e,")"), "\n",
      "Eligible subjects in sequences RT|TR :", paste0(n1.1, "|", n1.2), "\n",
      "Allocation ratio RT/TR :", paste0("1:", signif(ar1, 4)), "\n",
      "Drop-out rate :", paste0(signif(100*do.r, 4), "%\n"), sep,
      "Interim analysis\n", sep,
      "Relevant PK metrics’ maximum CV :", paste0(signif(100*CV, 4), "%\n"),
      "Estimated total sample size :", as.numeric(nt), "\n", sep,
      "Stage 2\n", sep,
      "Preliminary sample size :", n2.p, "\n",
      "Expected drop-out rate :", paste0(signif(100*do.2, 4),"%\n"),
      "Final sample size (adj. for drop-outs):", n2, "\n",
      "Randomized subjects in sequences RT|TR:", paste0(n2.1, "|", n2.2), "\n",
      "Expected eligible subj. in seq. RT|TR :", paste0(n2.1e, "|", n2.2e), "\n",
      "Allocation ratio RT/TR :", paste0("1:", signif(ar2, 4)), "\n", sep,
      "Pooled data set\n", sep,
      "Expected eligible subjects :", n1.e+n2.e, "\n",
      "Expected eligible subj. in seq. RT|TR :", paste0(n1.1+n2.1e, "|", n1.2+n2.2e), "\n",
      "Allocation ratio RT/TR :", paste0("1:", signif(ar, 4)), bal, "\n\n")
}
method1 <- 1 # select from TSD-Methods
# GMR% power% #
# 1 Potvin et al. (2008) Methods B/C: 95 80 #
# 2 Montague et al. (2011) Method D: 90 80 #
# 3 Fuglsang (2013) Method B: 95 90 #
# 4 Fuglsang (2013) Method C1/D1: 95 90 #
# 5 Fuglsang (2013) Method C2/D2: 90 90 #
method2 <- 1 # select from power-estimation Methods
# 3 shifted t-distribution: good #
# 2 noncentral t-distribution: better #
# 1 exact (Owen’s Q-function): best #
