ANOVA acc. to GL [General Statistics]
❝ I meant studies in which volunteers were divided by two groups for the capacity reasons. (Formally this can be true even for studies with 12 subjects if the clinical site is tiny.)
Might also happen in mid-range CROs with drugs which require continuous cardiac monitoring. 12–16 beds is not unusual.
Nevertheless, I don’t understand why Belarusian experts ask for a mixed model, given that the guideline recommends ANOVA:
88. The comparison of the pharmacokinetic parameters under study is performed by means of analysis of variance (ANOVA).
Fixed effects are specifically recommended (taking into account effects which can affect the response):
89. The statistical analysis should take into account sources of variability that can affect the variable under study. In such an ANOVA model it is customary to use factors such as sequence, subject within sequence, period, and drug product. Fixed, rather than random, effects should be used for all of these factors.
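For illustration, a minimal sketch of such an all-fixed-effects ANOVA in base R. The data are simulated and every name and value in it (N, CV, the between-subject SD, the column names) is an assumption made for the example, not something taken from a real study or from the guideline.

```r
# Sketch only: simulated 2x2x2 crossover, all effects fixed.
set.seed(123456)
N    <- 24                                   # assumed number of subjects
CV   <- 0.275                                # assumed within-subject CV
subj <- rep(seq_len(N), each = 2)            # two periods per subject
dat  <- data.frame(subject  = factor(subj),
                   sequence = factor(ifelse(subj <= N / 2, "TR", "RT")),
                   period   = factor(rep(1:2, times = N)))
# treatment follows from sequence and period
first  <- substr(as.character(dat$sequence), 1, 1)
second <- substr(as.character(dat$sequence), 2, 2)
dat$treatment <- factor(ifelse(dat$period == "1", first, second),
                        levels = c("R", "T"))
# log-scale response: subject effect + treatment effect + residual error
dat$logPK <- log(100) + rnorm(N, sd = 0.5)[subj] +
             log(0.95) * (dat$treatment == "T") +
             rnorm(2 * N, sd = sqrt(log(CV^2 + 1)))
# sequence, subject within sequence, period, treatment: all fixed
m  <- lm(logPK ~ sequence + subject %in% sequence + period + treatment,
         data = dat)
pe <- 100 * exp(coef(m)[["treatmentT"]])                      # point estimate (%)
ci <- 100 * exp(confint(m, "treatmentT", level = 0.90))       # 90% CI (%)
cat(sprintf("PE %.2f%%, 90%% CI %.2f%% to %.2f%%, df = %d\n",
            pe, ci[1], ci[2], df.residual(m)))
```

The residual degrees of freedom of this model are N - 2, which is what the pooled analysis relies on.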
We discussed that ad nauseam. At least in the EU no regulatory statistician assumes that groups have an impact and expects a group-model. Pooled analysis rulez (see above).
However, why all the fuss? Let’s simulate 100,000 studies:
library(PowerTOST)
ngrp <- function(capacity, n) {
  # split sample size into >=2 groups based on capacity
  if (n <= capacity) { # make equal groups
    ngrp <- rep(ceiling(n/2), 2)
  } else {             # at least one = capacity
    ngrp    <- rep(0, ceiling(n / capacity))
    grps    <- length(ngrp)
    ngrp[1] <- capacity
    for (j in 2:grps) {
      n.tot <- sum(ngrp) # what we have so far
      if (n.tot + capacity <= n) {
        ngrp[j] <- capacity
      } else {
        ngrp[j] <- n - n.tot
      }
    }
  }
  return(list(grps = length(ngrp), ngrp = ngrp))
}
CV <- 0.275
target <- 0.90
theta0 <- 0.95
design <- "2x2x2"
capacity <- 12
res <- data.frame(N = NA, groups = NA, n.group = NA,
                  pwr.1 = NA, pwr.2 = NA, change = NA)
x   <- sampleN.TOST(CV = CV, theta0 = theta0, targetpower = target,
                    design = design, print = FALSE, details = FALSE)
res$N <- x[["Sample size"]]
res$pwr.1 <- x[["Achieved power"]]
x <- ngrp(capacity = capacity, n = res$N)
res$groups <- x[["grps"]]
n.group <- x[["ngrp"]]
res$pwr.2 <- power.TOST.sds(CV = CV, n = res$N, grps = res$groups,
                            ngrp = n.group, gmodel = 2, progress = FALSE)
res$n.group <- paste(n.group, collapse = "|")
res$change <- 100*(res$pwr.2 - res$pwr.1)/res$pwr.1
res[4:6] <- signif(res[4:6], 4)
names(res)[3:5] <- c("n / group", "1", "2")
if (res$change < 0) {
  names(res)[6] <- "loss (%)"
} else {
  names(res)[6] <- "gain (%)"
}
res[6] <- abs(res[6])
txt <- paste0("\nCV = ", CV, ", theta0 = ", theta0, ", targetpower = ",
              target, ",\ndesign = ", design, "; all effects fixed",
              "\npower: 1 = pooled data, 2 = group model\n\n")
cat(txt); print(res, row.names = FALSE)
CV = 0.275, theta0 = 0.95, targetpower = 0.9,
design = 2x2x2; all effects fixed
power: 1 = pooled data, 2 = group model
N groups n / group 1 2 loss (%)
44 4 12|12|12|8 0.9006 0.9004 0.02293
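The splitting rule used above (fill groups up to the capacity, put the remainder into the last one) can also be sketched without a loop. The helper name split_groups() is hypothetical and not part of PowerTOST; it is only meant to make the rule explicit.

```r
# Hypothetical helper, not part of PowerTOST: same splitting rule, vectorized.
split_groups <- function(capacity, n) {
  if (n <= capacity) {
    # everything would fit into one group: split into two (nearly) equal ones
    sizes <- c(ceiling(n / 2), floor(n / 2))
  } else {
    full  <- n %/% capacity          # number of completely filled groups
    rest  <- n %% capacity           # subjects left for the last group
    sizes <- c(rep(capacity, full), if (rest > 0) rest)
  }
  list(grps = length(sizes), ngrp = sizes)
}
x <- split_groups(capacity = 12, n = 44)
cat(x$grps, "groups:", paste(x$ngrp, collapse = "|"), "\n")
```

For the study above (N = 44, capacity 12) this gives the same four groups as in the table.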
But you are right that in rare borderline cases a study passing with the pooled analysis might fail with the group model due to the lower degrees of freedom.
\(df\text{(pooled model)}=N-2\)
\(df\text{(group model)}=\sum_{i=1}^{i=groups}n_i-(groups-1)-2\)
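As a worked example, take the study simulated above (N = 44 in four groups of 12|12|12|8):
\(df\text{(pooled model)}=44-2=42\)
\(df\text{(group model)}=(12+12+12+8)-(4-1)-2=39\)
Three degrees of freedom are lost, one for each group beyond the first.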
Of course, the larger the sample size and the smaller the number of groups, the smaller the impact will be.
CV <- 0.275
theta0 <- 0.95
capacity <- 12
var <- CV2mse(CV)
N <- seq(24, 48, 2)
res <- data.frame(N = N, df.p = N - 2, tcrit.p = qt(1-0.05, N-2),
                  CI.p = NA, groups = NA, n.group = NA,
                  df.g = NA, tcrit.g = NA, CI.g = NA)
for (j in seq_along(N)) {
  CI <- round(100*exp(log(theta0) + c(-1, +1) *
              res$tcrit.p[j] * sqrt(var/N[j])), 2)
  res$CI.p[j]    <- sprintf("%6.2f\u2013%6.2f%%", CI[1], CI[2])
  x              <- ngrp(capacity = capacity, n = N[j])
  res$groups[j]  <- x[["grps"]]
  n.group        <- x[["ngrp"]]
  res$df.g[j]    <- N[j] - (res$groups[j] - 1) - 2
  res$tcrit.g[j] <- qt(1-0.05, res$df.g[j])
  res$n.group[j] <- paste(n.group, collapse = "|")
  CI <- round(100*exp(log(theta0) + c(-1, +1) *
              res$tcrit.g[j] * sqrt(var/N[j])), 2)
  res$CI.g[j]    <- sprintf("%6.2f\u2013%6.2f%%", CI[1], CI[2])
}
res[, c(3, 8)] <- signif(res[, c(3, 8)], 4)
names(res)[c(4, 9)] <- c("90% CI (pooled)", "90% CI (groups)")
print(res, row.names = FALSE)
N df.p tcrit.p 90% CI (pooled) groups n.group df.g tcrit.g 90% CI (groups)
24 22 1.717 86.42–104.43% 2 12|12 21 1.721 86.40–104.45%
26 24 1.711 86.77–104.01% 3 12|12|2 22 1.717 86.74–104.04%
28 26 1.706 87.08–103.64% 3 12|12|4 24 1.711 87.06–103.67%
30 28 1.701 87.36–103.31% 3 12|12|6 26 1.706 87.34–103.33%
32 30 1.697 87.61–103.02% 3 12|12|8 28 1.701 87.59–103.04%
34 32 1.694 87.83–102.75% 3 12|12|10 30 1.697 87.82–102.77%
36 34 1.691 88.04–102.51% 3 12|12|12 32 1.694 88.03–102.52%
38 36 1.688 88.23–102.29% 4 12|12|12|2 33 1.692 88.21–102.31%
40 38 1.686 88.40–102.09% 4 12|12|12|4 35 1.690 88.39–102.11%
42 40 1.684 88.56–101.90% 4 12|12|12|6 37 1.687 88.55–101.92%
44 42 1.682 88.71–101.73% 4 12|12|12|8 39 1.685 88.70–101.74%
46 44 1.680 88.85–101.57% 4 12|12|12|10 41 1.683 88.84–101.58%
48 46 1.679 88.98–101.42% 4 12|12|12|12 43 1.681 88.98–101.43%
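The same with a capacity of 24 and theta0 = 0.87 (the settings implied by the group sizes and the interval midpoints below):
N df.p tcrit.p 90% CI (pooled) groups n.group df.g tcrit.g 90% CI (groups)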
24 22 1.717 79.14– 95.64% 2 12|12 21 1.721 79.13– 95.65%
26 24 1.711 79.46– 95.25% 2 24|2 23 1.714 79.45– 95.26%
28 26 1.706 79.75– 94.91% 2 24|4 25 1.708 79.74– 94.92%
30 28 1.701 80.00– 94.61% 2 24|6 27 1.703 79.99– 94.62%
32 30 1.697 80.23– 94.34% 2 24|8 29 1.699 80.22– 94.35%
34 32 1.694 80.44– 94.10% 2 24|10 31 1.696 80.43– 94.11%
36 34 1.691 80.63– 93.88% 2 24|12 33 1.692 80.62– 93.88%
38 36 1.688 80.80– 93.68% 2 24|14 35 1.690 80.79– 93.68%
40 38 1.686 80.96– 93.49% 2 24|16 37 1.687 80.95– 93.50%
42 40 1.684 81.11– 93.32% 2 24|18 39 1.685 81.10– 93.33%
44 42 1.682 81.24– 93.16% 2 24|20 41 1.683 81.24– 93.17%
46 44 1.680 81.37– 93.02% 2 24|22 43 1.681 81.37– 93.02%
48 46 1.679 81.49– 92.88% 2 24|24 45 1.679 81.49– 92.88%
❝ Multicentral studies is the other case.
Correct. I would always recommend including site terms in the model.
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz
Complete thread:
- Group by sequence interaction Mutasim 2019-08-07 13:16 [General Statistics]
- Group by sequence interaction, an urban myth? Helmut 2019-08-07 14:35
- pristine, genuine, holy, magnificent, inexplicable beautiful variation ElMaestro 2019-08-08 10:06
- I love your subject line! Helmut 2019-08-08 10:31
- Group effect, did you miss it? Astea 2019-12-21 14:12
- Group effect, did you miss it? PharmCat 2019-12-22 01:37
- Group effect: the endless story Helmut 2019-12-22 10:30
- Group effect: the endless river Astea 2019-12-22 21:23
- Group effect: the endless river PharmCat 2019-12-22 22:52
- replicateBE solution with interactions mittyri 2019-12-23 14:30
- replicateBE solution with interactions Astea 2019-12-23 17:26
- datasets issues mittyri 2019-12-24 10:53
- datasets Helmut 2019-12-24 11:03
- datasets Astea 2019-12-24 19:18
- lmer / lme Helmut 2019-12-25 19:12
- What do you mean exactly? Beholder 2019-12-27 13:44
- What do you mean exactly? Astea 2019-12-29 21:59
- ANOVA acc. to GL Helmut 2019-12-30 12:32