## ANOVA acc. to GL [General Statistics]

Hi Nastia,

» I meant studies in which volunteers were divided by two groups for the capacity reasons. (Formally this can be true even for studies with 12 subjects if the clinical site is tiny.)

Might also happen in mid-range CROs with drugs which require continuous cardiac monitoring. 12–16 beds is not unusual.

Nevertheless, I don’t understand why Belorussian experts ask for a mixed model because the guideline recommends ANOVA:

88. Comparison of the pharmacokinetic parameters under study is performed by means of analysis of variance (ANOVA).

Fixed effects are specifically recommended (taking into account effects which can affect the response):

89. The statistical analysis should take into account sources of variability that can affect the variable under study. In such an ANOVA model it is customary to use factors such as sequence, subject within sequence, period, and drug product. Fixed rather than random effects should be used for all of these factors.

We discussed that ad nauseam. At least in the EU no regulatory statistician assumes that groups have an impact and expects a group model. Pooled analysis rulez (see above).
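To make the all-fixed-effects ANOVA of the guideline concrete, here is a minimal sketch in base R on simulated data. The data set, the seed, the between-subject SD and the column names (`subject`, `sequence`, `period`, `treatment`, `logPK`) are my assumptions for illustration, not anything taken from a real study.

```r
# Sketch only: simulated 2x2x2 crossover, all effects fixed (assumed setup)
set.seed(123456)                        # arbitrary seed, reproducibility only
N   <- 24                               # hypothetical number of subjects
CV  <- 0.275                            # within-subject CV as in the post
sw  <- sqrt(log(CV^2 + 1))              # within-subject SD on the log scale
dta <- data.frame(subject  = factor(rep(1:N, each = 2)),
                  sequence = factor(rep(c("TR", "RT"), each = N)),
                  period   = factor(rep(1:2, times = N)))
# treatment follows from the sequence and the period
pos           <- as.integer(dta$period)
dta$treatment <- factor(substr(as.character(dta$sequence), pos, pos),
                        levels = c("R", "T"))
eta           <- rnorm(N, sd = 0.5)     # between-subject effects (assumed SD)
dta$logPK     <- log(100) + eta[as.integer(dta$subject)] +
                 log(0.95) * (dta$treatment == "T") +
                 rnorm(2 * N, sd = sw)
# sequence, subject (nested in sequence), period, treatment: all fixed;
# lm() reports NA for the subject dummies aliased with sequence
mod <- lm(logPK ~ sequence + subject + period + treatment, data = dta)
ci  <- 100 * exp(confint(mod, "treatmentT", level = 0.90))
cat("residual df:", df.residual(mod), "\n")
print(round(ci, 2))                     # 90% CI of T/R in percent
```

The residual degrees of freedom come out as N - 2, which is the pooled-model value used further down.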

However, why all that fuss? Let’s simulate 100,000 studies:

```r
library(PowerTOST)
ngrp <- function(capacity, n) {
  # split sample size into >= 2 groups based on capacity
  if (n <= capacity) { # make equal groups
    ngrp <- rep(ceiling(n / 2), 2)
  } else {             # at least one = capacity
    ngrp    <- rep(0, ceiling(n / capacity))
    grps    <- length(ngrp)
    ngrp[1] <- capacity
    for (j in 2:grps) {
      n.tot <- sum(ngrp) # what we have so far
      if (n.tot + capacity <= n) {
        ngrp[j] <- capacity
      } else {
        ngrp[j] <- n - n.tot
      }
    }
  }
  return(list(grps = length(ngrp), ngrp = ngrp))
}
CV            <- 0.275
target        <- 0.90
theta0        <- 0.95
design        <- "2x2x2"
capacity      <- 12
res           <- data.frame(N = NA, groups = NA, n.group = NA,
                            pwr.1 = NA, pwr.2 = NA, change = NA)
# sample size and power of the pooled analysis
x             <- sampleN.TOST(CV = CV, theta0 = theta0, targetpower = target,
                              design = design, print = FALSE, details = FALSE)
res$N         <- x[["Sample size"]]
res$pwr.1     <- x[["Achieved power"]]
x             <- ngrp(capacity = capacity, n = res$N)
res$groups    <- x[["grps"]]
n.group       <- x[["ngrp"]]
# simulated power of the group model
res$pwr.2     <- power.TOST.sds(CV = CV, theta0 = theta0, n = res$N,
                                grps = res$groups, ngrp = n.group,
                                gmodel = 2, progress = FALSE)
res$n.group   <- paste(n.group, collapse = "|")
res$change    <- 100 * (res$pwr.2 - res$pwr.1) / res$pwr.1
res[4:6]      <- signif(res[4:6], 4)
names(res)[3:5] <- c("n / group", "1", "2")
if (res$change < 0) {
  names(res)[6] <- "loss (%)"
} else {
  names(res)[6] <- "gain (%)"
}
res[6]        <- abs(res[6])
txt <- paste0("\nCV = ", CV, ", theta0 = ", theta0,
              ", targetpower = ", target,
              ",\ndesign = ", design, "; all effects fixed",
              "\npower: 1 = pooled data, 2 = group model\n\n")
cat(txt); print(res, row.names = FALSE)
```

```
CV = 0.275, theta0 = 0.95, targetpower = 0.9,
design = 2x2x2; all effects fixed
power: 1 = pooled data, 2 = group model

  N groups  n / group      1      2 loss (%)
 44      4 12|12|12|8 0.9006 0.9004  0.02293
```

The relative loss in power is practically negligible. Hence, go ahead with the group model to make the experts happy. Point out that a mixed model is against the GL.

But you are right that in rare borderline cases a study passing with the pooled analysis might fail with the group model due to the lower degrees of freedom.

$$df\text{(pooled model)}=N-2$$

$$df\text{(group model)}=\sum_{i=1}^{\text{groups}}n_i-(\text{groups}-1)-2$$

Of course, the larger the sample size and the smaller the number of groups, the smaller the impact will be.

```r
CV       <- 0.275
theta0   <- 0.95
capacity <- 12
var      <- CV2mse(CV)
N        <- seq(24, 48, 2)
res      <- data.frame(N = N, df.p = N - 2, tcrit.p = qt(1 - 0.05, N - 2),
                       CI.p = NA, groups = NA, n.group = NA, df.g = NA,
                       tcrit.g = NA, CI.g = NA)
for (j in seq_along(N)) {
  # pooled model
  CI <- round(100 * exp(log(theta0) + c(-1, +1) *
                        res$tcrit.p[j] * sqrt(var / N[j])), 2)
  res$CI.p[j]    <- sprintf("%6.2f\u2013%6.2f%%", CI[1], CI[2])
  # group model: one df lost per additional group
  x              <- ngrp(capacity = capacity, n = N[j])
  res$groups[j]  <- x[["grps"]]
  n.group        <- x[["ngrp"]]
  res$df.g[j]    <- N[j] - (res$groups[j] - 1) - 2
  res$tcrit.g[j] <- qt(1 - 0.05, res$df.g[j])
  res$n.group[j] <- paste(n.group, collapse = "|")
  CI <- round(100 * exp(log(theta0) + c(-1, +1) *
                        res$tcrit.g[j] * sqrt(var / N[j])), 2)
  res$CI.g[j]    <- sprintf("%6.2f\u2013%6.2f%%", CI[1], CI[2])
}
res[, c(3, 8)] <- signif(res[, c(3, 8)], 4)
names(res)[c(4, 9)] <- c("90% CI (pooled)", "90% CI (groups)")
print(res, row.names = FALSE)
```

```
  N df.p tcrit.p 90% CI (pooled) groups     n.group df.g tcrit.g 90% CI (groups)
 24   22   1.717   86.42–104.43%      2       12|12   21   1.721   86.40–104.45%
 26   24   1.711   86.77–104.01%      3     12|12|2   22   1.717   86.74–104.04%
 28   26   1.706   87.08–103.64%      3     12|12|4   24   1.711   87.06–103.67%
 30   28   1.701   87.36–103.31%      3     12|12|6   26   1.706   87.34–103.33%
 32   30   1.697   87.61–103.02%      3     12|12|8   28   1.701   87.59–103.04%
 34   32   1.694   87.83–102.75%      3    12|12|10   30   1.697   87.82–102.77%
 36   34   1.691   88.04–102.51%      3    12|12|12   32   1.694   88.03–102.52%
 38   36   1.688   88.23–102.29%      4  12|12|12|2   33   1.692   88.21–102.31%
 40   38   1.686   88.40–102.09%      4  12|12|12|4   35   1.690   88.39–102.11%
 42   40   1.684   88.56–101.90%      4  12|12|12|6   37   1.687   88.55–101.92%
 44   42   1.682   88.71–101.73%      4  12|12|12|8   39   1.685   88.70–101.74%
 46   44   1.680   88.85–101.57%      4 12|12|12|10   41   1.683   88.84–101.58%
 48   46   1.679   88.98–101.42%      4 12|12|12|12   43   1.681   88.98–101.43%
```
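The two degrees-of-freedom formulas can be cross-checked by fitting both models with `lm()` on dummy data. This is only a sketch: the group sizes are hypothetical, the response is pure noise (only the design matters for the df), and I assume that the group model consists of group, sequence, subject, period within group, and treatment, all fixed.

```r
# Sketch only: residual df of pooled vs. group model on dummy data
set.seed(42)                            # arbitrary seed
n.group <- c(12, 12, 8)                 # hypothetical group sizes
N       <- sum(n.group)
subj    <- data.frame(subject  = factor(seq_len(N)),
                      group    = factor(rep(seq_along(n.group),
                                            times = n.group)),
                      sequence = factor(rep(c("TR", "RT"), length.out = N)))
dta     <- subj[rep(seq_len(N), each = 2), ]      # two periods per subject
dta$period    <- factor(rep(1:2, times = N))
pos           <- as.integer(dta$period)
dta$treatment <- factor(substr(as.character(dta$sequence), pos, pos),
                        levels = c("R", "T"))
dta$logPK     <- rnorm(2 * N)           # noise; only the df are of interest
pooled  <- lm(logPK ~ sequence + subject + period + treatment, data = dta)
# assumed group model: adds group and period within group
grouped <- lm(logPK ~ group + sequence + subject + period + group:period +
                      treatment, data = dta)
df.pooled <- df.residual(pooled)
df.group  <- df.residual(grouped)
cat("pooled:", df.pooled, " expected:", N - 2, "\n")
cat("group :", df.group,  " expected:",
    sum(n.group) - (length(n.group) - 1) - 2, "\n")
```

With three groups the group model costs two degrees of freedom compared to the pooled one, in line with the 12|12|8 row of the table.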

Same game but with a capacity of 24, theta0 0.87. Only two groups and hence, one df lost:

```
  N df.p tcrit.p 90% CI (pooled) groups n.group df.g tcrit.g 90% CI (groups)
 24   22   1.717   79.14– 95.64%      2   12|12   21   1.721   79.13– 95.65%
 26   24   1.711   79.46– 95.25%      2    24|2   23   1.714   79.45– 95.26%
 28   26   1.706   79.75– 94.91%      2    24|4   25   1.708   79.74– 94.92%
 30   28   1.701   80.00– 94.61%      2    24|6   27   1.703   79.99– 94.62%
 32   30   1.697   80.23– 94.34%      2    24|8   29   1.699   80.22– 94.35%
 34   32   1.694   80.44– 94.10%      2   24|10   31   1.696   80.43– 94.11%
 36   34   1.691   80.63– 93.88%      2   24|12   33   1.692   80.62– 93.88%
 38   36   1.688   80.80– 93.68%      2   24|14   35   1.690   80.79– 93.68%
 40   38   1.686   80.96– 93.49%      2   24|16   37   1.687   80.95– 93.50%
 42   40   1.684   81.11– 93.32%      2   24|18   39   1.685   81.10– 93.33%
 44   42   1.682   81.24– 93.16%      2   24|20   41   1.683   81.24– 93.17%
 46   44   1.680   81.37– 93.02%      2   24|22   43   1.681   81.37– 93.02%
 48   46   1.679   81.49– 92.88%      2   24|24   45   1.679   81.49– 92.88%
```

Here we have a Nastia-case: With 30 subjects we pass BE with the pooled model (80.00 – 94.61%) but fail with the group model (79.99 – 94.62%).
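The borderline case can be isolated in a few lines of base R. This is a sketch under the same assumptions as the table: the half-width expression `qt(...) * sqrt(mse / N)` is carried over unchanged from the code that produced it, and `log(CV^2 + 1)` stands in for PowerTOST's `CV2mse()`.

```r
# Sketch only: N = 30, two groups (24|6), theta0 = 0.87, CV = 0.275
CV     <- 0.275
theta0 <- 0.87
N      <- 30
groups <- 2
mse    <- log(CV^2 + 1)                 # equivalent to CV2mse(CV)
df     <- c(pooled = N - 2, group = N - (groups - 1) - 2)
hw     <- qt(1 - 0.05, df) * sqrt(mse / N)  # half-width as used for the table
lower  <- 100 * exp(log(theta0) - hw)
upper  <- 100 * exp(log(theta0) + hw)
pass   <- lower >= 80 & upper <= 125    # conventional BE limits
names(lower) <- names(upper) <- names(pass) <- names(df)
print(round(cbind(df, lower, upper), 2))
print(pass)
```

The only difference between the two rows is the single degree of freedom in the critical t-value, which is enough to push the lower limit below 80%.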

» Multicentral studies is the other case.

Correct. I would always recommend including site terms in the model.

Cheers,
Helmut Schütz