## Few subjects and many centers… [Study Assessment]

Hi Kotu,

First of all, please post in an appropriate category in the future (see there). The team had to move the majority of your posts to a suitable one. THX.

❝ We have conducted a bioequivalence study with 32 subjects in multiple centres approx.15 centers. Each center has completed 1 to 3 subjects only, whether center term is required to include in the ANOVA model in such cases?

Do I understand you correctly: You performed a study without knowing how to evaluate it? What does the SAP say?

❝ If the center term is required …

In a multi-center BE study regulators will possibly ask for it. That's a double standard, because in multi-center phase III trials data are quite often simply pooled.¹

[…] it may be recognised from the start that the limited numbers of subjects per centre will make it impracticable to include the centre effects in the statistical model. In these cases it is not appropriate to include a term for centre in the model, and it is not necessary to stratify the randomisation by centre in this situation.

❝ … how we can perform the statistical analysis including center term in ANOVA model and what will be the impact on study results.

ANOVA – are you talking about a crossover study? The common model for a multi-center study is:
• center
• sequence
• treatment
• subject (nested within center × sequence)
• period (nested within center)
• center-by-sequence interaction
If any (‼) of the centers has fewer than two subjects, the model will collapse (GMR and MSE cannot be estimated). Study done, no result. Statistician blamed (best case) or fired (worst case).
What you should not do: Include a center-by-treatment interaction term and test for its significance (see there).
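
The terms listed above could be written in R roughly as follows. This is only a minimal sketch on simulated data (four hypothetical centers with four subjects each; the names `dat` and `logY` and all numbers are made up), using a plain fixed-effects `lm()`. It is not necessarily the way your SAP or a regulator would want the model specified.

```r
# Simulated data, for illustration only: 4 hypothetical centers with
# 4 subjects each (2 per sequence) in a 2x2x2 crossover
set.seed(123456)
subj <- data.frame(subject  = factor(1:16),
                   center   = factor(rep(paste0("C", 1:4), each = 4)),
                   sequence = factor(rep(c("TR", "RT"), times = 8)))
dat  <- rbind(cbind(subj, period = 1L), cbind(subj, period = 2L))
# T is given in period 1 of sequence TR and in period 2 of sequence RT
dat$treatment <- factor(ifelse((dat$sequence == "TR") == (dat$period == 1L),
                               "T", "R"), levels = c("R", "T"))
dat$period    <- factor(dat$period)
# assumed values: T/R-ratio 0.95, between-subject SD 0.30, residual SD 0.20
s.eff    <- rnorm(16, sd = 0.30)
dat$logY <- log(100) + s.eff[as.integer(dat$subject)] +
            log(0.95) * (dat$treatment == "T") + rnorm(32, sd = 0.20)
# subject IDs are unique, hence nested in center x sequence implicitly;
# center:period without a period main effect = period nested in center
m   <- lm(logY ~ center + sequence + center:sequence + subject +
                 center:period + treatment, data = dat)
# aliased (NA) coefficients are dropped from the summary table
tr  <- summary(m)$coefficients["treatmentT", ]
PE  <- exp(tr[["Estimate"]])
CI  <- exp(tr[["Estimate"]] + c(-1, +1) *
           qt(1 - 0.05, df.residual(m)) * tr[["Std. Error"]])
print(round(100 * c(PE = PE, lower = CI[1], upper = CI[2]), 2))
```

With 32 observations, 16 subjects, four period-within-center terms and one treatment term, 11 residual degrees of freedom are left. `lm()` reports the aliased subject coefficients as `NA`, which is expected in a nested fixed-effects model and does not affect the treatment contrast.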

If you are talking about a parallel design, you must not assume equal variances (FDA 2001, Section VI.B.1.d.) and hence should opt for the Welch-test. The degrees of freedom by Satterthwaite’s approximation are given by

$$\nu=\frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{s_1^4}{n_1^2(n_1-1)}+\frac{s_2^4}{n_2^2(n_2-1)}}\tag{1}$$

where $$\small{n_1,n_2}$$ are the total numbers of subjects under treatments T and R, respectively, and $$\small{s_1,s_2}$$ are the standard deviations of the treatment arms. However, in a multi-center study we are not done yet. Adjust² the degrees of freedom by

$$\nu_\textrm{adj}=\nu-(n_\textrm{c}-1)\tag{2}$$

where $$\small{n_\textrm{c}}$$ is the number of centers. If $$\small{n_1=n_2}$$ and $$\small{s_1=s_2}$$ the Welch-test reduces to the common t-test with $$\small{n_1+n_2-2}$$ degrees of freedom. Let’s be optimistic: In your case $$\small{\nu=30}$$ and $$\small{\nu_\textrm{adj}=14}$$. That would result in a massive loss in power.
Assuming a CV of 20% and a T/R-ratio of 0.95 you would achieve a power of ≈76% with 32 subjects in a single center. If you have 13 centers and adjust the degrees of freedom by $$\small{(2)}$$, you would cross the Rubicon of ≈50% power. Like tossing a coin. If you have 15 centers, power would be just ≈35%. Close to betting on an even number in a single roll of a die. An R-gimmick at the end.
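
Equations (1) and (2) are easy to get wrong by hand, so here is a small sketch in R. The function names `welch.df()` and `adj.df()` are made up for this post. The example plugs in the 176 subjects in 16 centers mentioned in footnote 2; equal group sizes and the standard deviations used are my assumptions, purely for illustration.

```r
# Satterthwaite's approximation of the degrees of freedom, Eq. (1)
welch.df <- function(s1, s2, n1, n2) {
  (s1^2 / n1 + s2^2 / n2)^2 /
    (s1^4 / (n1^2 * (n1 - 1)) + s2^4 / (n2^2 * (n2 - 1)))
}
# Adjustment for the number of centers, Eq. (2)
adj.df   <- function(nu, n.c) nu - (n.c - 1)

# Illustration only: 176 subjects in 16 centers, assuming equal group
# sizes (88 per arm) and equal standard deviations (0.2, arbitrary)
nu       <- welch.df(s1 = 0.2, s2 = 0.2, n1 = 88, n2 = 88)
nu.adj   <- adj.df(nu, n.c = 16)
cat("nu =", nu, " nu.adj =", nu.adj, "\n")

# Unequal standard deviations (arbitrary values) cost degrees of freedom
# even before the adjustment for centers
nu.het   <- welch.df(s1 = 0.25, s2 = 0.15, n1 = 88, n2 = 88)
cat("nu (unequal SDs) =", nu.het, "\n")
```

With equal group sizes and equal standard deviations, $$\small{(1)}$$ collapses to $$\small{n_1+n_2-2=174}$$, and $$\small{(2)}$$ then gives $$\small{174-15=159}$$. With unequal standard deviations $$\small{\nu}$$ is smaller than that.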

There is another problem with so few subjects per center. Say, centers differed for any reason (Sumo wrestlers recruited in one and Sadhus in another). What if you have by chance not equally sized groups in both? Very questionable outcome, IMHO.

You can’t fix by analysis
what you bungled by design.

Light RJ, Singer JD, Willett JB. By Design. Cambridge: Harvard University Press; 1990. p. V.

1. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use. ICH Harmonised Tripartite Guideline. Statistical Principles for Clinical Trials (E9). 5 February 1998. Online.
2. Last year this approach was accepted by the FDA for one of my studies. Didn’t hurt because we expected 176 eligible subjects in 16 centers. Of course, the study was powered for the adjusted approach.

    library(PowerTOST)
    CV      <- 0.20       # assumed (total) CV
    theta0  <- 0.95       # assumed T/R-ratio
    target  <- 0.80       # target (desired) power
    design  <- "parallel" # or "2x2"
    n       <- sampleN.TOST(CV = CV, theta0 = theta0, targetpower = target,
                            design = design, print = FALSE)[["Sample size"]]
    max.cnt <- n / 2      # less doesn't make any sense
    n.c     <- 1:max.cnt
    res     <- data.frame(n = n, centers = n.c, power = NA_real_)
    for (j in seq_along(n.c)) {
      res$power[j] <- suppressMessages(
                        power.TOST(CV = CV, theta0 = theta0, design = design,
                                   n = res$n[j] - (n.c[j] - 1)))
    }
    plot(n.c, 100 * res$power, xlab = "centers", ylab = "power (%)", las = 1)
    grid()
    res$power <- sprintf("%.2f%%", 100 * res$power)
    print(res, row.names = FALSE)

Gives for a parallel design …
      n centers  power
     36       1 80.99%
     36       2 79.83%
     36       3 78.65%
     36       4 77.33%
     36       5 76.00%
     36       6 74.49%
     36       7 72.98%
     36       8 71.26%
     36       9 69.54%
     36      10 67.57%
     36      11 65.60%
     36      12 63.32%
     36      13 61.05%
     36      14 58.42%
     36      15 55.80%
     36      16 52.72%
     36      17 49.70%
     36      18 46.13%

… and for a crossover
      n centers  power
     20       1 83.47%
     20       2 81.32%
     20       3 79.12%
     20       4 76.36%
     20       5 73.54%
     20       6 69.93%
     20       7 66.25%
     20       8 61.44%
     20       9 56.60%
     20      10 50.24%

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes