Bioequivalence and Bioavailability Forum

Mutasim ☆ Jordan, 2019-08-07 15:16 (1715 d 18:37 ago) Posting: # 20484 Views: 9,097	Group by sequence interaction [General Statistics] Post reply
	Dear Members, What is the explanation for and the impact of having a significant group by sequence interaction in a study with 4 groups and 2 sequences and 4 periods (HVD).

Helmut
★★★

Vienna, Austria,
2019-08-07 16:35
(1715 d 17:19 ago)

@ Mutasim
Posting: # 20485
Views: 7,887

Group by sequence interaction, an urban myth?

Post reply

Salam Mutasim,

❝ What is the explanation for …

Chance.

❝ … and the impact of having a significant group by sequence interaction in a study with 4 groups and 2 sequences and 4 periods (HVD).

If you are dealing with ABEL (the EMA’s method) chances than any (!) European agency will ask for a group-term in the analysis are close to nil. When I gave my first presentation about it in Budapest 2017 in front of European regulatory (!) statisticians they were asking in disbelieve “What the hell are you talking about?” See what the BE-GL states:

The study should be designed in such a way that the formulation effect can be distinguished from other effects.
The precise model to be used for the analysis should be pre-specified in the protocol. The statistical analysis should take into account sources of variation that can be reasonably assumed to have an effect on the response variable.

Is it reasonable to assume that groups or sequences have an effect on the PK response? Heck, no way! Same inclusion-/exclusion-criteria and hence, similar demographics, same clinical site, same bioanalytical method. Hundreds (thousands?) of studies performed in multiple groups and accepted by European agencies based on pooled data. OK, I know of two cases where the EMA (centralised procdure) asked for something similar. Both were complicated drugs (a biosimilar and a liposomal product) and multi-centric. Hence, the EMA asked for a site term.*
This is in line what is stated in the Q&A document about Two-Stage Designs:

A model which also includes a term for a formulation*stage interaction would give equal weight to the two stages, even if the number of subjects in each stage is very different. The results can be very misleading hence such a model is not considered acceptable. […] this model assumes that the formulation effect is truly different in each stage. If such an assumption were true there is no single formulation effect that can be applied to the general population, and the estimate from the study has no real meaning.

Replace formulation*stage by formulation*group and you get the idea.

But all of this is about a group-by-treatment interaction. Sometimes in conventional 2×2×2 crossovers we see a significant sequence (better unequal carry-over) effect. Since it can only be avoided by design (sufficiently long washout) any test for it is futile. Check for eventual residual concentrations in higher period(s) and exclude subjects with pre-dose concentrations >5% of their C_max-values. In analogy the same can be said about the group-by-sequence interaction. Occasionally you will see a significant result. Ignore it.
Only if you prefer braces plus suspenders: Go with the FDA’s model II. But no pre-test and no interaction! The loss in power as compared to the pooled analysis is negligible.
Degrees of freedom in a 2×2×2 design:

Model III: n₁ + n₂ – 2
Model II: n₁ + n₂ – (N_groups – 1) – 2

Or in the 4-period full replicate:

Model III: 3(n₁ + n₂) – 4
Model II: 3(n₁ + n₂) – (N_groups – 1) – 4

Am I right that you had to deal with 128 subjects? Then the 90% CI with model II will be just 0.002% (!) wider than the one with model III.
What you never ever should do: Evaluate groups separately. Power will be terrible. Even if you are extremely lucky and one of them passes, what will you do? Submit only this one? Any agency will ask for the others as well. Guess the outcome.

If you are thinking about the FDA’s group models I–III: There were very few deficiency letters (all with the same text) issued to US and Canadian companies. On the other hand, many European CROs have such letters collecting dust in archives. Politics? Even if you follow this track, model III (pooling) is justified since the conditions are practically always fulfilled. The sequential procedure (test for a group-by-treatment interaction at the 10% level) as any pre-test might inflate the type I error. I even have a statement from the EMA that such a procedure is not acceptable. Group-by-sequence interaction? Forget it.

Proposed all-fixed effects model:
Site+Sequence+Sequence(Site)+Period(Site)+Subject(Sequence×Site)+Treatment

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

ElMaestro
★★★

Denmark,
2019-08-08 12:06
(1714 d 21:47 ago)

@ Helmut
Posting: # 20486
Views: 7,637

pristine, genuine, holy, magnificent, inexplicable beautiful variation

Post reply

Hi Hötzi,

❝ The precise model to be used for the analysis should be pre-specified in the protocol. The statistical analysis should take into account sources of variation that can be reasonably assumed to have an effect on the response variable.

❝ Is it reasonable to assume that groups or sequences have an effect on the PK response? Heck, no way! Same inclusion-/exclusion-criteria and hence, similar demographics, same clinical site, same bioanalytical method. Hundreds (thousands?) of studies performed in multiple groups and accepted by European agencies based on pooled data.

But on the other hand: Why then then include e.g. period?

Let us for a moment disregard the actual wording. I don't think this is about assuming that some factor has or hasn't an effect (and not about significance in the statistical sense either). As I see it I want to construct a CI which is as wide as my real uncertainty dictates. I start out with a whole bunch of ugly variation and in the fashion of Michelangelo working on his crude blocks of marble I chip parts and bits away from my bulk of variation by applying my model. What I have left of my variation is a chunk of pristine, genuine, holy, magnificent, beautiful variation. What a sight to behold :-)

, and whose origin my experimental setup cannot account for, which I therefore use for my CI. My confidence interval now is as wide as just exactly that uncertainty merits.

For a crossover this is of little practical importance since subjects are in groups. For a parallel trial I think I want group in the model. If regulators don't like this, they can ask me to take it away. I happily do so without protesting. I am a sheep at that point. But not until then. :-D

—
Pass or fail!
ElMaestro

Helmut
★★★

Vienna, Austria,
2019-08-08 12:31
(1714 d 21:22 ago)

@ ElMaestro
Posting: # 20487
Views: 7,670

I love your subject line!

Post reply

Ahoy, my Capt’n,

❝ ❝ The precise model [ :blahblah: ]

❝

❝ But on the other hand: Why then then include e.g. period?

Cause otherwise eventual period effects would not mean out. ;-)

Given, sometimes one has to assume lacking period effects and everybody is happy with that. If an originator explores whether the drug follows linear PK, we have a paired design (SD → saturation → steady state) and compare AUC_0–τ with AUC_0–∞. A crossover would be a logistic nightmare.

❝ Let us for a moment disregard the actual wording. [~~lengthy beautiful explanation~~]

Exactly.

❝ […] For a parallel trial I think I want group in the model. If regulators don't like this, they can ask me to take it away. I happily do so without protesting. I am a sheep at that point. But not until then. :-D

Agree again.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Astea
★★

Russia,
2019-12-21 15:12
(1579 d 17:41 ago)

@ Helmut
Posting: # 21011
Views: 6,923

Group effect, did you miss it?

Post reply

Dear Friends!

Recently I've known that Belorussian experts referring to EAEC rules require to recalculate the results of large studies using FDA's Model II, that is: Group, Sequence, Treatment, Period(Group), Group×Sequence as fixed and Subject(Group×Sequence) as Random though it was not stated in protocol beforehand.

I don't think it is a good idea. Moreover it could be contagious! What do you think about it?

Critical points: as standard model requires Sequence, Treatment, Period and Subject(Sequence) as fixed terms, the results of the random effect model obviously will be different. What if they will change the overall result of BE? What if the Group×Sequence effect would be significant by chance?

How to deal with excluded subjects? For example, if we have a subject with only one period the fixed-effect model would automatically neglect it while the random-effect model would use it.

As I understand currently it is impossible to calculate it via R, isn't it? May be Julia will help? So only people with commercial software could deal with it.

—
"Being in minority, even a minority of one, did not make you mad"

PharmCat
★

Russia,
2019-12-22 02:37
(1579 d 06:16 ago)

@ Astea
Posting: # 21014
Views: 6,929

Group effect, did you miss it?

Post reply

Hello Astea!

❝ As I understand currently it is impossible to calculate it via R, isn't it? May be Julia will help? So only people with commercial software could deal with it.

I think any mixed model without replication (repeated factor) can be fitted in R and in Julia without problems. If variance components include repeated factor (structured R matrix) only special cases available in free software.

Helmut
★★★

Vienna, Austria,
2019-12-22 11:30
(1578 d 21:23 ago)

@ Astea
Posting: # 21015
Views: 6,945

Group effect: the endless story

Post reply

Hi Nastia,

❝ Recently I've known that Belorussian experts referring to EAEC rules require to recalculate the results of large studies using FDA's Model II, that is: Group, Sequence, Treatment, Period(Group), Group×Sequence as fixed and Subject(Group×Sequence) as Random though it was not stated in protocol beforehand.

Regulators are always right. :-D

❝ I don't think it is a good idea. Moreover it could be contagious! What do you think about it?

Recalculating a study is never a good idea. Entire α already spent, right?

❝ Critical points: as standard model requires Sequence, Treatment, Period and Subject(Sequence) as fixed terms, the results of the random effect model obviously will be different.

Possible but only to a (very) small degree.

❝ What if they will change the overall result of BE?

You mean that the fixed model passes and the mixed model fails? Cannot imagine to happen (see below).

❝ What if the Group×Sequence effect would be significant by chance?

Not even the FDA asks for testing Group×Sequence in Model II.

❝ How to deal with excluded subjects? For example, if we have a subject with only one period the fixed-effect model would automatically neglect it while the random-effect model would use it.

Yep. On the other hand, the residual variance in the mixed model should not be larger than the one of fixed model. I had a few cases in the past where European regulators asked for recalculation according to the new GL (the mixed model was my standard before). The CI was sometimes narrower but differed only in the second decimal place…

❝ As I understand currently it is impossible to calculate it via R, isn't it?

I agree with PharmCat. Doable in R.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Astea
★★

Russia,
2019-12-22 22:23
(1578 d 10:31 ago)

@ Helmut
Posting: # 21018
Views: 6,953

Group effect: the endless river

Post reply

Dear Breathtakingly Smart People!

❝ I agree with PharmCat. Doable in R.

Please check my solution...
Below is my R-code:


library(readxl)

library(lmerTest)

options(contrasts=c("contr.treatment","contr.poly"))

Dataset<-read_excel("DS_01_group.xlsx", sheet = 1)

Model2 <- function(Dataset){

Dataset$Formulation<-factor(Dataset$Formulation, levels = c("R","T"))

Dataset$Sequence<-factor(Dataset$Sequence, levels = c("TRTR", "RTRT"))

Dataset$Period<-factor(Dataset$Period)

Dataset$Group<-factor(Dataset$Group)

mod.lmer <- (lmer(log(Data)~Sequence+Period/Group+Formulation+Group+Group*Sequence+(1|Subject/(Sequence)), data=Dataset))

FormulationEffect.lmer <- summary(mod.lmer)$coefficients["FormulationT","Estimate"]
cat("\n", "PE is", round(exp(as.numeric(FormulationEffect.lmer))*100, digits = 8), "\n")

FormulationT<-summary(mod.lmer)$coefficients["FormulationT",]  

CI_Lr<-round(exp(FormulationT[1]+FormulationT[2]*qt(0.05, FormulationT[3]))*100, 4)

CI_Ur<-round(exp(FormulationT[1]-FormulationT[2]*qt(0.05, FormulationT[3]))*100, 4)

cat("\n", "CI is", CI_Lr, "-", CI_Ur,"\n")}

Model2(Dataset)

In order to check whether it works properly I used Data set 1 from Q&A and added group-factor: first 30 subjects - group #1, and the last - group #2.

To compare the results I used the following SAS-code:


proc mixed data=SASuser.Dataset;

class Formulation Subject Period Sequence Group;

model PKlog= Sequence Formulation Period(Group) Group Sequence*Group;

random Subject(Sequence*Group);

estimate "test-ref" Formulation -1 1 / CL alpha=0.10;

run;

Cause I wasn't sure that my code works well I used the step-by-step approach and compared different models (they all may be meaningless by nature but whatever):
1). Sequence Formulation Period as Fixed, Subject(Sequence) as Random
2). Sequence Formulation Period Group as Fixed, Subject(Sequence) as Random
3). Sequence Formulation Period(Group) Group as Fixed, Subject(Sequence) as Random
4). Sequence Formulation Period(Group) Group Sequence*Group as Fixed, Subject(Sequence) as Random
5). Sequence Formulation Period(Group) Group Sequence*Group as Fixed, Subject(Sequence*Group) as Random

The results are as follows:

       SAS         SAS         SAS         R         R            R

#   PE         90%CI, L   90%CI, U   PE         90%CI, L   90%CI, U

1   115,7298   107,1707   124,9725   115,7298   107,1707   124,9725

2   115,7279   107,1680   124,9716   115,7279   107,1679   124,9716

3   115,7208   107,1080   125,0262   115,7208   107,1080   125,0262

4   115,7275   107,1136   125,0340   115,7275   107,1136   125,0340

5   115,7275   107,1136   125,0340   NA         NA         NA

And now, attention, the questions:
1). Looking the same is not proving to be the same, isn't it?
2). By the way, this is exactly an example where the design changes the decision of BE (of course if scaling was not stated in protocol). BE or not BE?
3). When I try to use Subject(Sequence*Group) as Random in R, it throws an error: "couldn't evaluate grouping factor (Sequence * Group):Subject within model frame: try adding grouping factor to data frame explicitly if possible". Is it right that we can simply neglect the structure of this term as we can do in standard models with Subject instead of Subject(Sequence)?

—
"Being in minority, even a minority of one, did not make you mad"

PharmCat
★

Russia,
2019-12-22 23:52
(1578 d 09:02 ago)

(edited by PharmCat on 2019-12-23 17:04)
@ Astea
Posting: # 21019
Views: 6,970

Group effect: the endless river

Post reply

Hi!

❝ Is it right that we can simply neglect the structure of this term as we can do in standard models with Subject instead of Subject(Sequence)?

Edited.

You can use just "Subject" in random statement in SAS. I think if you try to do this, you should get the same results. Of course, if your subject numbers not repeated.

Data:

DATA data;

infile datalines dlm=" " truncover;

input subject $ house $ room $ var;

datalines;

1 1 1 4

1 1 1 6

2 1 1 6

2 1 1 10

3 1 2 2 

3 1 2 4

4 1 2 2

4 1 2 3

5 2 3 8

5 2 3 10

6 2 3 6

6 2 3 10

7 2 4 4

7 2 4 6

;

RUN;

Code 1

PROC MIXED data=data; 

CLASSES subject house room;

MODEL  var = house room/ DDFM=SATTERTH solution;

RANDOM subject/V G;

ESTIMATE 'house' house -1 1/e;

RUN;

Fixed effect without denoted nested factors: problem with results, but coefficient calculated correctly.

Code 2


PROC MIXED data=data; 

CLASSES subject house room;

MODEL  var = house room(house)/ DDFM=SATTERTH solution;

RANDOM subject/V G;

ESTIMATE 'house' house -1 1/e;

RUN;

Fixed effect with nested factor: no problems, but Subjects in Dimensions = 1. Look at horrible V matrix. In this case it does not matter, but it not corresponds with reality.

Code 3

PROC MIXED data=data; 

CLASSES subject house room;

MODEL  var = house room(house)/ DDFM=SATTERTH solution;

RANDOM intercept/SUBJECT=subject V G;

ESTIMATE 'house' house -1 1/e;

RUN;

Good fit. Subjects in Dimensions = 8. We have individual V matrices and ready for structuring.

Code 4


PROC MIXED data=data; 

CLASSES subject house room;

MODEL  var = house room(house)/ DDFM=SATTERTH solution;

RANDOM intercept/SUBJECT=subject(room) V G;

ESTIMATE 'house' house -1 1/e;

RUN;

Nested random part - no effect.

I can't find any difference between subject and subject(room). As I understand nested factors affect on fixed part. I don't know how nested statment can affect on random part, but I think - in this case there is no effect.

mittyri
★★

Russia,
2019-12-23 15:30
(1577 d 17:24 ago)

@ Astea
Posting: # 21020
Views: 6,814

replicateBE solution with interactions

Post reply

Dear Nastya,

whenever possible use simple solutions, at least till the moment you must dive into mixed models galaxy ;-)

And more: interaction terms are not the same in SAS and R
take a look at method.B implementation inside replicateBE package
I just modified it a bit

library(replicateBE)

library(nlme)

alpha <- 0.05

options(contrasts=c("contr.treatment","contr.poly"))

Dataset <- rds01

Dataset$group<- factor(ifelse(as.numeric(levels(Dataset$subject))[Dataset$subject]<31, 1,2))

modB <- lme(log(PK) ~ sequence + group + sequence:group + period + period%in%group + treatment, random = ~1 | subject, na.action = na.omit, data = Dataset)



EMA.B <- summary(modB)

PE <- EMA.B$tTable["treatmentT", "Value"]
# exp(PE) = 1.157275

CI <- exp(PE + c(-1, +1) * qt(1 - alpha, EMA.B$tTable["treatmentT", "DF"]) * EMA.B$tTable["treatmentT", "Std.Error"])

# CI = 1.071136 1.250340

—
Kind regards,
Mittyri

Astea
★★

Russia,
2019-12-23 18:26
(1577 d 14:27 ago)

@ mittyri
Posting: # 21022
Views: 6,781

replicateBE solution with interactions

Post reply

Dear mittyri!
Thanks a lot for your elegant solution!
Didn't you think that replicateBE was the first thing that I've thought of, but I couldn't dig up the code for get.data in order to modify it for my needs. The question that I could not solve was extremely stupid: namely, what is the difference between Dataset<-rds01 and Dataset<-read_excel("rds01.xlsx", sheet = 1)? The initial dataset is the same, but the results of the algo are different. So before using we should somehow prepare the data, but how to do it? "I faced the wall...and did it my way" ;-)

—
"Being in minority, even a minority of one, did not make you mad"

mittyri ★★ Russia, 2019-12-24 11:53 (1576 d 21:01 ago) @ Astea Posting: # 21023 Views: 6,736	datasets issues Post reply
	Dear Nastya, ❝ what is the difference between Dataset<-rds01 and Dataset<-read_excel("rds01.xlsx", sheet = 1)? there could be a problem with column types. check `Dataset1<-read_excel("rds01.xlsx", sheet = 1) str(Dataset1) Dataset2<-rds01 str(Dataset2)` — Kind regards, Mittyri

Helmut
★★★

Vienna, Austria,
2019-12-24 12:03
(1576 d 20:50 ago)

@ Astea
Posting: # 21024
Views: 6,778

datasets

Post reply

Hi Nastia,

❝ […] replicateBE was the first thing that I've thought of, but I couldn't dig up the code for get.data in order to modify it for my needs.

I would not try to use this function. It calls others which are not exported. Not worth the efforts to modify it.

❝ […] what is the difference between Dataset<-rds01 and Dataset<-read_excel("rds01.xlsx", sheet = 1)? The initial dataset is the same, but the results of the algo are different. So before using we should somehow prepare the data, but how to do it?

Duno how you accessed the datasets. Try this one:

library(replicateBE) library(readxl) library(nlme) path <- "your path" file <- "rds01.xlsx" name <- paste0(path, "/", file) DS1 <- as.data.frame(read_excel(path = name, sheet = 1, na = c("NA", "ND", ".", "", "Missing"), skip = 0, col_names = TRUE)) str(DS1) # show the structure: data.frame! cols <- c("subject", "period", "sequence", "treatment") DS1[cols] <- lapply(DS1[cols], factor) # factorize DS2 <- rds01 str(DS2) # show the structure: named S3-object, factorized data.frame # add the groups as factors DS1$group <- factor(ifelse(as.numeric(levels(DS1$subject))[DS1$subject] < 31, 1, 2)) DS2$group <- factor(ifelse(as.numeric(levels(DS2$subject))[DS2$subject] < 31, 1, 2)) res <- data.frame(origin = c("Excel", "replicateBE"), PE = NA, CL.lo = NA, CL.hi = NA) ow <- options("contrasts") # save options options(contrasts = c("contr.treatment", "contr.poly")) on.exit(ow) # reset options if an error occurs for (j in 1:nrow(res)) { if (j == 1) data = DS1 else data <- DS2 modB <- lme(log(PK) ~ sequence + group + sequence:group + period + period%in%group + treatment, random = ~1 | subject, na.action = na.omit, data = data) EMA.B <- summary(modB) PE <- EMA.B$tTable["treatmentT", "Value"] res[j, 3:4] <- 100*exp(PE + c(-1, +1) * qt(1 - 0.05, EMA.B$tTable["treatmentT", "DF"]) * EMA.B$tTable["treatmentT", "Std.Error"]) res$PE[j] <- 100*exp(PE) } print(res, row.names = FALSE) # should be identical # origin PE CL.lo CL.hi # Excel 115.7275 107.1136 125.034 # replicateBE 115.7275 107.1136 125.034

Edit: Hey Mittyri, you were faster!

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Astea ★★ Russia, 2019-12-24 20:18 (1576 d 12:35 ago) @ Helmut Posting: # 21029 Views: 6,724	datasets Post reply
	Dear Friends! Thanks a lot for your inestimable help! I finally found. The key point was just adding the string: Dataset$period<-factor(Dataset$period). Now all works the same. I still struggle to understand what is the difference between lmer and lme in this case... — "Being in minority, even a minority of one, did not make you mad"

Helmut
★★★

Vienna, Austria,
2019-12-25 20:12
(1575 d 12:42 ago)

@ Astea
Posting: # 21030
Views: 6,805

lmer / lme

Post reply

Hi Nastia,

❝ I still struggle to understand what is the difference between lmer and lme in this case...

The syntax of the formula for random effects is different.

mod <- lmer(logPK ~ sequence + period + treatment + (1|subject), data = data) mod <- lme(log(PK) ~ sequence + period + treatment, random = ~1|subject, data = data, na.action = na.omit)

Important: If you have incomplete data, in lme() you have to use na.action = na.omit because the default na.action = na.fail would stop and throw an error.

The method to extract the PE and calculate the CI is also different. Hint: Lines 53–56 and 80–84 of the source-code of method.B.R.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Beholder ★ Russia, 2019-12-27 14:44 (1573 d 18:09 ago) @ Astea Posting: # 21031 Views: 6,528	What do you mean exactly? Post reply
	Dear Astea! ❝ large studies Saying "large studies" do you mean the multicentral BE studies or large amount of subjects, e.g. 70 subjects in one clinic divided into two groups? — Best regards Beholder

Astea
★★

Russia,
2019-12-29 22:59
(1571 d 09:54 ago)

@ Beholder
Posting: # 21035
Views: 6,459

What do you mean exactly?

Post reply

Dear Beholder!

❝ Saying "large studies" do you mean the multicentral BE studies or large amount of subjects, e.g. 70 subjects in one clinic divided into two groups?

I'm sorry for the wrong terminology used. I meant studies in which volunteers were divided by two groups for the capacity reasons. (Formally this can be true even for studies with 12 subjects if the clinical site is tiny.)

Multicentral studies is the other case.

Helmut
★★★

Vienna, Austria,
2019-12-30 13:32
(1570 d 19:21 ago)

@ Astea
Posting: # 21036
Views: 6,618

ANOVA acc. to GL

Post reply

Hi Nastia,

❝ I meant studies in which volunteers were divided by two groups for the capacity reasons. (Formally this can be true even for studies with 12 subjects if the clinical site is tiny.)

Might also happen in mid-range CROs with drugs which require continuous cardiac monitoring. 12–16 beds is not unusual.

Nevertheless, I don’t understand why Belorussian experts ask for a mixed model because the guideline recommends ANOVA:

88. Сравнение исследуемых фармакокинетических параметров проводят с помощью дисперсионного анализа (ANOVA).

Fixed effects are specifically recommended (taking into account effects which can affect the response):

89. Статистический анализ должен принимать во внимание источники вариабельности, способные повлиять на изучаемую переменную. В такой модели дисперсионного анализа принято использовать такие факторы, как последовательность, субъект последовательности, период и лекарственный препарат. В отношении всех этих факторов следует использовать фиксированные, а не случайные эффекты.

We discussed that ad nauseam. At least in the EU no regulatory statistician assumes that groups have an impact and expects a group-model. Pooled analysis rulez (see above).

However, why all that fuzz? Let’s simulate 100,000 studies:

library(PowerTOST) ngrp <- function(capacity, n) { # split sample size into >=2 groups based on capacity if (n <= capacity) { # make equal groups ngrp <- rep(ceiling(n/2), 2) } else { # at least one = capacity ngrp <- rep(0, ceiling(n / capacity)) grps <- length(ngrp) ngrp[1] <- capacity for (j in 2:grps) { n.tot <- sum(ngrp) # what we have so far if (n.tot + capacity <= n) { ngrp[j] <- capacity } else { ngrp[j] <- n - n.tot } } } return(ngrp = list(grps = length(ngrp), ngrp = ngrp)) } CV <- 0.275 target <- 0.90 theta0 <- 0.95 design <- "2x2x2" capacity <- 12 res <- data.frame(N = NA, groups = NA, n.group = NA, pwr.1 = NA, pwr.2 = NA, change = NA) x <- sampleN.TOST(CV = CV, theta0 = theta0, targetpower = target, design = design, print = FALSE, details = FALSE) res$N <- x[["Sample size"]] res$pwr.1 <- x[["Achieved power"]] x <- ngrp(capacity = capacity, n = res$N) res$groups <- x[["grps"]] n.group <- x[["ngrp"]] res$pwr.2 <- power.TOST.sds(CV = CV, n = res$N, grps = res$groups, ngrp = n.group, gmodel = 2, progress = FALSE) res$n.group <- paste(n.group, collapse = "|") res$change <- 100*(res$pwr.2 - res$pwr.1)/res$pwr.1 res[4:6] <- signif(res[4:6], 4) names(res)[3:5] <- c("n / group", "1", "2") if (res$change < 0) { names(res)[6] <- "loss (%)" } else { names(res)[6] <- "gain (%)" } res[6] <- abs(res[6]) txt <- paste0("\nCV = ", CV, ", theta0 = ", theta0, ", targetpower = ", target, ",\ndesign = ", design, "; all effects fixed", "\npower: 1 = pooled data, 2 = group model\n\n") cat(txt); print(res, row.names = FALSE) CV = 0.275, theta0 = 0.95, targetpower = 0.9, design = 2x2x2; all effects fixed power: 1 = pooled data, 2 = group model N groups n / group 1 2 loss (%) 44 4 12|12|12|8 0.9006 0.9004 0.02293

The relative loss in power is practically negligible. Hence, go ahead with the group model to make the experts happy. Point out that a mixed model is against the GL.

But you are right that in rare borderline cases a study passing with the pooled analysis might fail with the group model due to the lower degrees of freedom.
$df\text{(pooled model)}=N-2$
$df\text{(group model)}=\sum_{i=1}^{i=groups}n_i-(groups-1)-2$
Of course, the larger the sample size and the smaller the number of groups the impact will be decrease.

CV <- 0.275 theta0 <- 0.95 capacity <- 12 var <- CV2mse(CV) N <- seq(24, 48, 2) res <- data.frame(N = N, df.p = N - 2, tcrit.p = qt(1-0.05, N-2), CI.p = NA, groups = NA, n.group = NA, df.g = NA, tcrit.g = NA, CI.g = NA) for (j in seq_along(N)) { CI <- round(100*exp(log(theta0) + c(-1, +1) * res$tcrit.p[j] * sqrt(var/N[j])), 2) res$CI.p[j] <- sprintf("%6.2f\u2013%6.2f%%", CI[1], CI[2]) x <- ngrp(capacity = capacity, n = N[j]) res$groups[j] <- x[["grps"]] n.group <- x[["ngrp"]] res$df.g[j] <- N[j] - (res$groups[j] - 1) - 2 res$tcrit.g[j] <- qt(1-0.05, res$df.g[j]) res$n.group[j] <- paste(n.group, collapse = "|") CI <- round(100*exp(log(theta0) + c(-1, +1) * res$tcrit.g[j] * sqrt(var/N[j])), 2) res$CI.g[j] <- sprintf("%6.2f\u2013%6.2f%%", CI[1], CI[2]) } res[, c(3, 8)] <- signif(res[, c(3, 8)], 4) names(res)[c(4, 9)] <- c("90% CI (pooled)", "90% CI (groups)") print(res, row.names = FALSE) N df.p tcrit.p 90% CI (pooled) groups n.group df.g tcrit.g 90% CI (groups) 24 22 1.717 86.42–104.43% 2 12|12 21 1.721 86.40–104.45% 26 24 1.711 86.77–104.01% 3 12|12|2 22 1.717 86.74–104.04% 28 26 1.706 87.08–103.64% 3 12|12|4 24 1.711 87.06–103.67% 30 28 1.701 87.36–103.31% 3 12|12|6 26 1.706 87.34–103.33% 32 30 1.697 87.61–103.02% 3 12|12|8 28 1.701 87.59–103.04% 34 32 1.694 87.83–102.75% 3 12|12|10 30 1.697 87.82–102.77% 36 34 1.691 88.04–102.51% 3 12|12|12 32 1.694 88.03–102.52% 38 36 1.688 88.23–102.29% 4 12|12|12|2 33 1.692 88.21–102.31% 40 38 1.686 88.40–102.09% 4 12|12|12|4 35 1.690 88.39–102.11% 42 40 1.684 88.56–101.90% 4 12|12|12|6 37 1.687 88.55–101.92% 44 42 1.682 88.71–101.73% 4 12|12|12|8 39 1.685 88.70–101.74% 46 44 1.680 88.85–101.57% 4 12|12|12|10 41 1.683 88.84–101.58% 48 46 1.679 88.98–101.42% 4 12|12|12|12 43 1.681 88.98–101.43%

Same game but with a capacity of 24, theta0 0.87. Only two groups and hence, one df lost:

N df.p tcrit.p 90% CI (pooled) groups n.group df.g tcrit.g 90% CI (groups) 24 22 1.717 79.14– 95.64% 2 12|12 21 1.721 79.13– 95.65% 26 24 1.711 79.46– 95.25% 2 24|2 23 1.714 79.45– 95.26% 28 26 1.706 79.75– 94.91% 2 24|4 25 1.708 79.74– 94.92% 30 28 1.701 80.00– 94.61% 2 24|6 27 1.703 79.99– 94.62% 32 30 1.697 80.23– 94.34% 2 24|8 29 1.699 80.22– 94.35% 34 32 1.694 80.44– 94.10% 2 24|10 31 1.696 80.43– 94.11% 36 34 1.691 80.63– 93.88% 2 24|12 33 1.692 80.62– 93.88% 38 36 1.688 80.80– 93.68% 2 24|14 35 1.690 80.79– 93.68% 40 38 1.686 80.96– 93.49% 2 24|16 37 1.687 80.95– 93.50% 42 40 1.684 81.11– 93.32% 2 24|18 39 1.685 81.10– 93.33% 44 42 1.682 81.24– 93.16% 2 24|20 41 1.683 81.24– 93.17% 46 44 1.680 81.37– 93.02% 2 24|22 43 1.681 81.37– 93.02% 48 46 1.679 81.49– 92.88% 2 24|24 45 1.679 81.49– 92.88%

Here we have a Nastia-case: With 30 subjects we pass BE with the pooled model (80.00 – 94.61%) but fail with the group model (79.99 – 94.62%).

❝ Multicentral studies is the other case.

Correct. I would always recommend to include site-terms in the model.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Group by sequence interaction [General Sta­tis­tics]

Group by sequence interaction, an urban myth?

pristine, genuine, holy, magnificent, inexplicable beautiful variation

I love your subject line!

Group effect, did you miss it?

Group effect, did you miss it?

Group effect: the endless story

Group effect: the endless river

Group effect: the endless river

replicateBE solution with interactions

replicateBE solution with interactions

datasets issues

datasets

datasets

lmer / lme

What do you mean exactly?

What do you mean exactly?

ANOVA acc. to GL

Group by sequence interaction [General Statistics]