Mutasim
☆    

Jordan,
2019-08-07 15:16
(1946 d 11:04 ago)

Posting: # 20484
Views: 11,525
 

 Group by sequence interaction [General Statistics]

Dear Members,
What is the explanation for and the impact of having a significant group by sequence interaction in a study with 4 groups and 2 sequences and 4 periods (HVD)?
Helmut
★★★
Vienna, Austria,
2019-08-07 16:35
(1946 d 09:45 ago)

@ Mutasim
Posting: # 20485
Views: 10,109
 

 Group by sequence interaction, an urban myth?

Salam Mutasim,

❝ What is the explanation for …


Chance.

❝ … and the impact of having a significant group by sequence interaction in a study with 4 groups and 2 sequences and 4 periods (HVD).


If you are dealing with ABEL (the EMA’s method), chances that any (!) European agency will ask for a group term in the analysis are close to nil. When I gave my first presentation about it in Budapest in 2017 in front of European regulatory (!) statisticians, they asked in disbelief “What the hell are you talking about?” See what the BE-GL states:

The study should be designed in such a way that the formulation effect can be distinguished from other effects.
The precise model to be used for the analysis should be pre-specified in the protocol. The statistical analysis should take into account sources of variation that can be reasonably assumed to have an effect on the response variable.

Is it reasonable to assume that groups or sequences have an effect on the PK response? Heck, no way! Same inclusion/exclusion criteria and hence similar demographics, same clinical site, same bioanalytical method. Hundreds (thousands?) of studies performed in multiple groups and accepted by European agencies based on pooled data. OK, I know of two cases where the EMA (centralised procedure) asked for something similar. Both were complicated drugs (a biosimilar and a liposomal product) and multi-centric. Hence, the EMA asked for a site term.*
This is in line with what is stated in the Q&A document about Two-Stage Designs:

A model which also includes a term for a formulation*stage interaction would give equal weight to the two stages, even if the number of subjects in each stage is very different. The results can be very misleading hence such a model is not considered acceptable. […] this model assumes that the formulation effect is truly different in each stage. If such an assumption were true there is no single formulation effect that can be applied to the general population, and the estimate from the study has no real meaning.

Replace formulation*stage by formulation*group and you get the idea.

But all of this is about a group-by-treatment interaction. Sometimes in conventional 2×2×2 crossovers we see a significant sequence effect (more precisely: unequal carry-over). Since it can only be avoided by design (a sufficiently long washout), any test for it is futile. Check for possible residual concentrations in the higher period(s) and exclude subjects with pre-dose concentrations >5% of their Cmax values. By analogy, the same can be said about the group-by-sequence interaction. Occasionally you will see a significant result. Ignore it.
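
A minimal sketch of that pre-dose check (my own illustration, not part of the original post; object and column names are hypothetical):

# Flag subjects whose pre-dose concentration in a later period exceeds
# 5% of their individual Cmax (numbers below are made up for the example).
predose <- data.frame(subject   = c(1, 2, 3),
                      C.predose = c(0.00, 0.12, 1.45),
                      Cmax      = c(25.3, 21.8, 24.1))
predose$exclude <- with(predose, C.predose > 0.05 * Cmax)
predose # only subject 3 is flagged (1.45 > 0.05 × 24.1)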
Only if you prefer braces plus suspenders: Go with the FDA’s model II. But no pre-test and no interaction! The loss in power as compared to the pooled analysis is negligible.
Degrees of freedom in a 2×2×2 design:

Model III: n1 + n2 – 2
Model II:  n1 + n2 – (Ngroups – 1) – 2

Or in the 4-period full replicate:

Model III: 3(n1 + n2) – 4
Model II:  3(n1 + n2) – (Ngroups – 1) – 4

Am I right that you had to deal with 128 subjects? Then the 90% CI with model II will be just 0.002% (!) wider than the one with model III.
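
A rough sketch of that comparison (my own calculation, assuming the 2×2×4 full replicate with 128 subjects in four groups and a practically identical residual variance under both models, so that only the t-quantile changes):

# Relative widening of the 90% CI half-width (log scale) with model II
# compared to the pooled model III; assumptions as stated above.
N      <- 128
grps   <- 4
df.III <- 3 * N - 4              # pooled analysis
df.II  <- 3 * N - (grps - 1) - 4 # one df lost per additional group
100 * (qt(0.95, df.II) / qt(0.95, df.III) - 1) # ~0.002%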
What you never ever should do: Evaluate groups separately. Power will be terrible. Even if you are extremely lucky and one of them passes, what will you do? Submit only this one? Any agency will ask for the others as well. Guess the outcome.
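
To see how terrible, a quick sketch with PowerTOST (hypothetical numbers, borrowed from the simulation example further down the thread, not from an actual study):

library(PowerTOST)
# Conventional ABE in a 2x2x2 design, CV = 0.275, theta0 = 0.95:
power.TOST(CV = 0.275, theta0 = 0.95, n = 44, design = "2x2x2") # pooled: ~0.90
power.TOST(CV = 0.275, theta0 = 0.95, n = 12, design = "2x2x2") # one group: far lower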

If you are thinking about the FDA’s group models I–III: There were very few deficiency letters (all with the same text) issued to US and Canadian companies. On the other hand, many European CROs have such letters collecting dust in archives. Politics? Even if you follow this track, model III (pooling) is justified since the conditions are practically always fulfilled. The sequential procedure (test for a group-by-treatment interaction at the 10% level), like any pre-test, might inflate the type I error. I even have a statement from the EMA that such a procedure is not acceptable. Group-by-sequence interaction? Forget it.


  • Proposed all-fixed effects model:
    Site+Sequence+Sequence(Site)+Period(Site)+Subject(Sequence×Site)+Treatment

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
ElMaestro
★★★

Denmark,
2019-08-08 12:06
(1945 d 14:13 ago)

@ Helmut
Posting: # 20486
Views: 9,882
 

 pristine, genuine, holy, magnificent, inexplicable beautiful variation

Hi Hötzi,

❝ The precise model to be used for the analysis should be pre-specified in the protocol. The statistical analysis should take into account sources of variation that can be reasonably assumed to have an effect on the response variable.

❝ Is it reasonable to assume that groups or sequences have an effect on the PK response? Heck, no way! Same inclusion-/exclusion-criteria and hence, similar demographics, same clinical site, same bioanalytical method. Hundreds (thousands?) of studies performed in multiple groups and accepted by European agencies based on pooled data.


But on the other hand: why then include e.g. period?

Let us for a moment disregard the actual wording. I don't think this is about assuming that some factor has or hasn't an effect (nor about significance in the statistical sense). As I see it, I want to construct a CI which is as wide as my real uncertainty dictates. I start out with a whole bunch of ugly variation and, in the fashion of Michelangelo working on his crude blocks of marble, I chip parts and bits away from my bulk of variation by applying my model. What I have left is a chunk of pristine, genuine, holy, magnificent, beautiful variation. What a sight to behold :-), variation whose origin my experimental setup cannot account for and which I therefore use for my CI. My confidence interval is now exactly as wide as that uncertainty merits.

For a crossover this is of little practical importance since subjects are in groups. For a parallel trial I think I want group in the model. If regulators don't like this, they can ask me to take it away. I happily do so without protesting. I am a sheep at that point. But not until then. :-D

Pass or fail!
ElMaestro
Helmut
★★★
Vienna, Austria,
2019-08-08 12:31
(1945 d 13:49 ago)

@ ElMaestro
Posting: # 20487
Views: 9,900
 

 I love your subject line!

Ahoy, my Capt’n,

❝ ❝ The precise model [:blahblah:]


❝ But on the other hand: why then include e.g. period?


Because otherwise potential period effects would not cancel out. ;-)

Granted, sometimes one has to assume the absence of period effects and everybody is happy with that. If an originator explores whether the drug follows linear PK, we have a paired design (SD → saturation → steady state) and compare AUC0–τ with AUC0–∞. A crossover would be a logistical nightmare.

❝ Let us for a moment disregard the actual wording. [lengthy beautiful explanation]


Exactly.

❝ […] For a parallel trial I think I want group in the model. If regulators don't like this, they can ask me to take it away. I happily do so without protesting. I am a sheep at that point. But not until then. :-D


Agree again.

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Astea
★★  

Russia,
2019-12-21 15:12
(1810 d 10:07 ago)

@ Helmut
Posting: # 21011
Views: 9,125
 

 Group effect, did you miss it?

Dear Friends!

Recently I learned that Belorussian experts, referring to the EAEC rules, require recalculation of the results of large studies using the FDA's Model II, that is: Group, Sequence, Treatment, Period(Group), Group×Sequence as fixed and Subject(Group×Sequence) as random, even though it was not stated in the protocol beforehand.

I don't think it is a good idea. Moreover it could be contagious! What do you think about it?

Critical points: since the standard model requires Sequence, Treatment, Period and Subject(Sequence) as fixed terms, the results of the random-effects model will obviously be different. What if they change the overall result of BE? What if the Group×Sequence effect is significant by chance?

How to deal with excluded subjects? For example, if we have a subject with only one period the fixed-effect model would automatically neglect it while the random-effect model would use it.

As I understand it, it is currently impossible to calculate this in R, isn't it? Maybe Julia will help? So only people with commercial software could deal with it.

"Being in minority, even a minority of one, did not make you mad"
PharmCat
★    

Russia,
2019-12-22 02:37
(1809 d 22:42 ago)

@ Astea
Posting: # 21014
Views: 9,150
 

 Group effect, did you miss it?

Hello Astea!

❝ As I understand it, it is currently impossible to calculate this in R, isn't it? Maybe Julia will help? So only people with commercial software could deal with it.


I think any mixed model without replication (a repeated factor) can be fitted in R and in Julia without problems. If the variance components include a repeated factor (a structured R matrix), only special cases are available in free software.
Helmut
★★★
Vienna, Austria,
2019-12-22 11:30
(1809 d 13:49 ago)

@ Astea
Posting: # 21015
Views: 9,141
 

 Group effect: the endless story

Hi Nastia,

❝ Recently I learned that Belorussian experts, referring to the EAEC rules, require recalculation of the results of large studies using the FDA's Model II, that is: Group, Sequence, Treatment, Period(Group), Group×Sequence as fixed and Subject(Group×Sequence) as random, even though it was not stated in the protocol beforehand.


Regulators are always right. :-D

❝ I don't think it is a good idea. Moreover it could be contagious! What do you think about it?


Recalculating a study is never a good idea. The entire α has already been spent, right?

❝ Critical points: since the standard model requires Sequence, Treatment, Period and Subject(Sequence) as fixed terms, the results of the random-effects model will obviously be different.


Possible but only to a (very) small degree.

❝ What if they change the overall result of BE?


You mean that the fixed model passes and the mixed model fails? I cannot imagine that happening (see below).

❝ What if the Group×Sequence effect is significant by chance?


Not even the FDA asks for testing Group×Sequence in Model II.

❝ How to deal with excluded subjects? For example, if we have a subject with only one period the fixed-effect model would automatically neglect it while the random-effect model would use it.


Yep. On the other hand, the residual variance in the mixed model should not be larger than that of the fixed model. I had a few cases in the past where European regulators asked for recalculation according to the new GL (the mixed model was my standard before). The CI was sometimes narrower but differed only in the second decimal place…

❝ As I understand it, it is currently impossible to calculate this in R, isn't it?


I agree with PharmCat. Doable in R.

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Astea
★★  

Russia,
2019-12-22 22:23
(1809 d 02:57 ago)

@ Helmut
Posting: # 21018
Views: 9,182
 

 Group effect: the endless river

Dear Breathtakingly Smart People!

❝ I agree with PharmCat. Doable in R.


Please check my solution...
Below is my R-code:
library(readxl)
library(lmerTest)
options(contrasts=c("contr.treatment","contr.poly"))
Dataset <- read_excel("DS_01_group.xlsx", sheet = 1)
Model2 <- function(Dataset){
  Dataset$Formulation <- factor(Dataset$Formulation, levels = c("R","T"))
  Dataset$Sequence    <- factor(Dataset$Sequence, levels = c("TRTR", "RTRT"))
  Dataset$Period      <- factor(Dataset$Period)
  Dataset$Group       <- factor(Dataset$Group)
  mod.lmer <- lmer(log(Data) ~ Sequence + Period/Group + Formulation + Group +
                   Group*Sequence + (1|Subject/(Sequence)), data = Dataset)
  FormulationEffect.lmer <- summary(mod.lmer)$coefficients["FormulationT", "Estimate"]
  cat("\n", "PE is", round(exp(as.numeric(FormulationEffect.lmer))*100, digits = 8), "\n")
  FormulationT <- summary(mod.lmer)$coefficients["FormulationT", ]
  CI_Lr <- round(exp(FormulationT[1] + FormulationT[2]*qt(0.05, FormulationT[3]))*100, 4)
  CI_Ur <- round(exp(FormulationT[1] - FormulationT[2]*qt(0.05, FormulationT[3]))*100, 4)
  cat("\n", "CI is", CI_Lr, "-", CI_Ur, "\n")
}
Model2(Dataset)
 

In order to check whether it works properly, I used Data set 1 from the Q&A and added a group factor: the first 30 subjects are group #1 and the remaining ones group #2.

To compare the results I used the following SAS-code:
proc mixed data=SASuser.Dataset;
class Formulation Subject Period Sequence Group;
model PKlog= Sequence Formulation Period(Group) Group Sequence*Group;
random Subject(Sequence*Group);
estimate "test-ref" Formulation -1 1 / CL alpha=0.10;
run;

Since I wasn't sure that my code works well, I used a step-by-step approach and compared different models (they all may be meaningless by nature, but whatever):
1). Sequence Formulation Period as Fixed, Subject(Sequence) as Random
2). Sequence Formulation Period Group as Fixed, Subject(Sequence) as Random
3). Sequence Formulation Period(Group) Group as Fixed, Subject(Sequence) as Random
4). Sequence Formulation Period(Group) Group Sequence*Group as Fixed, Subject(Sequence) as Random
5). Sequence Formulation Period(Group) Group Sequence*Group as Fixed, Subject(Sequence*Group) as Random


The results are as follows:
 #      SAS PE   SAS 90% CI, L   SAS 90% CI, U       R PE     R 90% CI, L     R 90% CI, U
 1   115.7298        107.1707        124.9725   115.7298        107.1707        124.9725
 2   115.7279        107.1680        124.9716   115.7279        107.1679        124.9716
 3   115.7208        107.1080        125.0262   115.7208        107.1080        125.0262
 4   115.7275        107.1136        125.0340   115.7275        107.1136        125.0340
 5   115.7275        107.1136        125.0340         NA              NA              NA


And now, attention, the questions:
1). Looking the same does not prove being the same, does it?
2). By the way, this is exactly an example where the choice of the model changes the BE decision (of course, if scaling was not stated in the protocol). BE or not BE?
3). When I try to use Subject(Sequence*Group) as Random in R, it throws an error: "couldn't evaluate grouping factor (Sequence * Group):Subject within model frame: try adding grouping factor to data frame explicitly if possible". Is it right that we can simply neglect the structure of this term as we can do in standard models with Subject instead of Subject(Sequence)?

"Being in minority, even a minority of one, did not make you mad"
PharmCat
★    

Russia,
2019-12-22 23:52
(1809 d 01:28 ago)

(edited on 2019-12-23 17:04)
@ Astea
Posting: # 21019
Views: 9,191
 

 Group effect: the endless river

Hi!

❝ Is it right that we can simply neglect the structure of this term as we can do in standard models with Subject instead of Subject(Sequence)?


Edited.

You can use just "Subject" in the random statement in SAS. I think if you try this, you should get the same results. Of course, only if your subject numbers are not repeated.

Data:
DATA data;
infile datalines dlm=" " truncover;
input subject $ house $ room $ var;
datalines;
1 1 1 4
1 1 1 6
2 1 1 6
2 1 1 10
3 1 2 2
3 1 2 4
4 1 2 2
4 1 2 3
5 2 3 8
5 2 3 10
6 2 3 6
6 2 3 10
7 2 4 4
7 2 4 6
;
RUN;



Code 1

PROC MIXED data=data;
CLASSES subject house room;
MODEL  var = house room/ DDFM=SATTERTH solution;
RANDOM subject/V G;
ESTIMATE 'house' house -1 1/e;
RUN;


Fixed effects without denoting the nested factor: problems with the results, but the coefficient is calculated correctly.

Code 2

PROC MIXED data=data;
CLASSES subject house room;
MODEL  var = house room(house)/ DDFM=SATTERTH solution;
RANDOM subject/V G;
ESTIMATE 'house' house -1 1/e;
RUN;


Fixed effects with the nested factor: no problems, but Subjects in Dimensions = 1. Look at the horrible V matrix. In this case it does not matter, but it does not correspond to reality.

Code 3

PROC MIXED data=data;
CLASSES subject house room;
MODEL  var = house room(house)/ DDFM=SATTERTH solution;
RANDOM intercept/SUBJECT=subject V G;
ESTIMATE 'house' house -1 1/e;
RUN;


Good fit. Subjects in Dimensions = 8. We have individual V matrices and are ready for structuring.

Code 4

PROC MIXED data=data;
CLASSES subject house room;
MODEL  var = house room(house)/ DDFM=SATTERTH solution;
RANDOM intercept/SUBJECT=subject(room) V G;
ESTIMATE 'house' house -1 1/e;
RUN;


Nested random part - no effect.

I can't find any difference between subject and subject(room). As I understand it, nested factors affect the fixed part. I don't know how a nested statement can affect the random part, but I think in this case there is no effect.
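
A small R analogue of the toy example above (my own sketch, not from the post; only house is kept as a fixed effect for brevity): with unique subject codes, a random intercept for subject and one for the subject-within-room combination define the same grouping and hence give the same fit.

library(lme4)
toy <- data.frame(subject = factor(rep(1:7, each = 2)),
                  house   = factor(rep(c(1, 2), c(8, 6))),
                  room    = factor(rep(1:4, c(4, 4, 4, 2))),
                  var     = c(4, 6, 6, 10, 2, 4, 2, 3, 8, 10, 6, 10, 4, 6))
m1 <- lmer(var ~ house + (1 | subject),      data = toy) # SAS: SUBJECT=subject
m2 <- lmer(var ~ house + (1 | room:subject), data = toy) # SAS: SUBJECT=subject(room)
c(logLik(m1), logLik(m2)) # identical, since each subject sits in exactly one room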
mittyri
★★  

Russia,
2019-12-23 15:30
(1808 d 09:50 ago)

@ Astea
Posting: # 21020
Views: 9,031
 

 replicateBE solution with interactions

Dear Nastya,

whenever possible use simple solutions, at least until the moment you must dive into the mixed-models galaxy ;-)

And more: interaction terms are not the same in SAS and R.
Take a look at the method.B implementation inside the replicateBE package.
I just modified it a bit:

library(replicateBE)
library(nlme)
alpha <- 0.05
options(contrasts=c("contr.treatment","contr.poly"))
Dataset <- rds01
Dataset$group<- factor(ifelse(as.numeric(levels(Dataset$subject))[Dataset$subject]<31, 1,2))
modB <- lme(log(PK) ~ sequence + group + sequence:group + period + period%in%group + treatment, random = ~1 | subject, na.action = na.omit, data = Dataset)

EMA.B <- summary(modB)
PE <- EMA.B$tTable["treatmentT", "Value"] # exp(PE) = 1.157275
CI <- exp(PE + c(-1, +1) * qt(1 - alpha, EMA.B$tTable["treatmentT", "DF"]) * EMA.B$tTable["treatmentT", "Std.Error"])
# CI = 1.071136 1.250340
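
A side note on the interaction syntax (my own illustration): in an R formula sequence*group expands to the main effects plus the interaction, whereas in a SAS MODEL statement sequence*group is the interaction term only; that is why the formula above spells out sequence + group + sequence:group.

# ':' is the pure interaction in R, '*' expands to main effects + interaction:
f1 <- y ~ sequence + group + sequence:group
f2 <- y ~ sequence * group
attr(terms(f1), "term.labels") # "sequence" "group" "sequence:group"
attr(terms(f2), "term.labels") # "sequence" "group" "sequence:group"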

Kind regards,
Mittyri
Astea
★★  

Russia,
2019-12-23 18:26
(1808 d 06:53 ago)

@ mittyri
Posting: # 21022
Views: 8,989
 

 replicateBE solution with interactions

Dear mittyri!
Thanks a lot for your elegant solution!
Of course, replicateBE was the first thing that I've thought of, but I couldn't dig up the code for get.data in order to modify it for my needs. The question that I could not solve was extremely stupid: namely, what is the difference between Dataset<-rds01 and Dataset<-read_excel("rds01.xlsx", sheet = 1)? The initial dataset is the same, but the results of the algo are different. So before using we should somehow prepare the data, but how to do it? "I faced the wall...and did it my way" ;-)

"Being in minority, even a minority of one, did not make you mad"
mittyri
★★  

Russia,
2019-12-24 11:53
(1807 d 13:27 ago)

@ Astea
Posting: # 21023
Views: 8,960
 

 datasets issues

Dear Nastya,

❝ what is the difference between Dataset<-rds01 and Dataset<-read_excel("rds01.xlsx", sheet = 1)?

There could be a problem with the column types. Check:
Dataset1<-read_excel("rds01.xlsx", sheet = 1)
str(Dataset1)
Dataset2<-rds01
str(Dataset2)
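
My guess at the root cause (a sketch of my own, not from the post): read_excel() imports period as a numeric column, and a numeric period enters the model as a single linear covariate instead of separate per-period effects, which changes the fit; in rds01 it is already a factor.

# Numeric period: one slope. Factor period: k-1 separate effects (here 4 periods).
per <- c(1, 2, 3, 4)
colnames(model.matrix(~ per))         # "(Intercept)" "per"
colnames(model.matrix(~ factor(per))) # "(Intercept)" "factor(per)2" "factor(per)3" "factor(per)4"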

Kind regards,
Mittyri
Helmut
★★★
avatar
Homepage
Vienna, Austria,
2019-12-24 12:03
(1807 d 13:16 ago)

@ Astea
Posting: # 21024
Views: 9,001
 

 datasets

Hi Nastia,

❝ […] replicateBE was the first thing that I've thought of, but I couldn't dig up the code for get.data in order to modify it for my needs.


I would not try to use this function. It calls others which are not exported. Not worth the effort to modify it.

❝ […] what is the difference between Dataset<-rds01 and Dataset<-read_excel("rds01.xlsx", sheet = 1)? The initial dataset is the same, but the results of the algo are different. So before using we should somehow prepare the data, but how to do it?


Dunno how you accessed the datasets. Try this one:

library(replicateBE)
library(readxl)
library(nlme)
path <- "your path"
file <- "rds01.xlsx"
name <- paste0(path, "/", file)
DS1  <- as.data.frame(read_excel(path = name, sheet = 1,
                                 na = c("NA", "ND", ".", "", "Missing"),
                                 skip = 0, col_names = TRUE))
str(DS1) # show the structure: data.frame!
cols <- c("subject", "period", "sequence", "treatment")
DS1[cols] <- lapply(DS1[cols], factor) # factorize
DS2  <- rds01
str(DS2) # show the structure: named S3-object, factorized data.frame
# add the groups as factors
DS1$group <- factor(ifelse(as.numeric(levels(DS1$subject))[DS1$subject] < 31, 1, 2))
DS2$group <- factor(ifelse(as.numeric(levels(DS2$subject))[DS2$subject] < 31, 1, 2))
res <- data.frame(origin = c("Excel", "replicateBE"),
                  PE = NA, CL.lo = NA, CL.hi = NA)
ow <- options("contrasts") # save options
options(contrasts = c("contr.treatment", "contr.poly"))
on.exit(ow)                # reset options if an error occurs
for (j in 1:nrow(res)) {
  if (j == 1) data = DS1 else data <- DS2
  modB        <- lme(log(PK) ~ sequence + group + sequence:group + period +
                               period%in%group + treatment,
                               random = ~1 | subject,
                               na.action = na.omit, data = data)
  EMA.B       <- summary(modB)
  PE          <- EMA.B$tTable["treatmentT", "Value"]
  res[j, 3:4] <- 100*exp(PE + c(-1, +1) *
                         qt(1 - 0.05, EMA.B$tTable["treatmentT", "DF"]) *
                         EMA.B$tTable["treatmentT", "Std.Error"])
  res$PE[j]   <- 100*exp(PE)
}
print(res, row.names = FALSE) # should be identical

#      origin       PE    CL.lo   CL.hi
#       Excel 115.7275 107.1136 125.034
# replicateBE 115.7275 107.1136 125.034



Edit: Hey Mittyri, you were faster!

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Astea
★★  

Russia,
2019-12-24 20:18
(1807 d 05:02 ago)

@ Helmut
Posting: # 21029
Views: 8,920
 

 datasets

Dear Friends!

Thanks a lot for your inestimable help!

I finally found it. The key point was just adding the line Dataset$period<-factor(Dataset$period). Now everything works the same.

I still struggle to understand what is the difference between lmer and lme in this case...

"Being in minority, even a minority of one, did not make you mad"
Helmut
★★★
Vienna, Austria,
2019-12-25 20:12
(1806 d 05:08 ago)

@ Astea
Posting: # 21030
Views: 9,040
 

 lmer / lme

Hi Nastia,

❝ I still struggle to understand what is the difference between lmer and lme in this case...


The syntax of the formula for random effects is different.

mod <- lmer(logPK  ~ sequence + period + treatment +
                     (1|subject),
                     data = data)
mod <- lme(log(PK) ~ sequence + period + treatment,
                     random = ~1|subject,
                     data = data, na.action = na.omit)

Important: If you have incomplete data, in lme() you have to use na.action = na.omit because the default na.action = na.fail would stop and throw an error.

The method to extract the PE and calculate the CI is also different. Hint: Lines 53–56 and 80–84 of the source-code of method.B.R.
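
Not the replicateBE source, just a sketch of the extraction, assuming the two fits above are stored as mod.lmer and mod.lme (the object names are mine):

# lmerTest: coefficient table in $coefficients (columns Estimate, Std. Error, df, ...)
est <- summary(mod.lmer)$coefficients["treatmentT", ]
CI  <- 100 * exp(est["Estimate"] + c(-1, +1) * qt(0.95, est["df"]) * est["Std. Error"])
# nlme: coefficient table in $tTable (columns Value, Std.Error, DF, ...)
est <- summary(mod.lme)$tTable["treatmentT", ]
CI  <- 100 * exp(est["Value"] + c(-1, +1) * qt(0.95, est["DF"]) * est["Std.Error"])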

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Beholder
★    

Russia,
2019-12-27 14:44
(1804 d 10:36 ago)

@ Astea
Posting: # 21031
Views: 8,747
 

 What do you mean exactly?

Dear Astea!

❝ large studies


Saying "large studies" do you mean the multicentral BE studies or large amount of subjects, e.g. 70 subjects in one clinic divided into two groups?

Best regards
Beholder
Astea
★★  

Russia,
2019-12-29 22:59
(1802 d 02:21 ago)

@ Beholder
Posting: # 21035
Views: 8,656
 

 What do you mean exactly?

Dear Beholder!

❝ Saying "large studies" do you mean the multicentral BE studies or large amount of subjects, e.g. 70 subjects in one clinic divided into two groups?


I'm sorry for the wrong terminology. I meant studies in which volunteers were divided into two groups for capacity reasons. (Formally this can be true even for studies with 12 subjects if the clinical site is tiny.)

Multicentre studies are a different case.
Helmut
★★★
Vienna, Austria,
2019-12-30 13:32
(1801 d 11:47 ago)

@ Astea
Posting: # 21036
Views: 8,889
 

 ANOVA acc. to GL

Hi Nastia,

❝ I meant studies in which volunteers were divided into two groups for capacity reasons. (Formally this can be true even for studies with 12 subjects if the clinical site is tiny.)


Might also happen in mid-range CROs with drugs which require continuous cardiac monitoring. 12–16 beds is not unusual.

Nevertheless, I don’t understand why Belorussian experts ask for a mixed model because the guideline recommends ANOVA:

88. Comparison of the investigated pharmacokinetic parameters shall be performed by means of analysis of variance (ANOVA).

Fixed effects are specifically recommended (taking into account effects which can affect the response):

89. The statistical analysis shall take into account the sources of variability that can affect the response variable. In such an ANOVA model it is customary to use factors such as sequence, subject within sequence, period, and drug product. For all of these factors fixed rather than random effects shall be used.


:vomit: We discussed that ad nauseam. At least in the EU no regulatory statistician assumes that groups have an impact and expects a group-model. Pooled analysis rulez (see above).

However, why all that fuss? Let’s simulate 100,000 studies:

library(PowerTOST)
ngrp <- function(capacity, n) {
  # split sample size into >=2 groups based on capacity
  if (n <= capacity) { # make equal groups
    ngrp <- rep(ceiling(n/2), 2)
  } else {             # at least one = capacity
    ngrp    <- rep(0, ceiling(n / capacity))
    grps    <- length(ngrp)
    ngrp[1] <- capacity
    for (j in 2:grps) {
      n.tot <- sum(ngrp) # what we have so far
      if (n.tot + capacity <= n) {
        ngrp[j] <- capacity
      } else {
        ngrp[j] <- n - n.tot
      }
    }
  }
  return(ngrp = list(grps = length(ngrp), ngrp = ngrp))
}
CV            <- 0.275
target        <- 0.90
theta0        <- 0.95
design        <- "2x2x2"
capacity      <- 12

res           <- data.frame(N = NA, groups = NA, n.group = NA,
                            pwr.1 = NA, pwr.2 = NA, change = NA)
x             <- sampleN.TOST(CV = CV, theta0 = theta0, targetpower = target,
                              design = design, print = FALSE, details = FALSE)
res$N         <- x[["Sample size"]]
res$pwr.1     <- x[["Achieved power"]]
x             <- ngrp(capacity = capacity, n = res$N)
res$groups    <- x[["grps"]]
n.group       <- x[["ngrp"]]
res$pwr.2     <- power.TOST.sds(CV = CV, n = res$N, grps = res$groups,
                                ngrp = n.group, gmodel = 2, progress = FALSE)
res$n.group   <- paste(n.group, collapse = "|")
res$change    <- 100*(res$pwr.2 - res$pwr.1)/res$pwr.1
res[4:6]      <- signif(res[4:6], 4)
names(res)[3:5] <- c("n / group", "1", "2")
if (res$change < 0) {
  names(res)[6] <- "loss (%)"
} else {
  names(res)[6] <- "gain (%)"
}
res[6]        <- abs(res[6])
txt <- paste0("\nCV = ", CV, ", theta0 = ", theta0, ", targetpower = ",
              target, ",\ndesign = ", design, "; all effects fixed",
              "\npower: 1 = pooled data, 2 = group model\n\n")
cat(txt); print(res, row.names = FALSE)

CV = 0.275, theta0 = 0.95, targetpower = 0.9,
design = 2x2x2; all effects fixed
power: 1 = pooled data, 2 = group model

  N groups  n / group      1      2 loss (%)
 44      4 12|12|12|8 0.9006 0.9004  0.02293

The relative loss in power is practically negligible. Hence, go ahead with the group model to make the experts happy. Point out that a mixed model is against the GL.

But you are right that in rare borderline cases a study passing with the pooled analysis might fail with the group model due to the lower degrees of freedom.
\(df\text{(pooled model)}=N-2\)
\(df\text{(group model)}=\sum_{i=1}^{i=groups}n_i-(groups-1)-2\)
Of course, the larger the sample size and the smaller the number of groups, the smaller the impact.

CV            <- 0.275
theta0        <- 0.95
capacity      <- 12
var           <- CV2mse(CV)
N             <- seq(24, 48, 2)
res           <- data.frame(N = N, df.p = N - 2, tcrit.p = qt(1-0.05, N-2),
                            CI.p = NA, groups = NA, n.group = NA,
                            df.g = NA, tcrit.g = NA, CI.g = NA)
for (j in seq_along(N)) {
  CI             <- round(100*exp(log(theta0) + c(-1, +1) *
                                  res$tcrit.p[j] * sqrt(var/N[j])), 2)
  res$CI.p[j]    <- sprintf("%6.2f\u2013%6.2f%%", CI[1], CI[2])
  x              <- ngrp(capacity = capacity, n = N[j])
  res$groups[j]  <- x[["grps"]]
  n.group        <- x[["ngrp"]]
  res$df.g[j]    <- N[j] - (res$groups[j] - 1) - 2
  res$tcrit.g[j] <- qt(1-0.05, res$df.g[j])
  res$n.group[j] <- paste(n.group, collapse = "|")
  CI             <- round(100*exp(log(theta0) + c(-1, +1) *
                                  res$tcrit.g[j] * sqrt(var/N[j])), 2)
  res$CI.g[j]    <- sprintf("%6.2f\u2013%6.2f%%", CI[1], CI[2])
}
res[, c(3, 8)] <- signif(res[, c(3, 8)], 4)
names(res)[c(4, 9)] <- c("90% CI (pooled)", "90% CI (groups)")
print(res, row.names = FALSE)

  N df.p tcrit.p 90% CI (pooled) groups     n.group df.g tcrit.g 90% CI (groups)
 24   22   1.717   86.42–104.43%      2       12|12   21   1.721   86.40–104.45%
 26   24   1.711   86.77–104.01%      3     12|12|2   22   1.717   86.74–104.04%
 28   26   1.706   87.08–103.64%      3     12|12|4   24   1.711   87.06–103.67%
 30   28   1.701   87.36–103.31%      3     12|12|6   26   1.706   87.34–103.33%
 32   30   1.697   87.61–103.02%      3     12|12|8   28   1.701   87.59–103.04%
 34   32   1.694   87.83–102.75%      3    12|12|10   30   1.697   87.82–102.77%
 36   34   1.691   88.04–102.51%      3    12|12|12   32   1.694   88.03–102.52%
 38   36   1.688   88.23–102.29%      4  12|12|12|2   33   1.692   88.21–102.31%
 40   38   1.686   88.40–102.09%      4  12|12|12|4   35   1.690   88.39–102.11%
 42   40   1.684   88.56–101.90%      4  12|12|12|6   37   1.687   88.55–101.92%
 44   42   1.682   88.71–101.73%      4  12|12|12|8   39   1.685   88.70–101.74%
 46   44   1.680   88.85–101.57%      4 12|12|12|10   41   1.683   88.84–101.58%
 48   46   1.679   88.98–101.42%      4 12|12|12|12   43   1.681   88.98–101.43%

Same game but with a capacity of 24, theta0 0.87. Only two groups and hence, one df lost:

  N df.p tcrit.p 90% CI (pooled) groups n.group df.g tcrit.g 90% CI (groups)
 24   22   1.717   79.14– 95.64%      2   12|12   21   1.721   79.13– 95.65%
 26   24   1.711   79.46– 95.25%      2    24|2   23   1.714   79.45– 95.26%
 28   26   1.706   79.75– 94.91%      2    24|4   25   1.708   79.74– 94.92%
 30   28   1.701   80.00– 94.61%      2    24|6   27   1.703   79.99– 94.62%
 32   30   1.697   80.23– 94.34%      2    24|8   29   1.699   80.22– 94.35%
 34   32   1.694   80.44– 94.10%      2   24|10   31   1.696   80.43– 94.11%
 36   34   1.691   80.63– 93.88%      2   24|12   33   1.692   80.62– 93.88%
 38   36   1.688   80.80– 93.68%      2   24|14   35   1.690   80.79– 93.68%
 40   38   1.686   80.96– 93.49%      2   24|16   37   1.687   80.95– 93.50%
 42   40   1.684   81.11– 93.32%      2   24|18   39   1.685   81.10– 93.33%
 44   42   1.682   81.24– 93.16%      2   24|20   41   1.683   81.24– 93.17%
 46   44   1.680   81.37– 93.02%      2   24|22   43   1.681   81.37– 93.02%
 48   46   1.679   81.49– 92.88%      2   24|24   45   1.679   81.49– 92.88%

Here we have a Nastia-case: With 30 subjects we pass BE with the pooled model (80.00 – 94.61%) but fail with the group model (79.99 – 94.62%).

❝ Multicentre studies are a different case.


Correct. I would always recommend including site terms in the model.

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes