Bioequivalence and Bioavailability Forum

Helmut
★★★

Vienna, Austria,
2017-04-29 02:46
(2544 d 16:22 ago)

Posting: # 17278
Views: 33,218

Russian “experts” and their hobby [Regulatives / Guidelines]

Hi Artem,

concerning your question in the other thread:

❝ I need to calculate additional parameter in ANOVA - Cohort factor.

Oh, the hobby of the Russian “experts”…

❝ Then in the case of a EMA Model Specification is:

❝ sequence+subject(sequence)+period+treatment+cohort

❝ Am I right?

I’m afraid, no. The EMA does not specify a model. In the BE-GL we find only:

4.1.1 Study design
The study should be designed in such a way that the formulation effect can be distinguished from other effects.
4.1.8 Evaluation – Statistical analysis
The precise model to be used for the analysis should be pre-specified in the protocol. The statistical analysis should take into account sources of variation that can be reasonably assumed to have an effect on the response variable.

❝ And how Model Specification can be constructed for agencies recommending a mixed-effects model (FDA, Health Canada)?

You find the FDA’s models under the FOI and some members of the forum have a letter with the same wording. The FDA suggests three models (group instead of cohort):

1. Group, Sequence, Treatment, Subject (nested within Group × Sequence), Period (nested within Group), Group-by-Sequence Interaction, Group-by-Treatment Interaction.
   Subject (nested within Group × Sequence) is a random effect and all other effects are fixed effects. Note that unbiased intra-subject contrasts for the estimation of the treatment effect (and hence, a PE and its CI) cannot be obtained from this model. It serves only as a decision tool.
   - If the Group-by-Treatment interaction test is not statistically significant^a (p ≥0.1), only the Group-by-Treatment term can be dropped from the model. That means, pool the data and evaluate the study by model #2.
   - If the Group-by-Treatment interaction is statistically significant^a (p <0.1), equivalence has to be demonstrated in one of the groups, provided that the group meets minimum requirements for a complete bioequivalence study. That means, no pooling and evaluate the (largest) group only by model #3.
2. Group, Sequence, Treatment, Subject (nested within Group × Sequence), Period (nested within Group), Group-by-Sequence Interaction.
   Again, Subject (nested within Group × Sequence) is a random effect and all other effects are fixed effects.
   The model takes the multigroup nature of the study into account and is more conservative than the naïve pooled model (three degrees of freedom less than model #3).
3. Sequence, Treatment, Period, Subject (nested within Sequence).
   Surprise: Subject (nested within Sequence) is a random effect and all other effects are fixed effects.
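The decision scheme above can be written down in a few lines of R. This is only a minimal sketch: the function name `choose_fda_model` and the returned strings are made up for illustration; only the 0.1 level and the two outcomes come from the text.

```r
# Hypothetical helper (not part of any package): picks the evaluation model
# from the p-value of the Group-by-Treatment interaction found in model 1.
choose_fda_model <- function(p_GxT, level = 0.1) {
  stopifnot(is.numeric(p_GxT), length(p_GxT) == 1, p_GxT >= 0, p_GxT <= 1)
  if (p_GxT >= level) {
    # interaction not significant: drop the term and pool the groups
    "model 2: pooled data, Group-by-Treatment term dropped"
  } else {
    # interaction significant: no pooling allowed
    "model 3: no pooling, evaluate the (largest) group only"
  }
}

choose_fda_model(0.31374) # a non-significant p-value
choose_fda_model(0.04)    # a significant p-value
```

Note that p = 0.1 exactly counts as not significant, in line with the "p ≥0.1" wording.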

However, the FDA also states that the simple model #3 (of pooled data) can be applied if all of the following criteria are met:

- the clinical study takes place at one site,
- all study subjects have been recruited from the same enrollment pool,
- all of the subjects have similar demographics, and
- all enrolled subjects are randomly assigned to treatment groups at study outset.

I have no idea why the group effect is such a big deal in Russia. In practice the criteria for not using group terms are almost always fulfilled. The nasty thing is that the Group-by-Treatment interaction test has low power (therefore, testing at the 0.1 level). You should expect a false positive rate at the level of the test and trash some of your studies due to lacking power.^b Bizarre.

Since Russia follows in the EMA’s footsteps, treat subjects as fixed instead of random.^c The decision scheme (i.e., whether data can be pooled or analysis of the largest group is recommended) is applicable as well. It should be noted that in rare cases (e.g., extremely unbalanced sequences) the fixed effects model gives no solution and the mixed effects model has to be used.

In Phoenix/WinNonlin check the Partial Tests for model #1:
Column Hypothesis, row Group*Treatment and its P_value.

Example: CV of AUC 30% (no scaling allowed) but 4-period full replicate to allow scaling of C_max, GMR 0.90, target power 90% → sample size 54. Capacity of the clinical site 24 beds. Three options:
1. Equal group sizes (3×18).
2. Two groups with the maximum size (24) and the remaining one six.
3. One group 24, the remaining two as balanced as possible (16|14).

Let us assume that we are not allowed to pool (significant Group-by-Treatment interaction in model #1) and have to assess BE in the groups. Which powers can we expect?
1. 51% in all groups (n=18 each).
2. 62% in the two large groups (n=24 each).
3. 62% in the largest group (n=24).

Hence, I don’t think that equal group sizes in #1 are a good idea.
#2 looks better but what if one group passes and the other does not? If you cherry-pick and present only the passing one I bet that assessors will ask for the other one. What do you think they will conclude?
Therefore, I would suggest #3…

Setup of the models in Phoenix/WinNonlin (map Group as Classification):
1. Group+Sequence+Treatment+Group*Sequence+Group*Period+Group*Treatment+Subject(Group*Sequence)
2. Group+Sequence+Treatment+Group*Sequence+Group*Period+Subject(Group*Sequence)
3. Sequence+Treatment+Period+Subject(Sequence)
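The powers quoted for the three group options can be approximated in base R. The following is a sketch only, built on the noncentral t approximation of TOST power and assuming a design constant of 1 and 3n − 4 degrees of freedom for the 4-period full replicate. The function name `power_tost_nct` is invented for illustration; for exact numbers use `power.TOST()` of package PowerTOST.

```r
# Approximate TOST power via the noncentral t-distribution (sketch).
# Assumptions: log-normal data, BE limits 80-125%, alpha 0.05,
# 2x2x4 full replicate (design constant bk = 1, df = 3n - 4).
power_tost_nct <- function(CV, theta0, n, bk = 1, df = 3 * n - 4,
                           theta1 = 0.80, theta2 = 1.25, alpha = 0.05) {
  se   <- sqrt(log(CV^2 + 1)) * sqrt(bk / n)     # SE of the difference (log scale)
  tval <- qt(1 - alpha, df)                      # critical value of each one-sided test
  d1   <- (log(theta0) - log(theta1)) / se       # noncentrality, lower limit
  d2   <- (log(theta0) - log(theta2)) / se       # noncentrality, upper limit
  max(0, pt(-tval, df, ncp = d2) - pt(tval, df, ncp = d1))
}

n_groups  <- c(18, 24)                           # group sizes of the three options
pw        <- sapply(n_groups, function(n)
               power_tost_nct(CV = 0.30, theta0 = 0.90, n = n))
names(pw) <- paste0("n=", n_groups)
round(100 * pw)                                  # power in percent
```

The results should land close to the 51% (n=18) and 62% (n=24) given above; small deviations from an exact calculation are expected because of the approximation.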

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

mittyri
★★

Russia,
2017-04-30 00:57
(2543 d 18:12 ago)

@ Helmut
Posting: # 17283
Views: 30,238

Low power of Group-by-Treatment interaction

Hi Helmut!

Your opinion is very important for Russian BEBA amateurs, so I'm expecting your approach will be 'carved in Russian stone'. :ok:

It would be great if we could reach some consensus regarding models with a group term (until the moment when our experts change their mind or, probably, the rest of the world is convinced by the Russian experts). :-D

❝ The nasty thing is that the Group-by-Treatment interaction test has low power (therefore, testing at the 0.1 level). You should expect a false positive rate at the level of the test and trash some of your studies due to lacking power.

Could you please clarify this point? I saw many times the problem of power for Sequence term for simple model and Group-by-Treatment interaction for FDA model I. Is it possible to prove that with sims? Or somebody did this work analytically?

PS: I suspect a lot of fun with replicate designs. Your model specification with group (from Österreich with love :flower: ) works well even there, but it doesn't mean that this model is applicable for replicate designs (as we discussed elsewhere).

—
Kind regards,
Mittyri

Helmut
★★★

Vienna, Austria,
2017-04-30 15:54
(2543 d 03:14 ago)

@ mittyri
Posting: # 17284
Views: 30,300

Let’s forget the Group-by-Treatment interaction, please!

Hi mittyri,

❝ Your opinion is very important for Russian BEBAC amateurs, so I'm expecting your approach will be 'carved in Russian stone'

If they are following the forum (are they?) I want to make one point clear:

I do not advocate routinely using the group procedures of the FDA!
On the contrary, all criteria for not using them are usually fulfilled (i.e., the simple model of pooled data can be used).

I did so in dozens of studies without ever getting a single (‼) deficiency letter. And my CRO was just a tiny one… Many thousands of BE studies were accepted by a multitude of agencies without asking for an ‘analysis’ of the group effect.

I would say that the EMA accepts without reservation that the group effect “cannot be reasonably assumed to have an effect on the response variable.”

❝ ❝ The nasty thing is that the Group-by-Treatment interaction test has low power (therefore, testing at the 0.1 level). You should expect a false positive rate at the level of the test …

❝

❝ Could you please clarify this point? I saw many times the problem of power for Sequence term for simple model …

Senn¹ (who always strongly argued against testing the sequence – or better unequal carryover – effect!) writes:

Because the power of the test is low, being based on between-patient difference, a high nominal level of significance (usually 10%) is used.

An interesting statement by the EMA² concerning the treatment by covariate interaction:

The primary analysis should include only the covariates pre-specified in the protocol and no treatment by covariate interaction terms. […] Tests for interactions often lack statistical power and the absence of statistical evidence of an interaction is not evidence that there is no clinically relevant interaction. Conversely, an interaction cannot be considered as relevant on the sole basis of a significant test for interaction. Assessment of interaction terms based on statistical significance tests is therefore of little value [sic].

(my emphases)

❝ … and Group-by-Treatment interaction for FDA model I. Is it possible to prove that with sims? Or somebody did this work analytically?

Don’t know. I’m in contact with a Canadian CRO to collect empiric evidence (like D’Angelo et al.³ did for carryover). We will include only studies where groups were separated by just a couple of days and all of the FDA’s criteria for pooling were fulfilled. A great deal of work but seemingly ~⅒ of studies show a significant group-by-treatment interaction. :crying:

❝ I suspect a lot of fun with replicate designs. Your model specification with group […] works well even there, but it doesn't mean that this model is applicable for replicate designs (as we discussed elsewhere).

Yep.

1. Senn S. Crossover Trials in Clinical Research. Chichester: Wiley; 2nd ed. 2002. p. 58.
2. EMA. Guideline on adjustment for baseline covariates in clinical trials. London: 26 February 2015. EMA/CHMP/295050/2013.
3. D’Angelo G, Potvin D, Turgeon J. Carryover effects in bioequivalence studies. J Biopharm Stat. 2001; 11(1–2): 35–43. doi:10.1081/BIP-100104196.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

ElMaestro
★★★

Denmark,
2017-05-01 18:19
(2542 d 00:50 ago)

@ Helmut
Posting: # 17286
Views: 30,161

Let’s forget the Group-by-Treatment interaction, please!

Hi Hötzi and Mittyri,

this thread is interesting and confusing to me.
May I ask or comment for clarification:

M: "Is it possible to prove that with sims?" - what is it you want to prove? Can you formulate it plain and simple? Sims are totally possible, I just need to figure out the equations, as well as have a purpose. :-D

H: "It should be noted that in rare cases (e.g., extremely unbalanced sequences) the fixed effects model gives no solution and the mixed effects model has to be used." - a realistic linear model will have a single analytical solution unless you make a specification error. Imbalance would not affect that; please describe where/how you came across a fit which failed with the lm.

M+H: FDA are also fitting subject as fixed even when using the random statement in PROC GLM. Some of them just have not realised it :-)

H: "(...) seemingly ~⅒ of studies show a significant group-by-treatment interaction. " - this is expected by chance. You apply a 10% significance level. By chance 10% will then be significant.
(and by the way: Which denominator in F did you apply; within or between?)
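The "expected by chance" argument is easy to illustrate: if there is no true effect, p-values are uniformly distributed and about 10% of them fall below 0.1. A minimal sketch with a plain two-sample t-test standing in for the interaction test (sample size, seed, and number of simulations are arbitrary choices):

```r
set.seed(1)                                   # arbitrary seed
nsims <- 2000
# both samples come from the same distribution, i.e., the null is true
p <- replicate(nsims,
               t.test(rnorm(16), rnorm(16), var.equal = TRUE)$p.value)
rate <- mean(p < 0.1)                         # fraction 'significant' at the 10% level
round(100 * rate, 1)                          # should be close to 10
```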

—
Pass or fail!
ElMaestro

Helmut
★★★

Vienna, Austria,
2017-05-02 03:10
(2541 d 15:58 ago)

@ ElMaestro
Posting: # 17287
Views: 30,473

Some answers

Hi ElMaestro,

❝ M: "Is it possible to prove that with sims?" - what is it you want to prove? Can you formulate it plain and simple? Sims are totally possible, I just need to figure out the equations, as well as have a purpose. :-D

Not M but answering anyway.
The idea behind the Group-by-Treatment interaction is that the T/R in one group is different from the other (i.e., we have collinearity with a “hidden” variable). Therefore, simulate a group of subjects with T/R 0.95 and another one with T/R 0.95^–1 (CV ad libitum). Merge them to get a “study”. Run model 1 and check the p-value of the Group-by-Treatment interaction. With the simple model you should expect T/R 1.
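The recipe can be sketched in base R without any helper packages. Everything below is an illustrative assumption rather than the code behind the numbers in this thread: the helper names `sim_group` and `p_GxT`, the reference mean of 100, a between-subject SD equal to the within-subject SD, 8 subjects per sequence and group, and the small number of simulations. Model 1 is written with all effects fixed.

```r
set.seed(42)                                        # arbitrary seed

# simulate one group of a 2x2x2 crossover with a given T/R-ratio (hypothetical helper)
sim_group <- function(n_seq, theta, CV, group, first_id) {
  sW <- sqrt(log(CV^2 + 1))                         # within-subject SD (log scale)
  n  <- 2 * n_seq                                   # subjects in this group
  d  <- data.frame(subject  = rep(first_id + seq_len(n) - 1, each = 2),
                   group    = group,
                   sequence = rep(c("TR", "RT"), each = 2 * n_seq),
                   period   = rep(1:2, times = n))
  d$treatment <- ifelse((d$sequence == "TR") == (d$period == 1), "T", "R")
  b    <- rep(rnorm(n, 0, sW), each = 2)            # subject effects (assumed SD = sW)
  d$PK <- exp(log(100) + b + (d$treatment == "T") * log(theta) +
              rnorm(2 * n, 0, sW))
  d
}

# merge two groups to a 'study' and return p of the Group-by-Treatment interaction
p_GxT <- function(n_seq = 8, theta1 = 0.95, CV = 0.3) {
  study <- rbind(sim_group(n_seq, theta1,     CV, 1, 1),
                 sim_group(n_seq, 1 / theta1, CV, 2, 2 * n_seq + 1))
  for (v in c("subject", "group", "sequence", "period", "treatment"))
    study[[v]] <- factor(study[[v]])
  m1 <- lm(log(PK) ~ group + sequence + treatment + group*sequence +
                     group*period + group*treatment +
                     subject %in% group*sequence, data = study)
  anova(m1)["group:treatment", "Pr(>F)"]
}

nsims <- 300                                        # small, just to see the idea
p     <- replicate(nsims, p_GxT())
round(100 * mean(p < 0.1), 1)                       # % of studies with p(GxT) < 0.1
```

Setting `theta1 = 1` gives the case without an interaction, where roughly 10% of the simulated studies should be flagged.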

❝ H: "It should be noted that in rare cases (e.g., extremely unbalanced sequences) the fixed effects model gives no solution and the mixed effects model has to be used." - a realistic linear model will have a single analytical solution unless you make a specification error. Imbalance would not affect that; please describe where/how you came across a fit which failed with the lm.

I had one data set where the fixed effects model in Phoenix/WinNonlin showed me the finger. Same in JMP (“poor man’s SAS”). Have to check again.

❝ M+H: FDA are also fitting subject as fixed even when using the random statement in PROC GLM. Some of them just have not realised it :-)

True.

❝ H: "(...) seemingly ~10% of studies show a significant group-by-treatment interaction. " - this is expected by chance. You apply a 10% significance level. By chance 10% will then be significant.

Exactly. That’s the idea of assessing real studies. If there would be a true Group-by-Treatment interaction (i.e., not random alone) we could expect significant results in >10% of studies. This is what I have so far (I hope that the Canadians will come up with another ~100).

[image]

  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.0012  0.2693  0.5573  0.5206  0.7725  0.9925

[image]

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.00376 0.28731 0.47584 0.49187 0.71717 0.98837

86 studies (60 analytes), 85 data sets for AUC and 86 for C_max, sample sizes 15 to 74, two to four groups, median interval between groups three days. Significant Group-by-Treatment interaction in 8.24% (AUC) and 12.79% (C_max) of data sets. Hence, I guess it is a bloody myth.
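Whether the observed percentages are compatible with the nominal 10% can be checked with an exact binomial test. In this sketch the counts are back-calculated from the reported percentages (7/85 = 8.24% for AUC and 11/86 = 12.79% for C_max), which is an assumption; the test also treats the data sets as independent, which is a simplification since several belong to the same study.

```r
# counts of significant data sets, back-calculated from the percentages (assumption)
res <- list(AUC  = binom.test(7,  85, p = 0.1),
            Cmax = binom.test(11, 86, p = 0.1))
# observed rate, exact 95% CI, and p-value of H0: true rate = 10%
out <- sapply(res, function(x) c(observed = unname(x$estimate),
                                 lower    = x$conf.int[1],
                                 upper    = x$conf.int[2],
                                 p.value  = x$p.value))
round(out, 4)
```

If both confidence intervals include 0.10, the data give no evidence of more significant interactions than expected by chance.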

❝ (and by the way: Which denominator in F did you apply; within or between?)

Numerator DF = Groups – 1
Denominator DF = Subjects – 2 × Groups

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

ElMaestro
★★★

Denmark,
2017-05-02 11:04
(2541 d 08:05 ago)

@ Helmut
Posting: # 17288
Views: 30,034

Some answers

Hi Hötzi,

❝ The idea behind the Group-by-Treatment interaction is that the T/R in one group is different from the other (i.e., we have collinearity with a “hidden” variable). Therefore, simulate a group of subjects with T/R 0.95 and another one with T/R 0.95^–1 (CV ad libitum). Merge them to get a “study”. Run model 1 and check the p-value of the Group-by-Treatment interaction. With the simple model you should expect T/R 1.

Thanks for this.
This sounds reasonable (T/R=1, assuming equal group sizes).

Could you tell how you got your F-test denominator? I am sure you are right, but I don't know where it came from. For an interaction of a between- and within-factor I think the rule of thumb (which is also a wee bit hard to define :-D ) is to test against the within, which in this case would be the model residual.

—
Pass or fail!
ElMaestro

Helmut
★★★

Vienna, Austria,
2017-05-02 14:35
(2541 d 04:34 ago)

@ ElMaestro
Posting: # 17291
Views: 30,390

Example

Hi ElMaestro,

❝ Could you tell how you got your F-test denominator? I am sure you are right, but I don't know where it came from. For an interaction of a between- and within-factor I think the rule of thumb (which is also a wee bit hard to define :-D ) is to test against the within, which in this case would be the model residual.

Yep. Below an example of model 1 in Phoenix/WinNonlin. Two groups (n=24 each), all effects fixed.

Partial Sum of Squares
            Hypothesis DF        SS        MS   F_stat  P_value
---------------------------------------------------------------
                 Group  1 0.0131109 0.0131109 1.0385149 0.31374
              Sequence  1 0.0058638 0.0058638 0.4644731 0.49914
             Treatment  1 0.0011965 0.0011965 0.0947752 0.75964
        Group*Sequence  1 0.0108734 0.0108734 0.8612869 0.35844
          Group*Period  2 0.0160976 0.0080488 0.6375490 0.53340
       Group*Treatment  1 0.0131109 0.0131109 1.0385149 0.31374
Group*Sequence*Subject 44 0.555484  0.0126246 1         0.50000
                 Error 44 0.555484  0.0126246

Partial Tests of Model Effects
            Hypothesis Numer_DF Denom_DF  F_stat   P_value
----------------------------------------------------------
                 Group        1       44 1.0385149 0.31374
              Sequence        1       44 0.4644731 0.49914
             Treatment        1       44 0.0947752 0.75964
        Group*Sequence        1       44 0.8612869 0.35844
          Group*Period        2       44 0.6375490 0.53340
       Group*Treatment        1       44 1.0385149 0.31374
Group*Sequence*Subject       44       44 1         0.50000

N: Σn_G = 48
G: 2
Numerator DF: G – 1 = 1
Denominator DF: N - 2G = 44
F: 0.0131109/0.0126246 = 1.0385149

round(pf(1.0385149, 1, 44, lower.tail=FALSE), 5)

# [1] 0.31374

✔

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

mittyri
★★

Russia,
2017-05-02 20:29
(2540 d 22:40 ago)

@ Helmut
Posting: # 17294
Views: 29,891

Sensitivity of term?

Hi Helmut and ElMaestro,

Helmut answered the question directed to me more accurately than I could ;-)

seems to be reasonable, but I do not see why the power is low?
So if our hypothesis is that the power is low, we need to reject the H0 that the power is high; in other words, to prove that the sensitivity of this term to deviations is low.
By the way, if the power of this term is low, some other should be high, right? Which one? :confused:

❝ ❝ M+H: FDA are also fitting subject as fixed even when using the random statement in PROC GLM. Some of them just have not realised it :-)

❝

❝ True.

AFAIK PHX knows only one model where subject is fitted as a fixed term when placed in the variance structure: the conventional model. In all other cases LinMix will switch to mixed modeling.

❝ ❝ (and by the way: Which denominator in F did you apply; within or between?)

❝

❝ Numerator DF = Groups – 1

❝ Denominator DF = Subjects – 2 × Groups

Yes, the results are the same for complete data (mixed vs glm)

—
Kind regards,
Mittyri

Helmut
★★★

Vienna, Austria,
2017-05-05 16:38
(2538 d 02:30 ago)

@ mittyri
Posting: # 17305
Views: 30,106

Simulations

Hi mittyri,

❝ ❝ The idea behind the Group-by-Treatment interaction is that the T/R in one group is different from the other (i.e., we have collinearity with a “hidden” variable). Therefore, simulate a group of subjects with T/R 0.95 and another one with T/R 0.95^–1 (CV ad libitum). Merge them to get a “study”. Run model 1 and check the p-value of the Group-by-Treatment interaction. With the simple model you should expect T/R 1.

❝

❝ seems to be reasonable, but I do not see why the power is low?

Good question. Next question?

I performed simulations (100,000 2×2×2 studies each for conditions a. and b. specified below). Two groups of 16 subjects each, CV 30%, no period and sequence effects. 32 subjects should give power 81.52% for T/R 1. If the Group-by-Treatment interaction is not significant (p ≥0.1) in model 1, the respective study is evaluated by model 2 (pooled data) or both groups by model 3 otherwise. All studies are evaluated by model 3 (pooled data). The listed PE is the geometric mean of passing studies’ PEs.

a. T/R in group 1 0.95, T/R in group 2 0.95^–1
   (i.e., ‘true’ Group-by-Treatment interaction):

Model 1: p(G×T) <0.1 in 17.91% of studies.
Evaluation of studies with p(G×T) <0.1 (Groups):
  passed model 3 (1)      :  1.42% (of tested); PE  98.69%
                                  range of PEs: 92.45% to 107.63%
  passed model 3 (2)      :  1.64% (of tested); PE 100.99%
                                  range of PEs: 93.99% to 108.23%
  passed model 3 (1 and 2):  0.00% (of tested)
Evaluation of studies with p(G×T) ≥0.1 (pooled):
  passed model 2          : 66.47% (overall)
                            80.97% (of tested); PE  99.97%
                                  range of PEs: 86.36% to 114.27%
Studies passing any of model 2 or 3: 67.02%
Criteria for simple model fulfilled:
  passed model 3          : 80.95%;             PE  99.98%
                                  range of PEs: 86.36% to 114.68%

b. T/R in both groups 1.00
   (i.e., no Group-by-Treatment interaction):

Model 1: p(G×T) <0.1 in 9.79% of studies.
Evaluation of studies with p(G×T) <0.1 (Groups):
  passed model 3 (1)      :  1.86% (of tested); PE 100.28%
                                  range of PEs: 93.09% to 108.40%
  passed model 3 (2)      :  1.87% (of tested); PE 100.01%
                                  range of PEs: 92.18% to 108.41%
  passed model 3 (1 and 2):  0.00% (of tested)
Evaluation of studies with p(G×T) ≥0.1 (pooled):
  passed model 2          : 73.33% (overall)
                            81.28% (of tested); PE  99.98%
                                  range of PEs: 86.36% to 114.68%
Studies passing any of model 2 or 3: 73.69%
Criteria for simple model fulfilled:
  passed model 3          : 81.40%;             PE  99.98%
                                  range of PEs: 86.36% to 115.15%

IMHO, equal group sizes are problematic. What if one group passes and the other fails? Even if one is fishy and presents only the passing one, assessors likely would ask for the other one and make a conservative decision. Hoping that both groups will pass is simply futile.

Lessons learned:
- If we test at the 10% level and there is no true Group-by-Treatment interaction we will find a significant effect at ~ the level of the test – as expected (b). Hurray, false positives!
- On the other hand, if there is one, we will detect it (a).
- The percentage of studies passing in models 2 and 3 are similar. Theoretically in model 2 it should be slightly lower than in model 3 (one degree of freedom of the treatment effect less). However, overall power is seriously compromised.
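How the overall pass rate of the decision scheme comes about can be retraced from the reported percentages. A small arithmetic sketch for scenario a (T/R 0.95 and 0.95^–1); the variable names are arbitrary and the inputs are the rounded values from the output, so the last digit may differ slightly for other scenarios:

```r
sig       <- 0.1791  # studies with p(GxT) < 0.1
p2_tested <- 0.8097  # passing model 2, given pooling was allowed
p31       <- 0.0142  # group 1 passing model 3, given a significant interaction
p32       <- 0.0164  # group 2 passing model 3, given a significant interaction
simple    <- 0.8095  # all studies evaluated by the simple model 3

# pooled studies that pass plus the few single groups that pass
overall <- (1 - sig) * p2_tested + sig * (p31 + p32)
round(100 * overall, 2)            # reported: 67.02
round(100 * (simple - overall), 2) # percentage points of power lost
```

Nearly all of the loss comes from the studies flagged by the interaction test, since a single group of 16 subjects hardly ever passes on its own.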

Slowly I get the impression that the evaluation of groups (by model 3) is not a good idea. If there is a true Group-by-Treatment interaction why the heck should the PE (say in the largest group) be unbiased? I would rather say that if one believes that a Group-by-Treatment interaction really exists (I don’t) and the test makes sense (I don’t) evaluation (of the largest group) by model 3 should not be performed. Consequently ~⅒ of (otherwise passing) studies would go into the waste bin. Didn’t I say that before?

The distribution of p-values should be uniform.
Looks good for b.

[image]

     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
0.0000011 0.2517777 0.5002957 0.5008763 0.7508297 0.9999974

Interesting shape for a.

[image]

     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
0.0000001 0.1562932 0.3991516 0.4306846 0.6868190 0.9999981

If you prefer more extreme stuff: T/R in group 1 0.90, T/R in group 2 0.90^–1

Model 1: p(G×T) <0.1 in 40.35% of studies.
Evaluation of studies with p(G×T) <0.1 (Groups):
  passed model 3 (1)      :  1.09% (of tested); PE  98.76%
                                  range of PEs: 91.69% to 105.97%
  passed model 3 (2)      :  1.06% (of tested); PE 101.40%
                                  range of PEs: 94.58% to 108.34%
  passed model 3 (1 and 2):  0.00% (of tested)
Evaluation of studies with p(G×T) ≥0.1 (pooled):
  passed model 2          : 47.74% (overall)
                            80.03% (of tested); PE  99.98%
                                  range of PEs: 87.24% to 114.13%
Studies passing any of model 2 or 3: 48.60%
Criteria for simple model fulfilled:
  passed model 3          : 79.45%;             PE  99.99%
                                  range of PEs: 87.24% to 114.13%

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.00000 0.03962 0.15648 0.26602 0.42742 0.99997
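The simulated rejection rates of the interaction test (9.79%, 17.91%, and 40.35%) can be cross-checked analytically. The sketch below is a derivation under stated assumptions, not an established formula from the thread: two groups of equal size with balanced sequences, complete data, the interaction tested as a two-sided t-test of the difference of the groups' ln(T/R) against the residual with N − 2G degrees of freedom. The function name `power_GxT` is hypothetical.

```r
# Approximate power of the Group-by-Treatment test for two equal groups (sketch)
power_GxT <- function(theta1, n_group = 16, CV = 0.30, level = 0.1) {
  sW    <- sqrt(log(CV^2 + 1))            # within-subject SD (log scale)
  se_g  <- sW * sqrt(2 / n_group)         # SE of ln(T/R) in one group (2x2x2)
  se_d  <- sqrt(2) * se_g                 # SE of the difference between groups
  delta <- log(theta1) - log(1 / theta1)  # true difference: group 1 vs. group 2
  df    <- 2 * n_group - 4                # N - 2G with G = 2
  ncp   <- abs(delta) / se_d              # noncentrality parameter
  tcrit <- qt(1 - level / 2, df)
  pt(tcrit, df, ncp = ncp, lower.tail = FALSE) + pt(-tcrit, df, ncp = ncp)
}

pw <- sapply(c(1, 0.95, 0.90), power_GxT)
names(pw) <- c("T/R 1.00", "T/R 0.95", "T/R 0.90")
round(100 * pw, 2)                        # compare with the simulated percentages
```

With a T/R of 1 in both groups the function returns the level of the test, as it must. A difference as large as 0.90 vs. 0.90^–1 is flagged in well under half of the studies, which shows how insensitive the test is at these sample sizes.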

PS: The code seems to work – at least for the pooled model 3. Comparisons of powers

power.TOST(...)                0.815152
power.TOST.sim(..., nsims=1e5) 0.81437
power.TOST.sim(..., nsims=1e6) 0.815127
My code (nsims=1e5)            0.81402
My code (nsims=1e6)            0.81551

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

mittyri
★★

Russia,
2017-05-09 01:28
(2534 d 17:40 ago)

@ Helmut
Posting: # 17327
Views: 29,445

losing specificity due to low sensitivity

Hi Helmut,

you've made a great work! Won't it be published?
In your examples (simulations/practice) you showed the TxG test is not a good idea.
I was impressed by this:

❝ Model 1: p(G×T) <0.1 in 17.91% of studies.

❝ b. T/R in both groups 1.00

❝ (i.e., no Group-by-Treatment interaction):

❝ Model 1: p(G×T) <0.1 in 9.79% of studies.

❝ If you prefer more extreme stuff: T/R in group 1 0.90, T/R in group 2 0.90^–1

❝ Model 1: p(G×T) <0.1 in 40.35% of studies.

I see that the sensitivity is really low, but I think it is not a good idea to compensate it with low specificity (high false positive).

Once again, thank you very much! Wouldn't you mind to publish the code of data building for simulations?

—
Kind regards,
Mittyri

Helmut
★★★

Vienna, Austria,
2017-05-09 02:55
(2534 d 16:14 ago)

@ mittyri
Posting: # 17329
Views: 29,859

losing specificity due to low sensitivity

Hi mittyri,

❝ you've made a great work!

THX!

❝ Won't it be published?

I hope so. It is on my to-do-list since last summer…

❝ In your examples (simulations/practice) you showed the TxG test is not a good idea.

❝ I was impressed by this: […]

❝

❝ I see that the sensitivity is really low, but I think it is not a good idea to compensate it with low specificity (high false positive).

Right. I have no idea where this idea comes from. :confused:

❝ Wouldn't you mind to publish the code of data building for simulations?

I used code developed by Martin and me many years ago to simulate 2×2 designs followed by a simple rbind(part1, part2). I improved it to speed things up (which worked). Unfortunately I screwed up a couple of days ago (no version control, saving over). Shit.
In the meantime you can (mis)use the code Detlew distributed last year to simulate replicate designs. I duplicated his functions. Start with the RTRT|TRTR and set CV_wT = CV_wR = CV_bT = CV_bR. The code below shows the relevant changes after his example and how to sim. For the plot you have to attach package lattice.

library(PowerTOST) # provides CV2se(); mean_vcov() and prep_data() are Detlew's functions
library(lattice)   # needed for the plot
set.seed(123456)
G1 <- 0.95 # T/R in group 1
G2 <- 1/G1 # T/R in group 2
mvc1 <- mean_vcov(c("TRTR", "RTRT"), muR=log(100), ldiff=log(G1),
                  sWT=CV2se(0.3), sWR=CV2se(0.3),
                  sBT=CV2se(0.3), sBR=CV2se(0.3), rho=1)
mvc2 <- mean_vcov(c("TRTR", "RTRT"), muR=log(100), ldiff=log(G2),
                  sWT=CV2se(0.3), sWR=CV2se(0.3),
                  sBT=CV2se(0.3), sBR=CV2se(0.3), rho=1)
# get the data
ow    <- options()
nsims <- 1e4
sig   <- pass.2 <- pass.3 <- pass.3.1 <- pass.3.2 <- pass.3.a <- 0
PE2   <- PE3.1 <- PE3.2 <- PE3 <- p.GxT <- numeric(0)
alpha <- 0.05
p.level <- 0.1
L <- 80
U <- 125
sub.seq <- 8 # subjects / sequence (within each group)

for (j in 1:nsims) {
  part1 <- prep_data(seqs=c("TRTR", "RTRT"), rep(sub.seq, 2),
                     metric="PK", dec=3, mvc_list=mvc1)
  part1 <- part1[, !(names(part1) %in% c("seqno", "logval"))]
  part1 <- part1[!part1$period %in% c(3, 4), ]
  part1$sequence[part1$sequence == "TRTR"] <- "TR"
  part1$sequence[part1$sequence == "RTRT"] <- "RT"
  part1$group <- 1
  part2 <- prep_data(seqs=c("TRTR", "RTRT"), rep(sub.seq, 2),
                     metric="PK", dec=3, mvc_list=mvc2)
  part2 <- part2[, !(names(part2) %in% c("seqno", "logval"))]
  part2 <- part2[!part2$period %in% c(3, 4), ]
  part2$sequence[part2$sequence == "TRTR"] <- "TR"
  part2$sequence[part2$sequence == "RTRT"] <- "RT"
  part2$subject <- part2$subject+sub.seq*2
  part2$group <- 2
  study <- rbind(part1, part2)
  study$subject   <- factor(study$subject)
  study$period    <- factor(study$period)
  study$sequence  <- factor(study$sequence)
  study$treatment <- factor(study$treatment)
  study$group     <- factor(study$group)
  options(contrasts=c("contr.treatment", "contr.poly"), digits=12)
  # model 1 of pooled data
  model1 <- lm(log(PK)~group+sequence+treatment+group*sequence+
                       group*period+group*treatment+subject%in%group*sequence,
                       data=study)
  p.GxT[j] <- anova(model1)[["group:treatment", "Pr(>F)"]]

  if (p.GxT[j] >= p.level) { # if no sign. interaction: model 2 of pooled data
    model2 <- lm(log(PK)~group+sequence+treatment+group*sequence+
                         group*period+subject%in%group*sequence,
                         data=study)
    CI2 <- round(100*exp(confint(model2, "treatmentT", level=1-2*alpha)), 2)
    if (CI2[1] >= L & CI2[2] <= U) {
      pass.2 <- pass.2 + 1 # count passing studies
      PE2[pass.2] <- as.numeric(exp(coef(model2)["treatmentT"]))
    }
  } else { # sign. interaction (otherwise): model 3 of both groups
    sig <- sig + 1 # count studies with significant interaction
    # first group (use part1 data)
    model3.1 <- lm(log(PK)~sequence+treatment+period+subject%in%sequence,
                           data=part1)
    CI3.1 <- round(100*exp(confint(model3.1, "treatmentT", level=1-2*alpha)), 2)
    if (CI3.1[1] >= L & CI3.1[2] <= U) {
      pass.3.1 <- pass.3.1 + 1 # count passing studies
      PE3.1[pass.3.1] <- exp(coef(model3.1)["treatmentT"])
    }
    # second group (use part2 data)
    model3.2 <- lm(log(PK)~sequence+treatment+period+subject%in%sequence,
                           data=part2)
    CI3.2 <- round(100*exp(confint(model3.2, "treatmentT", level=1-2*alpha)), 2)
    if (CI3.2[1] >= L & CI3.2[2] <= U) {
      pass.3.2 <- pass.3.2 + 1 # count passing studies
      PE3.2[pass.3.2] <- as.numeric(exp(coef(model3.2)["treatmentT"]))
    }
    # check whether /both/ groups pass (haha)
    if ((CI3.1[1] >= L & CI3.1[2] <= U) &
        (CI3.2[1] >= L & CI3.2[2] <= U)) pass.3.a <- pass.3.a + 1
  }
  # model 3 of pooled data (simple 2x2x2 crossover)
  model3 <- lm(log(PK)~sequence+treatment+period+subject%in%sequence,
                       data=study)
  CI3 <- round(100*exp(confint(model3, "treatmentT", level=1-2*alpha)), 2)
  if (CI3[1] >= L & CI3[2] <= U) {
    pass.3 <- pass.3 + 1 # count passing studies
    PE3[pass.3] <- as.numeric(exp(coef(model3)["treatmentT"]))
  }
} # end of sim loop

PE2est <- prod(PE2)^(1/length(PE2)) # geom. mean of PEs (passing with model 2)
if (length(PE3.1) > 0) { # geom. mean of PEs (passing with model 3; group 1)
  PE3.1est <- prod(PE3.1)^(1/length(PE3.1))
} else {
  PE3.1est <- NA
}
if (length(PE3.2) > 0) { # geom. mean of PEs (passing with model 3; group 2)
  PE3.2est <- prod(PE3.2)^(1/length(PE3.2)) # exponent uses the length of PE3.2 (not PE3.1)
} else {
  PE3.2est <- NA
}
PE3est <- prod(PE3)^(1/length(PE3)) # geom. mean of PEs (passing with model 3)
options(ow) # restore options

x <- c(0.25, 0.75)
y <- as.numeric(quantile(p.GxT, probs=x))
b <- diff(y)/diff(x)
a <- y[2]-b*x[2]

numsig <- length(which(p.GxT < p.level))
MajorInterval <- 5 # interval for major ticks
MinorInterval <- 4 # interval within major
Major <- seq(0, 1, 1/MajorInterval)
Minor <- seq(0, 1, 1/(MajorInterval*MinorInterval))
labl  <- sprintf("%.1f", Major)
ks    <- ks.test(x=p.GxT, y="punif", 0, 1)
if (G1 != G2) {
  main <- list(label=paste0("\nSimulation of \'true\' interaction\nT/R (G1) ",
                            G1, ", T/R (G2) ", round(G2, 4)), cex=0.9)
} else {
  main <- list(label=paste0("\nSimulation of no interaction\nT/R (G1, G2) ",
                            G1), cex=0.9)
}
if (ks$p.value == 0) {
  sub <- list(label=sprintf("Kolmogorov-Smirnov test: p <%1.5g",
              .Machine$double.eps), cex=0.8)
} else {
  sub <- list(label=sprintf("Kolmogorov-Smirnov test: p %1.5g",
              ks$p.value), cex=0.8)
}
trellis.par.set(layout.widths=list(right.padding=5))
qqmath(p.GxT, distribution=qunif,
  prepanel=NULL,
  panel=function(x) {
    panel.grid(h=-1, v=-1, lty=3)
    panel.abline(h=p.level, lty=2)
    panel.abline(c(0, 1), col="lightgray")
    panel.abline(a=a, b=b)
    panel.qqmath(x, distribution=qunif, col="blue", pch=46) },
  scales=list(x=list(at=Major), y=list(at=Major),
              tck=c(1, 0), labels=labl, cex=0.9),
  xlab="uniform [0, 1] quantiles",
  ylab="p (Group-by-Treatment Interaction)",
  main=main, sub=sub, min=0, max=1)
trellis.focus("panel", 1, 1, clip.off=TRUE)
panel.axis("bottom", check.overlap=TRUE, outside=TRUE, labels=FALSE,
           tck=0.5, at=Minor)
panel.axis("left", check.overlap=TRUE, outside=TRUE, labels=FALSE,
           tck=0.5, at=Minor)
panel.polygon(c(0, 0, numsig/nsims, numsig/nsims, 0),
              c(0, rep(p.level, 2), 0, 0), lwd=1, border="red")
trellis.unfocus()

cat("Model 1: p(G\u00D7T) <0.1 in", sprintf("%.2f%%", 100*sig/nsims),

"of studies.",

"\nEvaluation of studies with p(G\u00D7T) <0.1 (Groups):",

"\n  passed model 3 (1)      :",

  sprintf("%5.2f%%", 100*pass.3.1/sig), "(of tested); PE",

  sprintf("%6.2f%%", 100*PE3.1est),

"\n                                  range of PEs:",

  sprintf("%5.2f%% to %6.2f%%", 100*range(PE3.1)[1], 100*range(PE3.1)[2]),

"\n  passed model 3 (2)      :",

  sprintf("%5.2f%%", 100*pass.3.2/sig), "(of tested); PE",

  sprintf("%6.2f%%", 100*PE3.2est),

"\n                                  range of PEs:",

  sprintf("%5.2f%% to %6.2f%%", 100*range(PE3.2)[1], 100*range(PE3.2)[2]),

"\n  passed model 3 (1 and 2):",

  sprintf("%5.2f%%", 100*pass.3.a/sig), "(of tested)",

"\nEvaluation of studies with p(G\u00D7T) \u22650.1 (pooled):",

"\n  passed model 2          :",

  sprintf("%5.2f%%", 100*pass.2/nsims), "(overall)",

"\n                           ",

  sprintf("%5.2f%%", 100*pass.2/(nsims-sig)), "(of tested); PE",

  sprintf("%6.2f%%", 100*PE2est),

"\n                                  range of PEs:",

  sprintf("%5.2f%% to %6.2f%%", 100*range(PE2)[1], 100*range(PE2)[2]),

"\nStudies passing any of model 2 or 3:",

  sprintf("%5.2f%%", 100*(pass.3.1/nsims+pass.3.2/nsims+pass.2/nsims)),

"\nCriteria for simple model fulfilled:",

"\n  passed model 3          :",

  sprintf("%5.2f%%;             PE", 100*pass.3/nsims),

  sprintf("%6.2f%%", 100*PE3est),

"\n                                  range of PEs:",

  sprintf("%5.2f%% to %6.2f%%", 100*range(PE3)[1], 100*range(PE3)[2]), "\n")

round(summary(p.GxT), 7)

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Helmut
★★★

Vienna, Austria,
2017-05-06 19:31
(2536 d 23:37 ago)

@ mittyri
Posting: # 17310
Views: 29,914

Loss in power


Hi mittyri and all,

R code to estimate the loss in power (two groups of equal size). The example uses the settings of my simulations above and assumes that a significant Group-by-Treatment interaction occurs with a probability equal to the level of the test.

library(PowerTOST)
CV          <- 0.3
theta0      <- 1
targetpower <- 0.8
p.level     <- 0.1
res         <- sampleN.TOST(CV=CV, theta0=theta0, targetpower=targetpower,
                            print=FALSE)
N           <- res[["Sample size"]]
if (N >= 12) { # at least minimum sample size acc. to GLs?
  pwr.mod3pooled <- res[["Achieved power"]]
} else {
  N <- 12
  pwr.mod3pooled <- power.TOST(CV=CV, theta0=theta0, n=N)
}
pwr.mod2       <- suppressMessages(power.TOST(CV=CV, theta0=theta0,
                                              n=N-1)*(1-p.level))
pwr.mod3groups <- power.TOST(CV=CV, theta0=theta0, n=N/2)*p.level
pwr.mod2and3   <- pwr.mod2+pwr.mod3groups
cat(sprintf("CV %5.3f%%, theta0 %.4f, targetpower %.2f%% : sample size %i",
            100*CV, theta0, 100*targetpower, N),
    "\nLevel of the G\u00D7T test (model 1)                  :",
      sprintf("% 6.4f", p.level),
    "\nPower of studies evaluated by model 2 (pooled)   :",
      sprintf("%5.2f%%", 100*pwr.mod2),
    "\nPower of studies evaluated by model 3 (groups)   :",
      sprintf("%5.2f%%", 100*pwr.mod3groups),
    "\nModel 2 (pooled) and 3 (groups) combined         :",
      sprintf("%5.2f%%", 100*pwr.mod2and3),
    "\nPower of studies evaluated by model 3 (pooled)   :",
      sprintf("%5.2f%%", 100*pwr.mod3pooled),
    "\nLoss in power if simple model 3 cannot be applied:",
      sprintf("%5.2f%%", 100*(pwr.mod3pooled-pwr.mod2and3)), "\n")

Gives

CV 30.000%, theta0 1.0000, targetpower 80.00% : sample size 32
Level of the G×T test (model 1)                  :  0.1000
Power of studies evaluated by model 2 (pooled)   : 71.80%
Power of studies evaluated by model 3 (groups)   :  3.25%
Model 2 (pooled) and 3 (groups) combined         : 75.05%
Power of studies evaluated by model 3 (pooled)   : 81.52%
Loss in power if simple model 3 cannot be applied:  6.47%

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Helmut
★★★

Vienna, Austria,
2017-05-08 21:02
(2534 d 22:07 ago)

@ mittyri
Posting: # 17321
Views: 29,845

Interval between groups


Hi mittyri and all,

the p-value of the Group-by-Treatment interaction seemingly does not depend on the interval between groups. In 51% of the studies the interval was three days or less but in some substantially longer (i.e., steady state studies where the clinic was occupied). The bubbles’ area, the linear regression, and the loess curve are scaled/weighted by the sample size.

[image]

slope: 0.011196 (p 0.165)

[image]

slope: 0.010589 (p 0.186)

Therefore, we should not worry. The FDA defines “greatly separated in time” as “months apart, for example”.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


mittyri
★★

Russia,
2017-05-09 01:40
(2534 d 17:29 ago)

@ Helmut
Posting: # 17328
Views: 29,528

IMP handling


Hi Helmut,

I suppose the problem could be not in the case of 'significant separation in time' but in case of some mistakes in IMP handling.
For example, RIMP has a proven stability up to 30C and TIMP up to 25 only. Due to some "why bother" attitude the designated employee missed it. As a result the second group will be treated with poor TIMP. I assume here that the order of groups treatment is
GR1PER1
GR1PER2
GR2PER1
GR2PER2
The CRO's are usually mixing the time for groups for more effective time management.

So I think in case of appropriate IMP handling we wouldn't observe any real (not false-positive) interaction.
Please correct me if I'm wrong here.

—
Kind regards,
Mittyri

Helmut
★★★

Vienna, Austria,
2017-05-09 03:08
(2534 d 16:00 ago)

@ mittyri
Posting: # 17330
Views: 29,985

IMP handling


Hi mittyri,

❝ I suppose the problem could be not in the case of 'significant separation in time' but in case of some mistakes in IMP handling.

❝ For example, RIMP has a proven stability up to 30C and TIMP up to 25 only. Due to some "why bother" attitude the designated employee missed it. As a result the second group will be treated with poor TIMP. I assume here that the order of groups treatment is

❝ GR1PER1

❝ GR1PER2

❝ GR2PER1

❝ GR2PER2

That’s what I would call a “stacked approach”.
IMHO, not a good idea for single dose but might be necessary in steady state studies if the capacity of the clinical site is limited.

❝ The CRO's are usually mixing the time for groups for more effective time management.

Yep – the “staggered approach” keeps the interval as short as possible.

60% of my data sets had an interval of less than seven days. In most of my single dose studies the interval was one to three days.

❝ So I think in case of appropriate IMP handling we wouldn't observe any real (not false-positive) interaction.

Agree.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Helmut
★★★

Vienna, Austria,
2017-05-14 19:22
(2528 d 23:46 ago)

@ mittyri
Posting: # 17352
Views: 29,287

Loss in power


Hi mittyri,

continuing the evaluation of the data sets of this post.
Background: Some studies are quite dated (the oldest performed in October 1992!). In those days a pre-specified acceptance range of 75.00–133.33% (or even 70.00–142.86%) was acceptable for Cmax. However, I evaluated all data sets for the common 80.00–125.00%. This explains why more than the expected 20% failed (no, I didn’t screw up the design ;-)). Whenever possible I tried to avoid equal group sizes and kept one group as large as possible. Didn’t succeed sometimes. Working in a CRO, the sponsor is always right…
All data sets were evaluated by model 3 for pooled data (like in the reports – I never cared about groups) and by model 1 to get the p value of the Group-by-Treatment interaction.

If p (G×T) ≥0.1 the pooled data was evaluated by model 2.
If p (G×T) <0.1 the largest group(s) were evaluated by model 3.
If there was more than one largest group (i.e., groups of equal size), all of them had to pass since I expect assessors will ask for it.
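The three rules above can be written down as a small helper. This is only a sketch of the decision step: the function name `choose.evaluation`, its arguments, and the example p-values and group sizes are hypothetical and not taken from the original evaluation scripts.

```r
# Hypothetical helper: which model and which data are used, given the
# p-value of the Group-by-Treatment test (model 1) and the group sizes.
choose.evaluation <- function(p.GxT, n.groups, p.level = 0.1) {
  if (p.GxT >= p.level) {
    # not significant: pool all groups and evaluate by model 2
    return(list(model = 2, data = "pooled", groups = seq_along(n.groups)))
  }
  # significant: evaluate the largest group by model 3;
  # ties are kept, i.e., all equally large groups have to pass
  largest <- which(n.groups == max(n.groups))
  list(model = 3, data = "largest group(s)", groups = largest)
}

choose.evaluation(p.GxT = 0.533, n.groups = c(20, 18, 18)) # pooled, model 2
choose.evaluation(p.GxT = 0.042, n.groups = c(16, 16))     # both groups, model 3
```

With unequal group sizes only one index is returned in the significant case, which is the reason for avoiding equal group sizes mentioned above.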

Here the results:

86 studies, 60 analytes, data sets: 85 (AUC), 86 (Cmax).
Evaluated by model 1 (all effects fixed); p (G×T) <0.1:
  AUC :  8.24% ( 7/85)
  Cmax: 12.79% (11/86)
Summary of passing results.
AUC :
  model 2 (pooled)                           : 84.62% (66/78)
  model 2 (pooled without pre-test)          : 84.71% (72/85)
    loss (compared to pooled model 3)        :  1.18% ( 1/85)
  model 3 (largest group)                    : 85.71% ( 6/ 7)
  model 2 (pooled) or model 3 (largest group): 84.71% (72/85)
    loss (compared to pooled model 3)        :  1.18% ( 1/85)
  model 3 (pooled)                           : 85.88% (73/85)
  CV (range)                                 : 21.36% (4.59–61.73%)
Cmax:
  model 2 (pooled)                           : 62.67% (47/75)
  model 2 (pooled without pre-test)          : 63.95% (55/86)
    loss (compared to pooled model 3)        :  0.00% ( 0/86)
  model 3 (largest group)                    : 27.27% ( 3/11)
  model 2 (pooled) or model 3 (largest group): 58.14% (50/86)
    loss (compared to pooled model 3)        :  5.81% ( 5/86)
  model 3 (pooled)                           : 63.95% (55/86)
  CV (range)                                 : 27.88% (6.82–76.99%)

The loss in power if we follow the FDA’s procedure (compared to the pooled model 3) is lower than I expected. Surprise. A possible explanation is that studies were usually powered for Cmax. Therefore, already the largest groups passed AUC.
On another note: If we apply model 2 without a pre-test (maybe the best way to go for regulators insisting on a group term) the loss in power compared to the pooled model 3 is negligible. Reasonable, since we lose only a few residual degrees of freedom:
pooled model 3: DF = n₁ + n₂ – 2
pooled model 2: DF = n₁ + n₂ – (N_groups – 1) – 2
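To see how little the extra group terms cost, the two DF formulas can be evaluated directly and compared via the critical t-value of the 90% CI. A minimal sketch in base R; the function names and the group sizes (two groups of 16) are assumptions for illustration, and the formulas are generalized to any number of groups by summing the group sizes.

```r
# Residual degrees of freedom of a 2x2x2 crossover, n = vector of group sizes
df.model3 <- function(n) sum(n) - 2                    # conventional model, pooled
df.model2 <- function(n) sum(n) - (length(n) - 1) - 2  # with group terms, pooled

n     <- c(16, 16) # hypothetical group sizes
alpha <- 0.05      # 90% confidence interval
res   <- data.frame(model = c("3 (pooled)", "2 (pooled)"),
                    df    = c(df.model3(n), df.model2(n)))
res$t.crit <- qt(1 - alpha, df = res$df) # multiplier of the SE in the CI
print(res, row.names = FALSE)
```

With two groups model 2 has one degree of freedom less, so the critical t-value (and hence the width of the CI for a given residual variance) increases only marginally.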

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Helmut
★★★

Vienna, Austria,
2017-05-25 17:26
(2518 d 01:43 ago)

@ ElMaestro
Posting: # 17418
Views: 29,693

No convergence in JMP and Phoenix WinNonlin


Hi ElMaestro,

❝ ❝ It should be noted that in rare cases (e.g., extremely unbalanced sequences) the fixed effects model gives no solution and the mixed effects model has to be used.

❝

❝ a realistic linear model will have a single analytical solution unless you make a specification error. Imbalance would not affect that, please describe where/how you came across a fit which failed with the lm.

I was right about failing in JMP and Phoenix WinNonlin. ;-)

Sorry I can’t disclose the data set. Naïve pooling was performed. Deficiency letter by the MHRA in summer 2016:

The applicant should present estimates and 95% confidence interval for the difference between the Test and the Reference product on a ratio scale from ANOVA model, that reflects the design of the study, with terms for Group, Sequence, Sequence * Group, Subject (Sequence * Group), Period (Group), Treatment as fixed effects.

Note that this is the FDA’s model 2 with fixed effects. Why the 95% CI instead of the 90% CI was required is another story. The data set (subjects fixed) did not converge in JMP. Switched to random and all was good. Was accepted by the MHRA’s assessor.
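For readers who want to try such an all-fixed specification in R, here is a minimal sketch on simulated data. Everything about the data is an assumption for illustration (group sizes, variances, a true T/R ratio of 0.95, column names); it is not the confidential data set. The order of terms follows the R ANOVA tables in this post.

```r
set.seed(123456)
n.g  <- c(12, 16)                       # hypothetical group sizes
N    <- sum(n.g)
subj <- data.frame(
  subject  = factor(seq_len(N)),
  group    = factor(rep(seq_along(n.g), times = n.g)),
  sequence = factor(unlist(lapply(n.g, function(n)
               rep(c("RT", "TR"), length.out = n)))))
subj$eta <- rnorm(N, sd = 0.4)          # between-subject effects
d <- merge(subj, data.frame(period = 1:2)) # Cartesian product: 2 rows/subject
d$treatment <- factor(substr(as.character(d$sequence), d$period, d$period),
                      levels = c("R", "T"))
d$logCmax <- log(100) + d$eta + log(0.95) * (d$treatment == "T") +
             rnorm(nrow(d), sd = 0.25)  # within-subject error
d$period  <- factor(d$period)

# Group, Sequence, Treatment, Period(Group), Group x Sequence,
# Subject(Group x Sequence); all effects fixed
model2 <- lm(logCmax ~ group + sequence + treatment + group:period +
                       group:sequence + group:sequence:subject, data = d)
est <- coef(model2)[["treatmentT"]]
se  <- summary(model2)$coefficients["treatmentT", "Std. Error"]
df  <- df.residual(model2)
CI  <- exp(est + c(-1, +1) * qt(1 - 0.05, df) * se) # 90% CI of T/R
cat(sprintf("PE %.2f%%, 90%% CI %.2f%% to %.2f%% (DF %i)\n",
            100 * exp(est), 100 * CI[1], 100 * CI[2], df))
```

The design matrix is heavily over-parameterized (many empty subject cells), so `lm()` reports `NA` for the aliased coefficients. The treatment coefficient is not affected, and the residual degrees of freedom come out as N – (number of groups – 1) – 2.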

Phoenix showed me the finger with the fixed effect Subject(Sequence*Group) in Model 1

[image]

and execution stopped (no results at all).
In model 2 I got the same warning as above but these results:

Partial Sum of Squares
Hypothesis              DF  SS         MS         F_stat    P_value
---------------------------------------------------------------------------
Group                    2  0.0758837  0.0379418  3.48232   0.0381
Sequence                 1  0.0708455  0.0708455  6.50224   0.0138
Group*Sequence           2  0.145263   0.0726313  6.66614   0.0026
Sequence*Group*Subject  50  7.67886    0.153577   14.0954   0.0000
Group*Period             3  0.0111135  0.0037045  0.340001  0.7965
Treatment                1  0.144129   0.144129   13.2283   0.0006
Error                   52  0.566569   0.0108956

Partial Tests of Model Effects
Hypothesis              Numer_DF  Denom_DF  F_stat    P_value
--------------------------------------------------------------
Group                    2        52        3.48232   0.0381
Sequence                 1        52        6.50224   0.0138
Group*Sequence           2        52        6.66614   0.0026
Sequence*Group*Subject  50        52        14.0954   0.0000
Group*Period             3        52        0.340001  0.7965
Treatment                1        52        13.2283   0.0006

End of the story. No LSMs. Hence, no difference, no CI…

No problems in R.
Model 1:

Analysis of Variance Table

Response: log(Cmax)
                       Df   Sum Sq    Mean Sq  F value     Pr(>F)
group                   2 0.078270 0.03913490  3.54171 0.03643331 *
sequence                1 0.073106 0.07310604  6.61611 0.01312377 *
treatment               1 0.141465 0.14146461 12.80257 0.00078035 ***
group:period            3 0.011114 0.00370450  0.33526 0.79988452
group:sequence          2 0.145263 0.07263128  6.57314 0.00292116 **
group:treatment         2 0.014083 0.00704174  0.63728 0.53296911
group:sequence:subject 50 7.678856 0.15357712 13.89875 < 2.22e-16 ***
Residuals              50 0.552485 0.01104970

Model 2:

Analysis of Variance Table

Response: log(Cmax)
                       Df   Sum Sq    Mean Sq  F value     Pr(>F)
group                   2 0.078270 0.03913490  3.59182 0.03458136 *
sequence                1 0.073106 0.07310604  6.70971 0.01241215 *
treatment               1 0.141465 0.14146461 12.98370 0.00070289 ***
group:period            3 0.011114 0.00370450  0.34000 0.79647341
group:sequence          2 0.145263 0.07263128  6.66614 0.00264706 **
group:sequence:subject 50 7.678856 0.15357712 14.09540 < 2.22e-16 ***
Residuals              52 0.566569 0.01089555

Diving deeper into it. Originally I set up the models in Phoenix WinNonlin’s Bioequivalence module, which sits on top of Linear Mixed Effects. When I sent the data directly to Linear Mixed Effects (all fixed): no error, no warning, nada. The CI was identical to the one from R to 12 significant digits.
Conclusion: Bug in Phoenix WinNonlin’s Bioequivalence module.

[image]

BTW: Running model 1 of my 85 data sets (5,004 subjects) in Bioequivalence takes more than ten hours and sucks up almost my entire 16 GB RAM (memory leak?). Direct execution in Linear Mixed Effects takes five minutes (max. RAM consumption 175 MB).
Much slower than R, which takes five seconds for model 1, model 2, model 3 (for each group), and model 3 (pooled).

I was wrong. Has nothing to do with unbalanced sequences and/or unequal group sizes.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


ElMaestro
★★★

Denmark,
2017-05-25 18:24
(2518 d 00:44 ago)

@ Helmut
Posting: # 17419
Views: 28,591

Ouch?!???


Hi Hötzi,

I am not a WNL/Phoenix user, but if your post is correct then I imagine around 100 CROs as of today will need to change their software validation status from PQ'ed to "unknown" etc, contact the vendor, await a response and in the meantime do everything they can to study the potential impact on data generated (which they cannot necessarily do unless they have other 'validated' software that achieves the same)?
Only the software developer, who has the source, will be able to tell if there is a bug, and if there is, if the bug affects other models, and how/when. This is not good. Man, I don't even quite know what validated means anymore.

—
Pass or fail!
ElMaestro

Artem Gusev
☆

Russia, Moscow,
2017-05-02 18:13
(2541 d 00:56 ago)

@ Helmut
Posting: # 17292
Views: 29,864

Russian «Экспертами» and their hobby


Hi, Helmut, Mittyri and ELMaestro!

Thanks for a lot of useful information!

I've made some calculations with the models from your post. BE study with 44 subjects (2 groups of 22 subjects each) in Phoenix 6.4 (Formulation=Treatment, FF = Cmax/AUC).

Step 1.

Model Fixed Effects: Group+Sequence+Formulation+Group*Sequence+Group*Period+Group*Formulation.
Model Random Effects: Subject(Group*Sequence).

Partial Test:

Dependent     Hypothesis          Numer_DF   Denom_DF     F_stat        P_value

Ln(Cmax)      Group*Formulation       1      39.658597    0.027673298   0.86872445

Ln(AUClast)   Group*Formulation       1      39.208625    0.5880992     0.44774829

Ln(FF)        Group*Formulation       1      40.02772     0.064521326   0.80078777

P_value is good, but is it normal that the DF are not integers?

Step 2.

Model Fixed Effects: Group+Sequence+Formulation+Group*Sequence+Group*Period.
Model Random Effects: Subject(Group*Sequence).

Partial Test:

Dependent    Hypothesis       Numer_DF  Denom_DF     F_stat       P_value



Ln(Cmax)     int              1         40.921725    11484.381    0

Ln(Cmax)     Group            1         40.93824     0.88167054   0.35325184

Ln(Cmax)     Sequence         1         40.921725    0.18361757   0.67052981

Ln(Cmax)     Formulation      1         40.672472    0.046689365  0.83000761

Ln(Cmax)     Group*Sequence   1         40.93824     0.25766245   0.61445462

Ln(Cmax)     Group*Period     2         40.655533    0.94380823   0.39750275

Ln(AUClast)  int              1         41.090066    8381.629     0

Ln(AUClast)  Group            1         41.094871    1.1907365    0.28153775

Ln(AUClast)  Sequence         1         41.090066    1.4454616    0.23614044

Ln(AUClast)  Formulation      1         40.206255    2.9963913    0.091118919

Ln(AUClast)  Group*Sequence   1         41.094871    0.65923753   0.42150764

Ln(AUClast)  Group*Period     2         40.202418    0.41157679   0.66536478

Ln(FF)       int              1         40.650657    139.60377    1E-14

Ln(FF)       Group            1         40.662987    0.54340035   0.46525932

Ln(FF)       Sequence         1         40.650657    3.1149598    0.085088655

Ln(FF)       Formulation      1         40.989423    1.0744673    0.30601523

Ln(FF)       Group*Sequence   1         40.662987    0.75424068   0.39023396

Ln(FF)       Group*Period     2         40.969735    1.9475433    0.15560361

P_value is also acceptable, but is it normal that the DF here are also not integers?

—
Best Regards,
Artem

mittyri
★★

Russia,
2017-05-02 19:53
(2540 d 23:16 ago)

@ Artem Gusev
Posting: # 17293
Views: 30,152

be careful with mixed models


Hi Artem,

Welcome to the world of mixed modeling!
Helmut suggested (see endnote c) to switch to the model with all effects as fixed. Your model is the same as suggested by FDA (with Subject as random). So Phoenix switched to the mixed model and used Satterthwaite degrees of freedom (which need not be integers).
Note that it is impossible to provide ANOVA tables (partial/sequential ss output PHX-speaking) for mixed models.
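Why Satterthwaite's degrees of freedom are generally not integers is easiest to see in the simplest setting where the approximation is used, Welch's t-test for two samples with unequal variances. The sketch below is base R with simulated data; it only illustrates the approximation itself and is not how Phoenix computes the DF of a mixed model. The function name `welch.df` and all numbers are assumptions.

```r
set.seed(42)
x <- rnorm(12, mean = 0, sd = 1) # hypothetical sample 1
y <- rnorm(20, mean = 0, sd = 2) # hypothetical sample 2

# Welch-Satterthwaite approximation of the degrees of freedom
welch.df <- function(x, y) {
  vx <- var(x) / length(x)       # squared standard error of mean(x)
  vy <- var(y) / length(y)       # squared standard error of mean(y)
  (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
}

df.manual <- welch.df(x, y)
df.ttest  <- unname(t.test(x, y, var.equal = FALSE)$parameter)
cat("Satterthwaite DF (manual):", df.manual,
    "\nSatterthwaite DF (t.test):", df.ttest, "\n")
```

The DF are a ratio of variance estimates and therefore depend on the data; they fall between the smaller of the two group DF and the pooled DF. With all effects fixed there is only one error term, nothing has to be approximated, and the DF are integers again.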

So due to no convention in Russian brains you are free to use your current model or the model provided by Helmut (with all effects fixed). Latter is more suitable for me due to hazelnut brain.

PS: are there any incomplete data (missed period) for some subjects?

—
Kind regards,
Mittyri

Artem Gusev
☆

Russia, Moscow,
2017-05-03 13:02
(2540 d 06:07 ago)

@ mittyri
Posting: # 17300
Views: 30,142

be careful with mixed models


Hi, Mittyri!

I've tried some fixed models after posting the previous reply. The situation with the DF has improved.
I also checked the data; it's fine.
It was strange for me because the standard Phoenix model (Fixed: PRD, TRT, SEQ; Random: Subj(SEQ)) gives integer DF on the same dataset. Now it is clearer, so nvm.

The deeper you are into the statistics, the more terrible it becomes.

Thanks for help.

—
Best Regards,
Artem

Helmut
★★★

Vienna, Austria,
2017-05-05 16:48
(2538 d 02:20 ago)

@ Artem Gusev
Posting: # 17306
Views: 29,878

p-value(s) in model 2


Hi Artem,

as mittyri already pointed out you should use fixed effect models in Phoenix/WinNonlin.

❝ Step 2.

❝ P_value is also acceptable, …

There are no “acceptable” p-values in model 2. Any one is just fine. Only in model 1 check the p-value of the Group-by-Treatment interaction.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Helmut
★★★

Vienna, Austria,
2017-05-24 22:17
(2518 d 20:51 ago)

@ Helmut
Posting: # 17408
Views: 29,045

Russian «Экспертами» following the EEU GLs


Hi Artem and all,

❝ Oh, the hobby of the Russian «Экспертами»…

I have to correct myself. They are blindly following guidelines of the Eurasian Economic Union (Nov 2016, Dec 2015). Practically a 1:1 translation of the FDA’s guidance:

Studies in multiple groups

94. If a study was performed in two or more groups and these groups were studied at different clinical sites, or at the same site but separated by a long interval of time (e.g., months), doubts arise as to whether the results obtained in these groups can be combined in a single analysis. Such situations should be discussed with the competent authority.
If a study is planned to be performed in several groups for logistical reasons, this should be stated explicitly in the study protocol; if the report does not contain the results of a statistical analysis that accounts for the multi-group nature of the study, a scientific justification for the absence of such results should be provided.
93. If a crossover study is performed in two or more groups of subjects, i.e., the entire sample is split into several groups, each of which starts the study on a different day (e.g., if for logistical reasons only a limited number of subjects can be studied at the clinical site at one time), the statistical model should be modified to reflect the multi-group nature of the study. In particular, the model should reflect the fact that the periods for the first group differ from the periods for the second (and subsequent) groups.

Does any of our Russian members know whether the above was ever accepted? This section talks about giving in the report a justification for not performing such an analysis. Has anybody ever tried to give the justification already in the protocol? If yes, what happened? If no, why not?

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Beholder
★

Russia,
2017-05-25 00:37
(2518 d 18:31 ago)

@ Helmut
Posting: # 17410
Views: 28,667

Russian «Экспертами» following the EEU GLs


Hello Helmut!

❝ Does any of our Russian members know whether the above was ever accepted? This section talks about giving in the report a justification for not performing such an analysis. Has anybody ever tried to give the justification already in the protocol? If yes, what happened? If no, why not?

You are citing a very recent document, I think. It came into force on the 6th of May. So, strictly speaking, it was not obligatory to follow the draft in clinical trials conducted before the 6th of May. I think nobody used it and no experience has been gathered yet.

But I would try it))

If Im not mistaken, you wrote something about such algorithm somewhere in forum but I could not find it.

—
Best regards
Beholder

mittyri
★★

Russia,
2017-05-25 10:52
(2518 d 08:17 ago)

@ Beholder
Posting: # 17412
Views: 28,763

Penalty for carelessness


Dear Helmut, Dear Beholder,

@Helmut

❝ ❝ Does any of our Russian members know whether the above was ever accepted? This section talks about giving in the report a justification for not performing such an analysis. Has anybody ever tried to give the justification already in the protocol? If yes, what happened? If no, why not?

I'd name the group effect as a 'penalty for carelessness'. After some hot discussions last week I understood that's what experts are waiting for since they do not want to dial back.
So some time ago (about 3 years ago) 'group' trend appeared in their mind. The experts asked after reports submission: group? group? group?
On the stage of request on report it was almost impossible to justify the model without groups. By the way now when this topic is very popular, the team who's developing the protocol should include the justification regarding absence of group effect in the model. Otherwise 'groupshot' is very likely.

@Beholder

❝ If Im not mistaken, you wrote something about such algorithm somewhere in forum but I could not find it.

Here you go

Edit: Changed to internal link; see also this post #7. [Helmut]

—
Kind regards,
Mittyri

Beholder
★

Russia,
2017-05-25 12:43
(2518 d 06:25 ago)

@ Beholder
Posting: # 17413
Views: 28,667

Russian «Экспертами» following the EEU GLs

❝ If Im not mistaken, you wrote something about such algorithm somewhere in forum but I could not find it.

yes, found post regarding the EEU GL, which I mentioned.

Edit: Changed to internal link; see also this post #7. [Helmut]

—
Best regards
Beholder

Mikalai
★

Belarus,
2018-01-04 11:43
(2294 d 06:26 ago)

@ Helmut
Posting: # 18138
Views: 26,078

Russian «Экспертами» following the EEU GLs

Post reply

Dear all
My name is Mikalai, and I am responsible for the conduct of bioequivalence studies in a medium-sized private pharmaceutical company in Belarus. Due to logistic issues (a small clinical center and a highly variable drug) we have to conduct a bioequivalence study in multiple groups (two). Our competent authority requires a justification not to include the group effect in the proposed statistical model. The groups will be separated by a week at maximum. It seems that we meet criteria set out by FDA to use a statistical model without including the group effect. Our competent authority can accept the FDA position on this issue, but we should properly reference it.

Thus, where can I find this information under FOI (link) or might it be possible that someone can share a copy of letter signed by Barbara Davit where it is outlined requirements to ignore the group effect in a statistical model?

Any help will be appreciated.
Sincerely, Mikalai

Edit: I moved your post from an answer to this one, deleted your email address, and activated personal messages in your profile instead. [Helmut]

Helmut
★★★

Vienna, Austria,
2018-01-04 14:08
(2294 d 04:01 ago)

@ Mikalai
Posting: # 18139
Views: 25,916

Belarus = member of the EEU


Hi Mikalai,

❝ Due to logistic issues (a small clinical center and a highly variable drug) we have to conduct a bioequivalence study in multiple groups (two).

Are you aiming at reference-scaling for C_max, i.e., perform the study in a replicate design? Even if not, opt for the “staggered approach” – not the “stacked” one (see above).

❝ Our competent authority requires a justification not to include the group effect in the proposed statistical model.

IMHO, stupid – but according to the GL. :-(

❝ The groups will be separated by a week at maximum.

Very good.

❝ It seems that we meet criteria set out by FDA to use a statistical model without including the group effect. Our competent authority can accept the FDA position on this issue, but we should properly reference it.

See this presentation summarizing my current thinking. Note that (since I don’t speak Russian) my remarks given on slide 16 are only partly correct. A justification in the protocol (as you rightly mentioned) should be acceptable. In the discussion following my presentation it became clear that:

Pooling (i.e., model III without a justification in the protocol) led to rejection of the study. A justification in the report only was never considered sufficient by – Russian – experts.
Nobody tried model II without a pre-test (this would be a much better option than the FDA’s step-wise models). Why? Duno. No inflation of the Type I Error and the loss in power would be limited (see my small meta-analysis above) and would be compliant with the FDA’s 2001 guidance Section VII.A.:
- […] the statistical model should be modified to reflect the multigroup nature of the study. In particular, the model should reflect the fact that the periods for the first group are different from the periods for the second group.

❝ Thus, where can I find this information under FOI (link) …

FDA’s step-wise models (which I would never ever use) are given here and there (maybe there are some more; too lazy to google). However, in the second document you find Comment 9:

If ALL of the following criteria are met, it may not be necessary to include Group-by-Treatment in the statistical model:

the clinical study takes place at one site;
all study subjects have been recruited from the same enrollment pool;
all of the subjects have similar demographics;
all enrolled subjects are randomly assigned to treatment groups at study outset.

In this latter case, the appropriate statistical model would include only the factors Sequence, Period, Treatment and Subject (nested within Sequence).

Note that “the appropriate statistical model” in this latter case is the conventional model for a 2×2×2 crossover.
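For completeness, the conventional model with the factors listed in Comment 9 (Sequence, Period, Treatment, Subject nested within Sequence) could be fitted in R as sketched here. The data are simulated; the sample size of 24, the variances, the true T/R ratio of 0.95, and the column names are assumptions for illustration only.

```r
set.seed(2018)
N    <- 24                              # hypothetical sample size
subj <- data.frame(subject  = factor(seq_len(N)),
                   sequence = factor(rep(c("RT", "TR"), each = N / 2)))
subj$eta <- rnorm(N, sd = 0.4)          # between-subject effects
d <- merge(subj, data.frame(period = 1:2)) # Cartesian product: 2 rows/subject
d$treatment <- factor(substr(as.character(d$sequence), d$period, d$period),
                      levels = c("R", "T"))
d$logCmax <- log(100) + d$eta + log(0.95) * (d$treatment == "T") +
             rnorm(nrow(d), sd = 0.25)  # within-subject error
d$period  <- factor(d$period)

# sequence:subject codes Subject nested within Sequence; all effects fixed
model3 <- lm(logCmax ~ sequence + sequence:subject + period + treatment,
             data = d)
est <- coef(model3)[["treatmentT"]]
se  <- summary(model3)$coefficients["treatmentT", "Std. Error"]
df  <- df.residual(model3)              # N - 2
CI  <- exp(est + c(-1, +1) * qt(1 - 0.05, df) * se) # 90% CI of T/R
cat(sprintf("PE %.2f%%, 90%% CI %.2f%% to %.2f%% (DF %i)\n",
            100 * exp(est), 100 * CI[1], 100 * CI[2], df))
```

No group term appears anywhere, so the residual degrees of freedom are N – 2, as for any 2×2×2 crossover evaluated by the standard ANOVA.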

❝ … might it be possible that someone can share a copy of letter signed by Barbara Davit where it is outlined requirements to ignore the group effect in a statistical model?

I can’t share mine (chained to my table by a CDA). Maybe Detlew can share his. Note that the wording of Barbara’s letter is identical to the second reference above.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Mikalai
★

Belarus,
2018-01-04 20:49
(2293 d 21:20 ago)

@ Helmut
Posting: # 18142
Views: 25,871

Belarus = member of the EEU


Dear Helmut,
Thank you very much.

Yes, we plan to use the "staggered approach". We used to plan the replicate design, but a large dropout was observed during our last bioequivalence study (crossover design but with a long blood sampling interval) for reasons completely unrelated to the study. Thus, the company decided to put the replicate design on hold, given that the clinical center can accommodate only a bit more than 35 volunteers.

We are opting for model III and will try to use the FDA requirements to justify the statistical model in the protocol. Bioequivalence studies done in accordance with international standards are relatively new for Belarus. As a result, experience, training, and lore are issues for the major players (manufacturers, regulators, clinical investigators). We also cannot run studies outside the Eurasian Economic Union. Thus, from time to time our regulators have to rely on the opinions of more experienced colleagues, mainly from EMA and FDA. If they see that something is acceptable in Europe or the USA, they usually give us a green light. That is why we need papers or proper references.
Regards,
Mikalai.

mittyri
★★

Russia,
2018-01-04 23:04
(2293 d 19:04 ago)

@ Helmut
Posting: # 18143
Views: 25,637

Trying your model for EEU

Dear Helmut,

You are Key Opinion Leader in Russia (not limited to Russia I believe)

❝ • Nobody tried model II without a pre-test (this would be a much better option than the FDA’s step-wise models). Why? Duno.

Why do you think that nobody has tried?
Talked to some involved guys, they are trying and waiting for experts feedback

—
Kind regards,
Mittyri

Helmut
★★★

Vienna, Austria,
2018-01-05 01:06
(2293 d 17:02 ago)

@ mittyri
Posting: # 18145
Views: 25,655

Trying your model for EEU


Hi Mittyri,

❝ You are Key Opinion Leader in Russia (not limited to Russia I believe)

Hhm.

❝ ❝ • Nobody tried model II without a pre-test (this would be a much better option than the FDA’s step-wise models). Why? Duno.

❝

❝ Why do you think that nobody has tried? :-D

In Yaroslavl I specifically asked the participants. Maybe the ones you know were there but didn’t want to come up in front of the experts?

❝ Talked to some involved guys, they are trying and waiting for experts feedback

Great. Let’s keep our fingers crossed.

Also in Yaroslavl people encouraged me to publish my meta-study. Well, I’m still collecting data ( Astea). They also suggested a Russian journal. I don’t like that, since seemingly Russian is more ambiguous than English. Originally I thought of the Journal of Biopharmaceutical Statistics (where Pina D’Angelo’s article about carry-over was published). When I presented about Multi-Group Studies at the 2nd Annual Biosimilars Forum (October 2017) everybody (and this was a statistical audience from agencies, the industry, and CROs) was surprised that it is an issue at all. Nobody (!) would expect anything other than simple pooling. The consensus was that the FDA’s stage-wise procedure might even inflate the Type I Error and should be avoided. Given that, I guess such a manuscript will be rejected right away due to its doubtful content. :-D

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Astea
★★

Russia,
2018-01-10 13:09
(2288 d 05:00 ago)

@ Helmut
Posting: # 18156
Views: 25,138

help us to stop it, please...


Dear Helmut!

The problem will remain until the Eurasian Economic Union requirements are corrected or modified. And instead of forgetting this subject like a nightmare, we see the opposite tendency: the new economic association involves new countries in this problem. Now Belarus was dragged into this story... What's next? It might be stopped just after a scientific paper by the respected author will announce the uselessness of this stuff...

—
"Being in minority, even a minority of one, did not make you mad"

Beholder
★

Russia,
2018-01-10 13:49
(2288 d 04:19 ago)

@ Astea
Posting: # 18158
Views: 25,613

help us to stop it, please...

Dear colleagues,

❝ ... What's next? ...

Next is Kazakhstan, then Kyrgyzstan, then Armenia...

—
Best regards
Beholder

d_labes
★★★

Berlin, Germany,
2018-01-10 16:15
(2288 d 01:54 ago)

@ Astea
Posting: # 18161
Views: 25,191

regulators convinced by science?

Dear Astea!

❝ ... It might be stopped just after a scientific paper by the respected author will announce the uselessness of this stuff...

Do you really believe that regulators (aka «Экспертами») can be convinced by science?
May the Lord preserve your infant faith.

—
Regards,
Detlew

Beholder
★

Russia,
2018-01-10 18:14
(2287 d 23:55 ago)

@ d_labes
Posting: # 18163
Views: 25,049

regulators convinced by science?

Dear d_labes!

❝ Do you really believe that regulators (aka «Экспертами») can be convinced by science?

❝ May the Lord preserve your infant faith.

"But I Tried, Didn't I? Goddamnit, at Least I Did That" - McMurphy ("One Flew Over the Cuckoo's Nest", 1975.)

—
Best regards
Beholder

d_labes
★★★

Berlin, Germany,
2018-01-10 19:53
(2287 d 22:15 ago)

@ Beholder
Posting: # 18164
Views: 25,072

Чёрт побери! (Damn it!)

Dear Beholder!

❝ "But I Tried, Didn't I? Goddamnit, at Least I Did That" - McMurphy ("One Flew Over the Cuckoo's Nest", 1975.)

Swearing is of the чёрт (devil)

—
Regards,
Detlew

Astea
★★

Russia,
2018-01-10 20:10
(2287 d 21:58 ago)

@ d_labes
Posting: # 18165
Views: 25,097

regulators convinced by science?

Dear d_labes!

"You may say that I am a dreamer, but I am not the only one..."

Thanks for the support, Beholder!

—
"Being in minority, even a minority of one, did not make you mad"

d_labes
★★★

Berlin, Germany,
2018-01-10 21:18
(2287 d 20:50 ago)

@ Astea
Posting: # 18166
Views: 25,233

Excuse me

Dear Astea!

First: Call me Detlew. You are also allowed to pronounce it Detluuu. Up to now only my wife is allowed to do so.

❝ "You may say that I am a dreamer, but I am not the only one..."

❝ Thanks for the support, Beholder!

Excuse me, the old grumpy buffer. You are young. And dreaming is the privilege of the youth.

—
Regards,
Detlew

Astea
★★

Russia,
2018-01-10 21:38
(2287 d 20:31 ago)

@ d_labes
Posting: # 18167
Views: 25,146

Excuse me

Dear Detlew!

Thank you! I am not so young as to believe in fairy tales, but I am not so old as not to believe in the mind. I see with my own eyes how things change for the better just because of the presence of scientific people who are not indifferent.

P.S. You can call me Nastia (similar to nasty - easy to remember)

—
"Being in minority, even a minority of one, did not make you mad"
