Weidson
☆    

Brazil,
2021-12-30 21:16
(21 d 23:50 ago)

Posting: # 22720
Views: 494
 

 Help for ANOVA from Package replicateBE [R for BE/BA]

Dear friends,


By studying the replicateBE package documentation https://github.com/Helmut01/replicateBE#readme I was able to verify that the method.A and method.B functions (package replicateBE)are derived from the functions lm and lme (packages stats and nlme).

If possible I would like to reproduce the same ANOVA tables that are displayed in the output of the methods.A and method.B using only the lm and lme functions as indicated in the instructions. Unfortunately I am not reaching my goal of obtain the same results as expected:no:. The hipotetical data set that I working is from replicated 2x4 study and the ANOVA results to be different (GL, p-values, etc...). What am I doing wrong? Why can't I generate the same ANOVA Table? The my results to be bellow:

Somebody can help me?

I Would like publish at forum my data set that generates this results however I didn't see option for upload files excel (just figures). Is it possible publish excel files? If yes, How?


R Code 1.0
method.A with IPC_Drug.Cmax Data Set


method.A(path.in = path, path.out = path, file = paste0(fn, ".", "Cmax"),  ext = "csv", na = "NA", sep = sep, dec = dec, ola = TRUE, print = TRUE, verbose = TRUE, ask = TRUE, plot.bxp = TRUE)

ANOVA Outputs for R Code 1.0:

Data set IPC_Drug.Cmax: Method A by lm()
────────────────────────────────────────
Type III Analysis of Variance Table

Response: log(PK)
                           Df    Sum Sq       Mean Sq    F value      Pr(>F)
sequence                   1      1.3253      1.32531    0.27224    0.604233
period                     3      3.5824      1.19413    2.19909    0.090673
treatment                  1      2.7923      2.79228    5.14222    0.024818
sequence:subject           48   233.6721      4.86817    8.96514    < 2e-16
Residuals                 146    79.2796      0.54301                 

treatment T – R:
  Estimate Std. Error    t value   Pr(>|t|)
-0.2380370  0.1049710 -2.2676500  0.0248184
146 Degrees of Freedom


R Code 1.1
Obtain ANOVA by lm() function with DS_Cmax Data Set


library(stats)
Fit_A <- lm(log(Cmax) ~ Sequence + Subject%in%Sequence + Period + Treatment, data=DS_Cmax)
anova(Fit_A)


ANOVA Outputs for R Code 1.1:

Analysis of Variance Table

Response: log(Cmax)
                        Df       Sum Sq           Mean Sq           F value       Pr(>F)
Sequence                 1       1.32531015       1.325310155       0.83233     0.36273
Period                   1       0.31093557       0.310935573       0.19528     0.65905
Treatment                1      3.41095080        3.410950796       2.14218     0.14492
Sequence:Subject         2      6.70138392        3.350691961       2.10433     0.12470
Residuals                194   308.90305612       1.592283794

R Code 2.0
method.B with IPC_Drug.Cmax Data Set


method.B(path.in = path, path.out = path, file = paste0(fn, ".", "Cmax"),     ext = "csv", na = "NA", sep = sep, dec = dec, ola = TRUE,  print = TRUE, verbose = TRUE, ask = TRUE, plot.bxp = TRUE)


ANOVA Outps for utR Code 2.0:

Data set IPC_Drug.Cmax: Method B (option = 2) by lme()
──────────────────────────────────────────────────────
Response: log(PK)
                numDF       denDF       F-value              p-value
(Intercept)      1           146      763.258972853        <.0001
sequence         1            48      0.272239993          0.6042
period           3           146      2.199092956          0.0907
treatment        1           146      5.142222444          0.0248

treatment T – R:
    Value Std.Error   t-value   p-value
-0.238040  0.104970 -2.267600  0.024818
146 Degrees of Freedom (equivalent to SAS’ DDFM=CONTAIN)

R Code 2.1
Obtain ANOVA by lme() function with DS_Cmax Data Set


library(nlme)
Fit_B_opt2 <- lme (log(Cmax)~ Sequence +  Period + Treatment, random = ~1|Subject, data=DS_Cmax)
anova(Fit_B_opt2)


ANOVA Outputs for R Code 2.1:

               numDF       denDF       F-value            p-value
(Intercept)     1           148       763.258972846       <.0001
Sequence        1            48         0.272239993       0.6042
Period          1           148         0.561664145       0.4548
Treatment       1           148         6.161433208       0.0142
Helmut
★★★
avatar
Homepage
Vienna, Austria,
2021-12-30 21:23
(21 d 23:43 ago)

@ Weidson
Posting: # 22721
Views: 449
 

 Help for ANOVA from Package replicateBE

Hi Weidson,

» […] I didn't see option for upload files excel (just figures).

Correct.

» Is it possible publish excel files? If yes, How?

No. Email them to me and I will upload them together with a link for other users.

Dif-tor heh smusma 🖖
Helmut Schütz
[image]

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Weidson
☆    

Brazil,
2021-12-30 21:43
(21 d 23:23 ago)

@ Helmut
Posting: # 22722
Views: 435
 

 Help for ANOVA from Package replicateBE

Hi Helmut

Thank you very much for the quick reply!!! I will send for you now.

Weidson
Helmut
★★★
avatar
Homepage
Vienna, Austria,
2021-12-30 23:45
(21 d 21:21 ago)

@ Weidson
Posting: # 22723
Views: 429
 

 No ANOVA by lme() – lengthy reply

Hi Weidson and others who want to give it a try…

Download and save to a suitable folder. Depending on your operating system, browser, and localization possibly you will be asked to open the file in Excel or another spreadsheet. Don’t. Just save it. Then:

# Check whether the package is installed in the minimum required version.
# If yes, attach it. If not, stop and ask the user for download/installation.

pack <- "replicateBE"                         # Required package
inst <- installed.packages()[pack, "Version"] # Is it installed?
if (length(inst) == 0) {                      # No
  stop ("Please download/install the package \'replicateBE\' from CRAN.")
} else {
  suppressMessages(require(replicateBE))      # Attach it.
  if (inst < "1.1.0")                         # Check version.
    stop ("At least version 1.1.0 of package \'replicateBE\' required.",
          "\n       Please download/install the current version from CRAN.")
}
path.out <- path.in <- "E:/Public/Documents/BEBAC/R/Weidson" # adapt the folder accordingly
method.A(path.in = path.in, path.out = path.out, file = "IPC_Drug.Cmax",
         ext = "csv", sep = "\t", dec = ".", ola = TRUE, verbose = TRUE)
method.B(path.in = path.in, path.out = path.out, file = "IPC_Drug.Cmax",
         ext = "csv", sep = "\t", dec = ".", ola = TRUE, verbose = TRUE)


With method.A(...) you should get in the console:

Data set IPC_Drug.Cmax: Method A by lm()
────────────────────────────────────────
Type III Analysis of Variance Table

Response: log(PK)
                  Df   Sum Sq Mean Sq F value   Pr(>F)
sequence           1   1.3253 1.32531 0.27224 0.604233
period             3   3.5824 1.19413 2.19909 0.090673
treatment          1   2.7923 2.79228 5.14222 0.024818
sequence:subject  48 233.6721 4.86817 8.96514  < 2e-16
Residuals        146  79.2796 0.54301                 

treatment T – R:
  Estimate Std. Error    t value   Pr(>|t|)
-0.2380370  0.1049710 -2.2676500  0.0248184
146 Degrees of Freedom


And with method.B(...):

Data set IPC_Drug.Cmax: Method B (option = 2) by lme()
──────────────────────────────────────────────────────
Response: log(PK)
            numDF denDF       F-value p-value
(Intercept)     1   146 763.258972853  <.0001
sequence        1    48   0.272239993  0.6042
period          3   146   2.199092956  0.0907
treatment       1   146   5.142222444  0.0248

treatment T – R:
    Value Std.Error   t-value   p-value
-0.238040  0.104970 -2.267600  0.024818
146 Degrees of Freedom (equivalent to SAS’ DDFM=CONTAIN)


The point estimates $$\small{100\,\exp(-0.238037)\approx78.81735\%\;vs\;100\,\exp(-0.23804)\approx78.81712\%}$$ and standard errors (\(\small{0.104971\;vs\;0.104970}\)) are very similar because the data set is complete (no missing periods). Results are identical after rounding acc. to the guidelines (see mine for Method A and Method B).

You don’t need library(stats) because it is attached by default.
Try print(sessionInfo(), locale = FALSE). I got after startup:

R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.1.2


Reproducing your results:

file    <- "E:/Public/Documents/BEBAC/R/Weidson/IPC_Drug.Cmax.csv"
DS_Cmax <- read.csv(file = file, sep = "\t", dec = "." , header = TRUE)
Fit_A   <- lm(log(PK) ~ Sequence + Subject %in% Sequence +
                        Period + Treatment, data = DS_Cmax)
str(DS_Cmax)
print(anova(Fit_A), digits = 10, signif.stars = FALSE)

'data.frame':   200 obs. of  5 variables:
 $ Subject  : int  2 2 2 2 3 3 3 3 5 5 ...
 $ Period   : int  1 2 3 4 1 2 3 4 1 2 ...
 $ Treatment: chr  "T" "R" "T" "R" ...
 $ Sequence : chr  "TRTR" "TRTR" "TRTR" "TRTR" ...
 $ PK       : num  557 525 119 201 544 ...

Analysis of Variance Table

Response: log(PK)
                  Df       Sum Sq     Mean Sq F value  Pr(>F)
Sequence           1   1.32531015 1.325310155 0.83233 0.36273
Period             1   0.31093557 0.310935573 0.19528 0.65905
Treatment          1   3.41095080 3.410950796 2.14218 0.14492
Sequence:Subject   2   6.70138392 3.350691961 2.10433 0.12470
Residuals        194 308.90305612 1.592283794

Here lm() is left out in the rain because we have only a simple data frame. The degrees of freedom and Mean Squares are wrong. Therefore, we need to factorize effects:

# columns which need to be factorized
cols          <- c("Subject", "Period", "Sequence", "Treatment")
DS_Cmax[cols] <- lapply(DS_Cmax[cols], factor)
Fit_A.corr    <- lm(log(PK) ~ Sequence + Subject %in% Sequence +
                              Period + Treatment, data = DS_Cmax)
TypeI         <- anova(Fit_A.corr)
str(DS_Cmax)
print(TypeI, digits = 6, signif.stars = FALSE)

'data.frame':   200 obs. of  5 variables:
 $ Subject  : Factor w/ 50 levels "2","3","5","8",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ Period   : Factor w/ 4 levels "1","2","3","4": 1 2 3 4 1 2 3 4 1 2 ...
 $ Treatment: Factor w/ 2 levels "R","T": 2 1 2 1 2 1 2 1 2 1 ...
 $ Sequence : Factor w/ 2 levels "RTRT","TRTR": 2 2 2 2 2 2 2 2 2 2 ...
 $ PK       : num  557 525 119 201 544 ...

Analysis of Variance Table

Response: log(PK)
                  Df   Sum Sq Mean Sq F value   Pr(>F)
Sequence           1   1.3253 1.32531 2.44067 0.120391
Period             3   3.5824 1.19413 2.19909 0.090673
Treatment          1   2.7923 2.79228 5.14222 0.024818
Sequence:Subject  48 233.6721 4.86817 8.96514  < 2e-16
Residuals        146  79.2796 0.54301

That’s the default, where Sequence is tested against Residuals. Now change to the correct test:

TypeIII       <- TypeI # use what we have
attr(TypeIII, "heading")[1] <- "Type III Analysis of Variance Table\n"
MSdenom       <- TypeIII["Sequence:Subject", "Mean Sq"]
df2           <- TypeIII["Sequence:Subject", "Df"]
fvalue        <- TypeIII["Sequence", "Mean Sq"] / MSdenom
df1           <- TypeIII["Sequence", "Df"]
TypeIII["Sequence", 4] <- fvalue
TypeIII["Sequence", 5] <- pf(fvalue, df1, df2, lower.tail = FALSE)
print(TypeIII, digits = 6, signif.stars = FALSE)

Analysis of Variance Table

Response: log(PK)
                  Df   Sum Sq Mean Sq F value   Pr(>F)
Sequence           1   1.3253 1.32531 2.44067 0.120391
Period             3   3.5824 1.19413 2.19909 0.090673
Treatment          1   2.7923 2.79228 5.14222 0.024818
Sequence:Subject  48 233.6721 4.86817 8.96514  < 2e-16
Residuals        146  79.2796 0.54301

Finally we get what replicateBE method.A() provides.

Two more comments:
  1. As shown above, Fit_A <- lm() followed by anova(Fit_A) will give by default (in SAS lingo) Type I estimates. That’s not what we want (see there and how to get a Type III ANOVA in [image] and Phoenix Win­Non­lin). We changed that in v1.1.0 of replicateBE last June (see here and there).
  2. In a mixed-effects model there are no least squares estimates cause the optimization is done by (restricted) maximum likelihood. IMHO, anova((lme(foo ~ bar)) is bit unfortunate cause it is not an ANOVA (see there).
    • When only one fitted model object is present, a data frame with the numerator degrees of freedom, denominator degrees of freedom, F-values, and P-values for Wald tests for the terms in the model (when Terms and L are NULL) …
    Furthermore, you get only the fixed effects in the data frame, i.e., not subjects, which are a random effect. Note also that in lme() you get automatically the correct test for sequence.
[image]BTW, the nested model subject(sequence) doesn’t make sense for uniquely coded subjects (see there). That’s more evident if we use aov() instead of anova(), showing in an additional line if effects are not estimable.

nested <- lm(log(PK) ~ Sequence + Subject %in% Sequence +
                       Period + Treatment, data = DS_Cmax)
print(aov(nested))

                 Sequence    Period Treatment Sequence:Subject Residuals
Sum of Squares    1.32531   3.58239   2.79228        233.67209  79.27956
Deg. of Freedom         1         3         1               48       146

Residual standard error: 0.7368926

50 out of 104 effects not estimable
Estimated effects may be unbalanced

# not nested
simple <- lm(log(PK) ~ Sequence + Subject + Period + Treatment, data = DS_Cmax)
print(aov(simple))

                 Sequence   Subject    Period Treatment Residuals
Sum of Squares    1.32531 233.67209   3.58239   2.79228  79.27956
Deg. of Freedom         1        48         3         1       146

Residual standard error: 0.7368926

1 out of 55 effects not estimable
Estimated effects may be unbalanced

# without sequence
very.simple <- lm(log(PK) ~ Subject + Period + Treatment, data = DS_Cmax)
print(aov(very.simple))

                  Subject    Period Treatment Residuals
Sum of Squares  234.99740   3.58239   2.79228  79.27956
Deg. of Freedom        49         3         1       146

Residual standard error: 0.7368926
Estimated effects may be unbalanced

In all models (even the one without Sequence) the MSE of Period, and Treatment and the Residual Error are identical. Hence, you will get the same PE and its CI…

Dif-tor heh smusma 🖖
Helmut Schütz
[image]

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
mittyri
★★  

Russia,
2022-01-01 11:50
(20 d 09:16 ago)

@ Helmut
Posting: # 22724
Views: 333
 

 SSIII ANOVA in Winnonlin

Hi Helmut,

» see there and how to get a Type III ANOVA in [image] and Phoenix Win­Non­lin

just a little reminder: as it is shown here, if Subject(Sequence) term is placed in Variance Structure tab (not in Fixed effects tab) as it is done by default, Partial SS will give SSIII table, additional Test Nom/Denom is not required

Kind regards,
Mittyri
Helmut
★★★
avatar
Homepage
Vienna, Austria,
2022-01-01 21:52
(19 d 23:14 ago)

@ mittyri
Posting: # 22725
Views: 315
 

 SSIII ANOVA in Winnonlin

Hi mittyri,

» just a little reminder: as it is shown here, if Subject(Sequence) term is placed in Variance Structure tab (not in Fixed effects tab) as it is done by default, Partial SS will give SSIII table, additional Test Nom/Denom is not required

Correct, of course. :-D Needed only for the victims of ‘all effects fixed’.

As I wrote above …

» » Note also that in lme() you get automatically the correct test for sequence.

… same in the [image]-package replicateBE, function method.B().

Dif-tor heh smusma 🖖
Helmut Schütz
[image]

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Weidson
☆    

Brazil,
2022-01-02 14:51
(19 d 06:15 ago)

(edited by Weidson on 2022-01-02 15:12)
@ Helmut
Posting: # 22726
Views: 293
 

 No ANOVA by lme() – lengthy reply

Helmut,

Thank you very much for answer!!! It was very complete!!!:clap:

please, look the ANOVA table results that was generate by proposed code (no changes were made with the sequence effect):

» attr(TypeIII, "heading")[1] <- "Type III Analysis of Variance Table\n"
» MSdenom <- TypeIII["sequence:subject", "Mean Sq"]
» df2 <- TypeIII["sequence:subject", "Df"]
» fvalue <- TypeIII["sequence", "Mean Sq"] / MSdenom
» df1 <- TypeIII["sequence", "Df"]
» TypeIII["sequence", 4] <- fvalue
» TypeIII["sequence", 5] <- pf(fvalue, df1, df2, lower.tail = FALSE)
» print(TypeIII, digits = 6, signif.stars = FALSE)
»
» Analysis of Variance Table
»
» Response: log(PK)
»                   Df   Sum Sq Mean Sq F value   Pr(>F)
» Sequence           1   1.3253 1.32531 2.44067 0.120391
» Period             3   3.5824 1.19413 2.19909 0.090673
» Treatment          1   2.7923 2.79228 5.14222 0.024818
» Sequence:Subject  48 233.6721 4.86817 8.96514  < 2e-16
» Residuals        146  79.2796 0.54301



So I made a small modification in your code:

Fit_1 <- lm(log(Cmax) ~ Sequence + Subject%in%Sequence + Period + Treatment , data=Cmax)
Type_I<-anova(Fit_1)
Type_III       <- Type_I # use what we have
attr(Type_III, "heading")[1] <- "Type III Analysis of Variance Table\n"
MSdenom       <- Type_III[4, "Mean Sq"]
df2           <- Type_III[4, "Df"]
fvalue        <- Type_III[1, "Mean Sq"] / MSdenom
df1           <- Type_III[1, "Df"]
Type_III["Sequence", 4] <- fvalue
Type_III["Sequence", 5] <- pf(fvalue, df1, df2, lower.tail = FALSE)
print(Type_III, digits = 6, signif.stars = TRUE)



And here, the new results

#########################################################################
Type III Analysis of Variance Table

Response: log(Cmax)
                  Df   Sum Sq Mean Sq F value   Pr(>F)   
Sequence           1   1.3253 1.32531 0.27224 0.604233   
Period             3   3.5824 1.19413 2.19909 0.090673 . 
Treatment          1   2.7923 2.79228 5.14222 0.024818 * 
Sequence:Subject  48 233.6721 4.86817 8.96514  < 2e-16 ***
Residuals        146  79.2796 0.54301                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


One last doubt: How can we include at function lme() a term for carryover first-order effect (keeping the sequence effect in the model)? We have many DF for estimate this effect, so I presume that it can be estimated to replicate designs as this our example!!! How and how many columns should we have in the dataset to estimate the first-order carryover effect?

This test can be useful when we are working with auto-inducers drugs (I read your article and I like very much).


At Phoenix WinNonlin this can be make creating two collumns named "Carry" and "Over". The collumn "Carry" is composed by 0 (if data of period 1) or 1 (if data period 2,3,4 ...). You should define "Carry" as covariate and "Over" as Classification. The collumn "Over" indicates which treatment was administered in the previous period. The fixed model: Sequence+Treatment+Period+Carry*over. The specification for random effects that I use is the same oriented by FDA for replicate design (Repeted specification=Period, Variance Bloking Variable=Subject, Group=Treatment (Type=Variance components), Random effects model=Treatment, Variance blocking Variables=Subject (Type=Banded No-Diagonal Factor Analytic f=2):ok:

How would be this parametrization at function lme at R software?:confused:
Helmut
★★★
avatar
Homepage
Vienna, Austria,
2022-01-02 15:49
(19 d 05:17 ago)

@ Weidson
Posting: # 22727
Views: 289
 

 No CO, please

Hi Weidson,

» Thank you very much for answer!!!

Welcome.

» It was very complete!!!:clap:

No, it wasn’t. Just realized that I worked with the source-code of method.A(), where the columns are in lower case (and not in upper case like in your data set). Corrected in my post above. Sorry.

» And here, the new results
»
» #########################################################################
» Type III Analysis of Variance Table
»
» Response: log(Cmax)
»                   Df   Sum Sq Mean Sq F value   Pr(>F)   
» Sequence           1   1.3253 1.32531 0.27224 0.604233   
» Period             3   3.5824 1.19413 2.19909 0.090673 . 
» Treatment          1   2.7923 2.79228 5.14222 0.024818 * 
» Sequence:Subject  48 233.6721 4.86817 8.96514  < 2e-16 ***
» Residuals        146  79.2796 0.54301                     
» ---
» Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Perfect.

» One last doubt: How can we include at function lme() a term for carryover first-order effect (keeping the sequence effect in the model)? We have many DF for estimate this effect, so I presume that it can be estimated to replicate designs as this our example!!! How and how many columns should we have in the dataset to estimate the first-order carryover effect?

Please forget carryover.

» This test can be useful when we are working with auto-inducers drugs (I read your article and I like very much).

Nope. See there again and think about the carbamazepine example.* I had a straightforward induction model behind, i.e., \(\small{\textrm{I}(t)=1\,(-\exp(-\alpha\cdot t)\,)}\). With \(\small{\alpha=1}\) we have practically complete induction (99.48%) on day 5 (Clearance going from 1.35 down to 5.84). Clearly that’s nonlinear and even if we would manage to introduce 1st, 2nd, nth-order carryover in the model, it has no relationship to the real world.

» At Phoenix WinNonlin this can be make creating two collumns named "Carry" and "Over". […]

Never did that and never will. :-)

» How would be this parametrization at function lme at R software? :confused:

No idea.


  • Tóthfálusi L, Endrényi. Approvable generic carbamazepine formulations may not be bioequivalent in target patient populations. Int J Clin Pharmacol Ther. 2013; 51(6): 525–8. doi:10.5414/CP201845.

Dif-tor heh smusma 🖖
Helmut Schütz
[image]

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Activity
 Admin contact
21,834 posts in 4,569 threads, 1,554 registered users;
online 16 (1 registered, 15 guests [including 7 identified bots]).
Forum time: Friday 21:06 CET (Europe/Vienna)

No problem can stand the assault of sustained thinking.    Voltaire

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz
HTML5