Bioequivalence and Bioavailability Forum • No ANOVA by lme()

No ANOVA by lme() – lengthy reply [🇷 for BE/BA]

posted by Helmut – Vienna, Austria, 2021-12-31 00:45 (1212 d 09:43 ago) – Posting: # 22723
Views: 4,691

Hi Weidson and others who want to give it a try…

Download and save to a suitable folder. Depending on your operating system, browser, and localization possibly you will be asked to open the file in Excel or another spreadsheet. Don’t. Just save it. Then:

# Check whether the package is installed in the minimum required version. # If yes, attach it. If not, stop and ask the user for download/installation. pack <- "replicateBE" # Required package inst <- installed.packages()[pack, "Version"] # Is it installed? if (length(inst) == 0) { # No stop ("Please download/install the package \'replicateBE\' from CRAN.") } else { suppressMessages(require(replicateBE)) # Attach it. if (inst < "1.1.0") # Check version. stop ("At least version 1.1.0 of package \'replicateBE\' required.", "\n Please download/install the current version from CRAN.") } path.out <- path.in <- "E:/Public/Documents/BEBAC/R/Weidson" # adapt the folder accordingly method.A(path.in = path.in, path.out = path.out, file = "IPC_Drug.Cmax", ext = "csv", sep = "\t", dec = ".", ola = TRUE, verbose = TRUE) method.B(path.in = path.in, path.out = path.out, file = "IPC_Drug.Cmax", ext = "csv", sep = "\t", dec = ".", ola = TRUE, verbose = TRUE)

With method.A(...) you should get in the console:

Data set IPC_Drug.Cmax: Method A by lm() ──────────────────────────────────────── Type III Analysis of Variance Table Response: log(PK) Df Sum Sq Mean Sq F value Pr(>F) sequence 1 1.3253 1.32531 0.27224 0.604233 period 3 3.5824 1.19413 2.19909 0.090673 treatment 1 2.7923 2.79228 5.14222 0.024818 sequence:subject 48 233.6721 4.86817 8.96514 < 2e-16 Residuals 146 79.2796 0.54301 treatment T – R: Estimate Std. Error t value Pr(>|t|) -0.2380370 0.1049710 -2.2676500 0.0248184 146 Degrees of Freedom

And with method.B(...):

Data set IPC_Drug.Cmax: Method B (option = 2) by lme() ────────────────────────────────────────────────────── Response: log(PK) numDF denDF F-value p-value (Intercept) 1 146 763.258972853 <.0001 sequence 1 48 0.272239993 0.6042 period 3 146 2.199092956 0.0907 treatment 1 146 5.142222444 0.0248 treatment T – R: Value Std.Error t-value p-value -0.238040 0.104970 -2.267600 0.024818 146 Degrees of Freedom (equivalent to SAS’ DDFM=CONTAIN)

The point estimates $$\small{100\,\exp(-0.238037)\approx78.81735\%\;vs\;100\,\exp(-0.23804)\approx78.81712\%}$$ and standard errors ($\small{0.104971\;vs\;0.104970}$) are very similar because the data set is complete (no missing periods). Results are identical after rounding acc. to the guidelines (see mine for Method A and Method B).

You don’t need library(stats) because it is attached by default.
Try print(sessionInfo(), locale = FALSE). I got after startup:

R version 4.1.2 (2021-11-01) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 7 x64 (build 7601) Service Pack 1 Matrix products: default attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] compiler_4.1.2

Reproducing your results:

file <- "E:/Public/Documents/BEBAC/R/Weidson/IPC_Drug.Cmax.csv" DS_Cmax <- read.csv(file = file, sep = "\t", dec = "." , header = TRUE) Fit_A <- lm(log(PK) ~ Sequence + Subject %in% Sequence + Period + Treatment, data = DS_Cmax) str(DS_Cmax) print(anova(Fit_A), digits = 10, signif.stars = FALSE) 'data.frame': 200 obs. of 5 variables: $ Subject : int 2 2 2 2 3 3 3 3 5 5 ... $ Period : int 1 2 3 4 1 2 3 4 1 2 ... $ Treatment: chr "T" "R" "T" "R" ... $ Sequence : chr "TRTR" "TRTR" "TRTR" "TRTR" ... $ PK : num 557 525 119 201 544 ... Analysis of Variance Table Response: log(PK) Df Sum Sq Mean Sq F value Pr(>F) Sequence 1 1.32531015 1.325310155 0.83233 0.36273 Period 1 0.31093557 0.310935573 0.19528 0.65905 Treatment 1 3.41095080 3.410950796 2.14218 0.14492 Sequence:Subject 2 6.70138392 3.350691961 2.10433 0.12470 Residuals 194 308.90305612 1.592283794

Here lm() is left out in the rain because we have only a simple data frame. The degrees of freedom and Mean Squares are wrong. Therefore, we need to factorize effects:

# columns which need to be factorized cols <- c("Subject", "Period", "Sequence", "Treatment") DS_Cmax[cols] <- lapply(DS_Cmax[cols], factor) Fit_A.corr <- lm(log(PK) ~ Sequence + Subject %in% Sequence + Period + Treatment, data = DS_Cmax) TypeI <- anova(Fit_A.corr) str(DS_Cmax) print(TypeI, digits = 6, signif.stars = FALSE) 'data.frame': 200 obs. of 5 variables: $ Subject : Factor w/ 50 levels "2","3","5","8",..: 1 1 1 1 2 2 2 2 3 3 ... $ Period : Factor w/ 4 levels "1","2","3","4": 1 2 3 4 1 2 3 4 1 2 ... $ Treatment: Factor w/ 2 levels "R","T": 2 1 2 1 2 1 2 1 2 1 ... $ Sequence : Factor w/ 2 levels "RTRT","TRTR": 2 2 2 2 2 2 2 2 2 2 ... $ PK : num 557 525 119 201 544 ... Analysis of Variance Table Response: log(PK) Df Sum Sq Mean Sq F value Pr(>F) Sequence 1 1.3253 1.32531 2.44067 0.120391 Period 3 3.5824 1.19413 2.19909 0.090673 Treatment 1 2.7923 2.79228 5.14222 0.024818 Sequence:Subject 48 233.6721 4.86817 8.96514 < 2e-16 Residuals 146 79.2796 0.54301

That’s the default, where Sequence is tested against Residuals. Now change to the correct test:

TypeIII <- TypeI # use what we have attr(TypeIII, "heading")[1] <- "Type III Analysis of Variance Table\n" MSdenom <- TypeIII["Sequence:Subject", "Mean Sq"] df2 <- TypeIII["Sequence:Subject", "Df"] fvalue <- TypeIII["Sequence", "Mean Sq"] / MSdenom df1 <- TypeIII["Sequence", "Df"] TypeIII["Sequence", 4] <- fvalue TypeIII["Sequence", 5] <- pf(fvalue, df1, df2, lower.tail = FALSE) print(TypeIII, digits = 6, signif.stars = FALSE) Analysis of Variance Table Response: log(PK) Df Sum Sq Mean Sq F value Pr(>F) Sequence 1 1.3253 1.32531 2.44067 0.120391 Period 3 3.5824 1.19413 2.19909 0.090673 Treatment 1 2.7923 2.79228 5.14222 0.024818 Sequence:Subject 48 233.6721 4.86817 8.96514 < 2e-16 Residuals 146 79.2796 0.54301

Finally we get what replicateBE method.A() provides.

Two more comments:

As shown above, Fit_A <- lm() followed by anova(Fit_A) will give by default (in SAS parlance) Type I estimates. That’s not what we want (see there and how to get a Type III ANOVA in and Phoenix WinNonlin). We changed that in v1.1.0 of replicateBE last June (see here and there).
In a mixed-effects model there are no least squares estimates cause the optimization is done by (restricted) maximum likelihood. IMHO, anova((lme(foo ~ bar)) is bit unfortunate cause it is not an ANOVA (see there).
- When only one fitted model object is present, a data frame with the numerator degrees of freedom, denominator degrees of freedom, F-values, and P-values for Wald tests for the terms in the model (when Terms and L are NULL) …
Furthermore, you get only the fixed effects in the data frame, i.e., not subjects, which are a random effect. Note also that in lme() you get automatically the correct test for sequence.

BTW, the nested model subject(sequence) doesn’t make sense for uniquely coded subjects (see there). If Subject 1 is randomized to sequence TR, there is not ‘another’ Subject 1 randomized to sequence RT. Randomization is not like Schrödinger’s cat. Hence, the nested term in the guidelines is an insult to the mind. That’s more evident if we use aov() instead of anova(), showing in an additional line if effects are not estimable.

nested <- lm(log(PK) ~ Sequence + Subject %in% Sequence + Period + Treatment, data = DS_Cmax) print(aov(nested)) Sequence Period Treatment Sequence:Subject Residuals Sum of Squares 1.32531 3.58239 2.79228 233.67209 79.27956 Deg. of Freedom 1 3 1 48 146 Residual standard error: 0.7368926 50 out of 104 effects not estimable Estimated effects may be unbalanced # not nested simple <- lm(log(PK) ~ Sequence + Subject + Period + Treatment, data = DS_Cmax) print(aov(simple)) Sequence Subject Period Treatment Residuals Sum of Squares 1.32531 233.67209 3.58239 2.79228 79.27956 Deg. of Freedom 1 48 3 1 146 Residual standard error: 0.7368926 1 out of 55 effects not estimable Estimated effects may be unbalanced # without sequence very.simple <- lm(log(PK) ~ Subject + Period + Treatment, data = DS_Cmax) print(aov(very.simple)) Subject Period Treatment Residuals Sum of Squares 234.99740 3.58239 2.79228 79.27956 Deg. of Freedom 49 3 1 146 Residual standard error: 0.7368926 Estimated effects may be unbalanced

In all models (even the one without Sequence) the MSE of Period, and Treatment and the Residual Error are identical. Hence, you will get the same PE and its CI…

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Complete thread:

Help for ANOVA from Package replicateBE Weidson 2021-12-30 21:16 [🇷 for BE/BA]
- Help for ANOVA from Package replicateBE Helmut 2021-12-30 21:23
  - Help for ANOVA from Package replicateBE Weidson 2021-12-30 21:43
    - No ANOVA by lme() – lengthy replyHelmut 2021-12-30 23:45
      - SSIII ANOVA in Winnonlin mittyri 2022-01-01 11:50
        
        SSIII ANOVA in Winnonlin Helmut 2022-01-01 21:52
      - No ANOVA by lme() – lengthy reply Weidson 2022-01-02 14:51
        
        No CO, please Helmut 2022-01-02 15:49