## trying to understand emmeans [General Sta­tis­tics]

Dear Zizou. dear All,

» the reason of the difference is that sequences were unbalanced (not equal number of subjects in each of the sequences).

looks like that reason is not the dominant reason.

» For unbalanced sequences there might be more questions ...

when the party comes to incomplete cases, everyone has more fun!

I am lazy man too, but some code for your pleasure

library(dplyr) library(nlme) library(replicateBE) library(emmeans) messupdataset<- function(dataset, treatment = 'R'){   dataset$logPK <- ifelse(dataset$treatment==treatment, dataset$logPK*runif(1),dataset$logPK)   return(dataset) } emmeanscomparison <- function(dataset, messuprefdata = F){   dataset$logPK <- log(dataset$PK)   if(messuprefdata==T){     dataset <- messupdataset(dataset, "R")   }      M=lm(logPK ~ sequence + subject + period + treatment, data = dataset)   cat(paste0("\nLSMeans ratio by lm(): \nT/R = ", exp(coef(M)[["treatmentT"]])))     newdat <- expand.grid(treatment = levels(dataset$treatment), sequence = levels(dataset$sequence),                         subject = levels(dataset$subject), period = levels(dataset$period))   preddata <- cbind(newdat, predict(M, newdat))   preddatawoNA <- na.omit(left_join(preddata, dataset, by = c("treatment", "sequence", "subject", "period")))   lsmeansbyhand <-     preddatawoNA%>%     group_by(treatment, sequence, period) %>%     summarize(subjectmean = mean(predict(M, newdat))) %>%     group_by(treatment) %>%     summarize(lsmean = exp(mean(subjectmean))) %>%     mutate(lsmeansratio = lsmean[treatment=='T']/lsmean[treatment=='R']) %>%     as.data.frame()     cat(paste0("\nLSMeans by hand: \nT = ", lsmeansbyhand$lsmean[lsmeansbyhand$treatment=='T']))   cat(paste0("\nLSMeans by hand: \nR = ", lsmeansbyhand$lsmean[lsmeansbyhand$treatment=='R']))   cat(paste0("\nLSMeans ratio by hand: \nT/R = ", lsmeansbyhand$lsmeansratio[1], "\n")) emmeansdf <- data.frame(emmeans(M, 'treatment')) cat(paste0("\nLSMeans by emmeans: \nT = ", exp(emmeansdf$emmean[emmeansdf$treatment=='T']))) cat(paste0("\nLSMeans by emmeans: \nR = ", exp(emmeansdf$emmean[emmeansdf$treatment=='R']))) cat(paste0("\nLSMeans ratio by emmeans: \nT/R = ", exp(emmeansdf[emmeansdf$treatment=='T',]$emmean - emmeansdf[emmeansdf$treatment=='R',]\$emmean), "\n")) } # balanced complete emmeanscomparison(rds11, messuprefdata = F) seed = 123 emmeanscomparison(rds11, messuprefdata = T) # unbalanced complete emmeanscomparison(rds16, messuprefdata = F) seed = 123 emmeanscomparison(rds16, messuprefdata = T) # balanced incompete emmeanscomparison(rds26, messuprefdata = F) seed = 123 emmeanscomparison(rds26, messuprefdata = T) # unbalanced incompete emmeanscomparison(rds14, messuprefdata = F) seed = 123 emmeanscomparison(rds14, messuprefdata = T)

so I can get the same results for complete cases, but not for INcomplete. Please also take a look at the lsmeans for T when R is messed up for unbalanced complete and unbalanced incompltete. Looks like competeness has a predominant value
so my understanding of emmeans is also INcomplete and Unbalanced

Kind regards,
Mittyri