Jaimik Patel
☆

India,
2019-10-31 11:19

Posting: # 20737
Views: 458

## Least square mean calculation for the fully replicate design [General Statistics]

Dear All,

One question for the least square mean calculation for the fully replicate design as per USFDA in SAS.

We performed the statistical analysis using the code provided in the progesterone guidance. The least squares mean values of AUCi for Test and Reference were 6.112006 and 6.111151, respectively.

Unfortunately, after the statistical analysis, the sponsor identified one error in the bioanalysis of a sample of the test formulation. Only one value, at a particular time point of one period in one subject, changed. Hence, the statistical analysis was performed again, and the least squares mean values of AUCi for Test and Reference were now 6.111382 and 6.110883. The change in the value of test product results in the change in LSM value in reference product!!

The question is: the reference product data remained the same, with not a single value changed, so why did the least squares mean of the reference product change for AUCi? Why does this affect the reference data?

I have also performed the same experiment in a two-way study design, but there changing any reference data has no impact on the least squares mean of the test data.

Thanks !!
Jaimik

Helmut
★★★

Vienna, Austria,
2019-10-31 14:36

@ Jaimik Patel
Posting: # 20738
Views: 421

## Very, very strange!

Hi Jaimik,

a very interesting observation!

I could confirm that in Phoenix WinNonlin 8.1 (multiplied the first T value of the EMA’s reference data sets by 10). The geometric least squares mean of R changed in data set I (full replicate) but not in data set II (partial replicate).
I also checked the EMA’s methods (simple ANOVA). Similar.

Respective first rows original data, second ones T changed.

RSABE model:

| data set | data      | R       | T       |
|----------|-----------|---------|---------|
| I        | original  | 2144.00 | 2479.70 |
| I        | T changed | 2143.04 | 2517.74 |
| II       | original  | 2852.54 | 2917.13 |
| II       | T changed | 2852.54 | 3074.12 |

ABEL:

| data set | data      | Method A: R | Method A: T | Method B: R | Method B: T |
|----------|-----------|-------------|-------------|-------------|-------------|
| I        | original  | 2140.84     | 2476.07     | 2143.11     | 2480.22     |
| I        | T changed | 2140.18     | 2513.57     | 2142.71     | 2518.22     |
| II       | original  | 2852.54     | 2917.13     | 2852.54     | 2917.13     |
| II       | T changed | 2852.54     | 3210.87     | 2852.54     | 3210.87     |

Beyond me.

PS: Differences between RSABE and ABEL are to be expected in case of incomplete data. For the FDA incomplete data are dropped but kept for the EMA.
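To make the distinction concrete, here is a minimal sketch in base R contrasting a complete-case data set with an all-available-data set. The data frame `pk` and its values are invented for illustration, and the exact exclusion rule each agency's method applies is not spelled out in this thread, so treat the filter below as an assumption rather than the regulatory algorithm.

```r
# Hypothetical long-format data: subject 3 has no value in period 4
pk <- data.frame(
  subject = rep(1:3, each = 4),
  period  = rep(1:4, times = 3),
  logPK   = c(6.1, 6.0, 6.2, 6.1,
              5.9, 6.0, 5.8, 6.1,
              6.3, 6.2, 6.4, NA)
)

# Number of non-missing periods per subject
n_obs <- tapply(!is.na(pk$logPK), pk$subject, sum)

# Complete-case set: keep only subjects observed in every period
complete_ids <- as.integer(names(n_obs)[n_obs == length(unique(pk$period))])
pk_complete  <- pk[pk$subject %in% complete_ids, ]

# All-available set: drop only the missing records themselves
pk_available <- pk[!is.na(pk$logPK), ]

cat("complete-case rows:", nrow(pk_complete),
    "\nall-available rows:", nrow(pk_available), "\n")
```

The two sets differ by the three remaining records of subject 3, which is enough to move the estimates apart.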

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. ☼
Science Quotes
Shuanghe
★★

Spain,
2019-10-31 19:43

@ Helmut
Posting: # 20739
Views: 390

## Very, very strange!

Hi Helmut,

» I could confirm that in Phoenix WinNonlin 8.1 (multiplied the first T value of the EMA’s reference data sets by 10)...

» I 2144.00 2479.70
» 2143.04 2517.74

Very strange indeed. I just got the same result in SAS (2144 --> 2143.04). I only had time to test data set I. I guess it's probably the RTFM time for us. Maybe Detlew has some insight. I'll check the SAS manual after his comment

All the best,
Shuanghe
Helmut
★★★

Vienna, Austria,
2019-10-31 19:50

@ Shuanghe
Posting: # 20742
Views: 387

## Very, very strange!

Hi Shuanghe,

» I guess it's probably the RTFM time for us. Maybe Detlew has some insight. I'll check the SAS manual after his comment

Don’t expect anything soon. He is on vacation till mid-November.

Cheers,
Helmut Schütz

Jaimik Patel
☆

India,
2019-11-02 12:26

@ Helmut
Posting: # 20746
Views: 262

## Very, very strange!

Dear All,

Is there any role of G-matrix in the calculation of LSM?

We have observed that the values of the G matrix differ between the two cases.

Regards,
Jaimik

mittyri
★★

Russia,
2019-11-03 01:38

@ Jaimik Patel
Posting: # 20748
Views: 214

## G matrix

Dear Patel,

» Is there any role of G-matrix in the calculation of LSM?

Even with one sample modified, the model coefficients change. The G matrix is the variance-covariance matrix of the random (subject-specific) effects.
When the model fit changes, the covariance matrix changes as well; thus, you see the difference. Since PROC MIXED uses iterative algorithms to achieve convergence, a change in a few points can be crucial even if it looks insignificant to the reviewer.
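In standard mixed-model notation (the symbols below are not from the original posts and are introduced here only for illustration), the link between the G matrix and the least squares means can be written as:

$$
y = X\beta + Zu + e,\qquad u \sim N(0, G),\quad e \sim N(0, R),\qquad V = ZGZ^{\top} + R
$$

$$
\hat\beta = \left(X^{\top}\hat V^{-1}X\right)^{-1}X^{\top}\hat V^{-1}y,\qquad \widehat{\mathrm{LSM}} = L\hat\beta
$$

Here $R$ is the residual covariance matrix, not the Reference product. Since $\hat\beta$ is a weighted combination of all observations with weights built from $\hat V$, a single changed observation can move every element of $\hat\beta$, both directly through $y$ and indirectly through the re-estimated $\hat G$.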

Kind regards,
Mittyri
ElMaestro
★★★

Belgium?,
2019-10-31 19:48

@ Jaimik Patel
Posting: # 20740
Views: 390

## Inner workings of REML

Hi Jaimik,

» The change in the value of test product results in the change in LSM value in reference product!!

I think, without being absolutely certain....:

When a mixed model is fitted with ML, you have a straightforward way of estimating the fixed effects. You need only to iteratively estimate the variance components, and the fixed effects are simply estimated in the same fashion as in a linear (non-mixed) model - deterministically.

When a model is fitted using REML, which is what happens when you do studies for FDA, then the likelihood depends both on the variance components and on the model effects, but what's worse, the model effects depend on the variance components, so you are not only iterating across the sigmas to find the likelihood optimum, but also need to optimise within the vector of fixed effects. So you can start out with estimates of both the variance components and the fixed effects. Then you first optimise the variance components "a little". Then you optimise the fixed effects "a little". And then you repeat the cycle. That is a safe way of arriving at the optimised REML solution, but I will not in any way say it is the only or the best - I simply do not know enough about matrix likelihood to state anything in this regard.

I imagine that you might not see this phenomenon if you pick ML in your mixmo (which is not what FDA wants) instead of REML. Can you try and test it?

I could be wrong, but...
Best regards,
ElMaestro
Helmut
★★★

Vienna, Austria,
2019-10-31 19:53

@ ElMaestro
Posting: # 20743
Views: 380

## Even in ANOVA…

Ahoy ElMaestro,

» When a mixed model is fitted with ML, [etc. etc.]

That was my first idea as well. But we see a similar effect with the EMA’s Method A above, which is a bloody ANOVA and all effects fixed. I don’t get it.

Cheers,
Helmut Schütz

zizou
★

Plzeň, Czech Republic,
2019-11-01 21:05

@ Helmut
Posting: # 20744
Views: 295

## Even in 2x2...

» ... But we see a similar effect with the EMA’s Method A above, which is a bloody ANOVA and all effects fixed. I don’t get it.

Dear all,

the reason of the difference is that sequences were unbalanced (not equal number of subjects in each of the sequences).
As you know, for ln-transformed data the estimated marginal means (i.e., least squares means in SAS terminology) differ from the arithmetic means of T and R if the sequences are unbalanced.
The calculation of the estimated marginal means involves a modification of the "standard" marginal means. I'm too lazy to go through the matrix algebra (which most software uses), so simply put: the marginal means are "corrected" to estimated marginal means by the difference between the mean of all data and the mean of the marginal means of T and R (when the sequences are balanced, this difference is zero).
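A small simulation may help here. The sketch below is base R only and rests on assumptions that are not from the post: simulated data, the model `logPK ~ subject + period + treatment`, and least squares means taken as the average prediction over a subject-by-period grid with equal weight for every subject. Other weighting conventions (for example, equal weight per sequence) can behave differently. The function names are made up for illustration.

```r
# Simulate a complete 2x2 crossover: n1 subjects in sequence TR, n2 in RT
make_2x2 <- function(n1, n2, seed = 1) {
  set.seed(seed)
  n <- n1 + n2
  d <- data.frame(
    subject  = factor(rep(seq_len(n), each = 2)),
    period   = factor(rep(1:2, times = n)),
    sequence = rep(rep(c("TR", "RT"), times = c(n1, n2)), each = 2)
  )
  # T is given in period 1 of TR and in period 2 of RT
  d$treatment <- factor(
    ifelse((d$sequence == "TR") == (d$period == "1"), "T", "R"),
    levels = c("R", "T"))
  d$logPK <- rnorm(2 * n, mean = 6, sd = 0.3)
  d
}

# Least squares means as average predictions over the full reference grid
lsm <- function(d) {
  fit  <- lm(logPK ~ subject + period + treatment, data = d)
  grid <- expand.grid(subject   = levels(d$subject),
                      period    = levels(d$period),
                      treatment = levels(d$treatment))
  grid$pred <- predict(fit, newdata = grid)
  tapply(grid$pred, grid$treatment, mean)
}

# Add delta to the first T observation and report the shift in the R mean
shift_in_R <- function(d, delta = 1) {
  before <- lsm(d)
  i <- which(d$treatment == "T")[1]
  d$logPK[i] <- d$logPK[i] + delta
  after <- lsm(d)
  unname(after["R"] - before["R"])
}

balanced   <- shift_in_R(make_2x2(6, 6))
unbalanced <- shift_in_R(make_2x2(8, 4))
print(c(balanced = balanced, unbalanced = unbalanced))
```

Under these assumptions the R mean stays put (up to floating-point error) in the balanced case and moves in the unbalanced one, although no R value was touched.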

Personally, I would not report these least squares means at all. If you calculate an ANOVA (e.g., by GLM), you get the point estimate directly, i.e., without calculating least squares means.

For unbalanced sequences there might be more questions ... together with misleading terminology, e.g. "geometric means ratio": for unbalanced sequences, the ratio of geometric means reported in descriptive statistics differs from the geometric means ratio reported with the 90% confidence interval, which is also very, very strange. As these ratios are different, someone wants to have the "correct" geometric means of T and R for which the ratio T/R is equal to the point estimate (i.e., someone wants "geometric least squares means"). Nevertheless, geometric means are geometric means! Behind geometric least squares means I see only values of "T" and "R" for which (by dividing) we get the point estimate. But as was pointed out, the least squares mean of T can be affected by R, and vice versa.
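The two ratios can be put side by side in a few lines of base R. This is only a sketch on simulated data: the sample sizes, the period effect of 0.2, and all variable names are assumptions chosen for illustration.

```r
set.seed(7)
n1 <- 8; n2 <- 4                          # unbalanced: 8 subjects in TR, 4 in RT
n  <- n1 + n2
sequence    <- rep(c("TR", "RT"), times = c(n1, n2))
period_of_T <- ifelse(sequence == "TR", 1, 2)   # period in which T was given

# One log-value per subject and treatment; period 2 is lower by 0.2
logT <- rnorm(n, 6.0, 0.3) - 0.2 * (period_of_T == 2)
logR <- rnorm(n, 6.1, 0.3) - 0.2 * (period_of_T == 1)

# Ratio of geometric means as reported in descriptive statistics
gmr_descriptive <- exp(mean(logT) - mean(logR))

# Point estimate: unweighted mean of the per-sequence mean differences
d  <- logT - logR
pe <- exp(mean(c(mean(d[sequence == "TR"]), mean(d[sequence == "RT"]))))

# Cross-check the point estimate against the treatment coefficient of lm()
long <- data.frame(
  subject   = factor(rep(seq_len(n), times = 2)),
  period    = factor(c(period_of_T, 3 - period_of_T)),
  treatment = factor(rep(c("T", "R"), each = n), levels = c("R", "T")),
  logPK     = c(logT, logR)
)
pe_lm <- exp(coef(lm(logPK ~ subject + period + treatment,
                     data = long))[["treatmentT"]])

print(c(descriptive = gmr_descriptive, point_estimate = pe, lm = pe_lm))
```

The descriptive ratio weights the sequences by their size, whereas the point estimate weights them equally, so the two agree only when the sequences are balanced or when the per-sequence differences happen to coincide.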

Best regards,
zizou
mittyri
★★

Russia,
2019-11-02 02:26

@ zizou
Posting: # 20745
Views: 281

## trying to understand emmeans

Dear Zizou, dear All,

» the reason of the difference is that sequences were unbalanced (not equal number of subjects in each of the sequences).

looks like that reason is not the dominant reason.

» For unbalanced sequences there might be more questions ...

when the party comes to incomplete cases, everyone has more fun!

I am a lazy man too, but here is some code for your pleasure:

```r
library(dplyr)
library(nlme)
library(replicateBE)
library(emmeans)

# Distort all log-values of one treatment by a single random factor
messupdataset <- function(dataset, treatment = 'R') {
  dataset$logPK <- ifelse(dataset$treatment == treatment,
                          dataset$logPK * runif(1), dataset$logPK)
  return(dataset)
}

emmeanscomparison <- function(dataset, messuprefdata = FALSE) {
  dataset$logPK <- log(dataset$PK)
  if (messuprefdata) {
    dataset <- messupdataset(dataset, "R")
  }

  M <- lm(logPK ~ sequence + subject + period + treatment, data = dataset)
  cat(paste0("\nLSMeans ratio by lm(): \nT/R = ", exp(coef(M)[["treatmentT"]])))

  # Predict for every factor combination, then keep only the observed ones
  newdat <- expand.grid(treatment = levels(dataset$treatment),
                        sequence  = levels(dataset$sequence),
                        subject   = levels(dataset$subject),
                        period    = levels(dataset$period))
  # The predictions get a proper column name so that the group-wise mean
  # below uses each cell's own predictions
  preddata <- cbind(newdat, pred = predict(M, newdat))
  preddatawoNA <- na.omit(left_join(preddata, dataset,
                                    by = c("treatment", "sequence", "subject", "period")))
  # Average per treatment/sequence/period cell, then per treatment
  lsmeansbyhand <-
    preddatawoNA %>%
    group_by(treatment, sequence, period) %>%
    summarize(subjectmean = mean(pred)) %>%
    group_by(treatment) %>%
    summarize(lsmean = exp(mean(subjectmean))) %>%
    mutate(lsmeansratio = lsmean[treatment == 'T'] / lsmean[treatment == 'R']) %>%
    as.data.frame()

  cat(paste0("\nLSMeans by hand: \nT = ", lsmeansbyhand$lsmean[lsmeansbyhand$treatment == 'T']))
  cat(paste0("\nLSMeans by hand: \nR = ", lsmeansbyhand$lsmean[lsmeansbyhand$treatment == 'R']))
  cat(paste0("\nLSMeans ratio by hand: \nT/R = ", lsmeansbyhand$lsmeansratio[1], "\n"))

  emmeansdf <- data.frame(emmeans(M, 'treatment'))
  cat(paste0("\nLSMeans by emmeans: \nT = ", exp(emmeansdf$emmean[emmeansdf$treatment == 'T'])))
  cat(paste0("\nLSMeans by emmeans: \nR = ", exp(emmeansdf$emmean[emmeansdf$treatment == 'R'])))
  cat(paste0("\nLSMeans ratio by emmeans: \nT/R = ",
             exp(emmeansdf[emmeansdf$treatment == 'T', ]$emmean -
                 emmeansdf[emmeansdf$treatment == 'R', ]$emmean), "\n"))
}

# balanced complete
emmeanscomparison(rds11, messuprefdata = FALSE)
set.seed(123)
emmeanscomparison(rds11, messuprefdata = TRUE)

# unbalanced complete
emmeanscomparison(rds16, messuprefdata = FALSE)
set.seed(123)
emmeanscomparison(rds16, messuprefdata = TRUE)

# balanced incomplete
emmeanscomparison(rds26, messuprefdata = FALSE)
set.seed(123)
emmeanscomparison(rds26, messuprefdata = TRUE)

# unbalanced incomplete
emmeanscomparison(rds14, messuprefdata = FALSE)
set.seed(123)
emmeanscomparison(rds14, messuprefdata = TRUE)
```

So I can get the same results for complete cases, but not for INcomplete ones. Please also take a look at the LSMeans for T when R is messed up in the unbalanced complete and unbalanced incomplete data sets. It looks like completeness is the predominant factor.
So my understanding of emmeans is also INcomplete and Unbalanced.

Kind regards,
Mittyri
PharmCat
☆

Russia,
2019-11-03 00:46
(edited by mittyri on 2019-11-03 01:44)

@ Jaimik Patel
Posting: # 20747
Views: 220

## Least square mean calculation for the fully replicate design

Hi Jaimik, Hi all!

» One question for the least square mean calculation for the fully replicate design as per USFDA in SAS.
» ....

I think that when one value is changed, the change is not confined to the formulation; it affects sequence and period as well. If you look at the model coefficients, you will probably find changes in the sequence coefficient. The estimate is calculated as L*β, where L is a vector of known constants. For example, with 2 sequences, 2 periods, and 2 formulations, the β vector has length 4, and L = [1; 1/2; 1/2; 0] for one formulation and L = [1; 1/2; 1/2; 1] for the other. When a value changes, this leads to changes in the sequence part of β, and then in the marginal value of each formulation.

I imagine it like this: one part of the change goes to the mean of the current formulation, and another part goes to sequence and period (because it is one model); and because sequence is crossed with the other formulation, it affects the level of the other formulation as well.
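The L*β calculation can be tried out directly. The following base R sketch assumes a simulated, complete but unbalanced 2x2 crossover and the model `logPK ~ sequence + period + treatment` without a subject term, so that β has exactly the four elements mentioned above (intercept, sequence, period, formulation, in R's default treatment coding). Data and names are invented for illustration.

```r
set.seed(42)
n1 <- 5; n2 <- 3            # hypothetical: 5 subjects in TR, 3 in RT
d <- data.frame(
  sequence = factor(rep(c("TR", "RT"), times = 2 * c(n1, n2)),
                    levels = c("TR", "RT")),
  period   = factor(rep(1:2, times = n1 + n2))
)
d$treatment <- factor(
  ifelse((d$sequence == "TR") == (d$period == "1"), "T", "R"),
  levels = c("R", "T"))
d$logPK <- rnorm(nrow(d), mean = 6, sd = 0.3)

# One row of L per formulation: intercept, 1/2 sequence, 1/2 period, formulation
L <- rbind(R = c(1, 1/2, 1/2, 0),
           T = c(1, 1/2, 1/2, 1))

fit_lsm <- function(d) {
  beta <- coef(lm(logPK ~ sequence + period + treatment, data = d))
  list(beta = beta, lsm = drop(L %*% beta))
}

before <- fit_lsm(d)

# Add 1 to the first T observation only
d2 <- d
i  <- which(d2$treatment == "T")[1]
d2$logPK[i] <- d2$logPK[i] + 1
after <- fit_lsm(d2)

print(round(after$beta - before$beta, 4))   # every coefficient moves
print(round(after$lsm  - before$lsm,  4))   # shift in L*beta for R and T
```

In this sketch all four coefficients move, the sequence coefficient included, yet the shifts cancel in L*β for R, which fits the earlier remark in this thread that the two-way design did not show the effect. Whether they still cancel with subject effects, replicate sequences, or incomplete data is exactly what the thread is probing, and this sketch does not settle it.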

The BIOEQUIVALENCE / BIOAVAILABILITY FORUM is hosted by
Ing. Helmut Schütz