Bioequivalence and Bioavailability Forum

Jaimik Patel
☆

India,
2019-10-31 12:19
(2072 d 20:12 ago)

Posting: # 20737
Views: 7,613

Least square mean calculation for the fully replicate design [General Statistics]

Dear All,

One question for the least square mean calculation for the fully replicate design as per USFDA in SAS.

We performed the statistical analysis using the code provided in the progesterone guideline; the least squares mean values of AUCi for Test and Reference were 6.112006 and 6.111151, respectively.

Unfortunately, after the statistical analysis the sponsor identified an error in the bioanalysis of one sample of the test formulation. Only one value changed: a single time point in one period of one subject. The statistical analysis was therefore repeated, and the least squares mean values of AUCi for Test and Reference became 6.111382 and 6.110883. The change in the value of test product results in the change in LSM value in reference product!!

The question is: the reference product data remained the same, with not a single value changed, so why did the least squares mean of AUCi for the reference product change? Why does a change in the test data affect the reference? :confused:

I also performed the same experiment in a two-way crossover design; there, changing a reference value did not affect the least squares mean of the test product.

Please share your thoughts … :-)

Thanks !!
Jaimik

Edit: Category changed; see also this post #1. [Helmut]

Helmut
★★★

Vienna, Austria,
2019-10-31 15:36
(2072 d 16:55 ago)

@ Jaimik Patel
Posting: # 20738
Views: 6,767

Very, very strange!

Hi Jaimik,

a very interesting observation!

I could confirm that in Phoenix WinNonlin 8.1 (multiplied the first T value of the EMA’s reference data sets by 10). The geometric least squares mean of R changed in data set I (full replicate) but not in data set II (partial replicate).
I also checked the EMA’s methods (simple ANOVA). Similar.

In each pair of rows, the first shows the original data and the second the data with T changed.

─────────────────────────────────
 data set    model: RSABE
                R         T
─────────────────────────────────
    I        2144.00   2479.70
             2143.04   2517.74
    II       2852.54   2917.13
             2852.54   3074.12
─────────────────────────────────

───────────────────────────────────────────────────
 data set    model: ABEL
              Method A             Method B
                R         T         R         T
───────────────────────────────────────────────────
    I        2140.84   2476.07   2143.11   2480.22
             2140.18   2513.57   2142.71   2518.22
    II       2852.54   2917.13   2852.54   2917.13
             2852.54   3210.87   2852.54   3210.87
───────────────────────────────────────────────────

Beyond me. :confused:

PS: Differences between RSABE and ABEL are to be expected in case of incomplete data. For the FDA incomplete data are dropped but kept for the EMA.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Shuanghe
★★

Spain,
2019-10-31 20:43
(2072 d 11:48 ago)

@ Helmut
Posting: # 20739
Views: 6,584

Very, very strange!

Hi Helmut,

❝ I could confirm that in Phoenix WinNonlin 8.1 (multiplied the first T value of the EMA’s reference data sets by 10)...

❝ I 2144.00 2479.70

❝ 2143.04 2517.74

Very strange indeed. I just got the same result in SAS (2144 → 2143.04). I only had time to test data set I. I guess it's probably the RTFM time for us. Maybe Detlew has some insight. I'll check the SAS manual after his comment :-D

—
All the best,
Shuanghe

Helmut
★★★

Vienna, Austria,
2019-10-31 20:50
(2072 d 11:41 ago)

@ Shuanghe
Posting: # 20742
Views: 6,655

Very, very strange!

Hi Shuanghe,

❝ I guess it's probably the RTFM time for us. Maybe Detlew has some insight. I'll check the SAS manual after his comment :-D

Don’t expect anything soon. He is on vacation till mid-November.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Jaimik Patel
☆

India,
2019-11-02 13:26
(2070 d 19:05 ago)

@ Helmut
Posting: # 20746
Views: 6,486

Very, very strange!

Dear All,

Thanks for your valuable inputs.

Is there any role of G-matrix in the calculation of LSM? We have observed that the value of the G-matrix is different in both cases.

Regards,
Jaimik

Edit: Full quote removed. Please delete everything from the text of the original poster which is not necessary in understanding your answer; see also this post #5! [Helmut]

mittyri
★★

Russia,
2019-11-03 02:38
(2070 d 05:53 ago)

@ Jaimik Patel
Posting: # 20748
Views: 6,449

G matrix

Dear Patel,

❝ Is there any role of G-matrix in the calculation of LSM?

Even with a single modified sample, the model coefficients change. The G matrix is the variance-covariance matrix of the random effects; it is used to specify the subject-specific effects.
When the model changes, the covariance matrix changes as well, hence the difference you see. Since PROC MIXED uses iterative algorithms to achieve convergence, changes in a few points can be crucial even if they look insignificant to a reviewer.

—
Kind regards,
Mittyri

ElMaestro
★★★

Denmark,
2019-10-31 20:48
(2072 d 11:43 ago)

@ Jaimik Patel
Posting: # 20740
Views: 6,612

Inner workings of REML

Hi Jaimik,

❝ The change in the value of test product results in the change in LSM value in reference product!!

I think, without being absolutely certain....:

When a mixed model is fitted with ML, you have a straightforward way of estimating the fixed effects. You only need to estimate the variance components iteratively; the fixed effects are then estimated in the same fashion as in a linear (non-mixed) model, i.e., deterministically.

When a model is fitted using REML, which is what happens when you do studies for FDA, then the likelihood depends both on the variance components and on the model effects, but what's worse, the model effects depend on the variance components, so you are not only iterating across the sigmas to find the likelihood optimum, but also need to optimise within the vector of fixed effects. So you can start out with estimates of both the variance components and the fixed effects. Then you first optimise the variance components "a little". Then you optimise the fixed effects "a little". And then you repeat the cycle. That is a safe way of arriving at the optimised REML solution, but I will not in any way say it is the only or the best - I simply do not know enough about matrix likelihood to state anything in this regard.

I imagine that you might not see this phenomenon if you pick ML in your mixed model (which is not what the FDA wants) instead of REML. Can you try and test it?

—
Pass or fail!
ElMaestro

Helmut
★★★

Vienna, Austria,
2019-10-31 20:53
(2072 d 11:39 ago)

@ ElMaestro
Posting: # 20743
Views: 6,674

Even in ANOVA…

Ahoy ElMaestro,

❝ When a mixed model is fitted with ML, [etc. etc.]

That was my first idea as well. But we see a similar effect with the EMA’s Method A above, which is a bloody ANOVA and all effects fixed. I don’t get it.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


zizou
★

Plzeň, Czech Republic,
2019-11-01 22:05
(2071 d 10:26 ago)

@ Helmut
Posting: # 20744
Views: 6,572

Even in 2x2...

❝ ... But we see a similar effect with the EMA’s Method A above, which is a bloody ANOVA and all effects fixed. I don’t get it.

Dear all,

the reason of the difference is that sequences were unbalanced (not equal number of subjects in each of the sequences).
As you know (for ln-transformed data), the estimated marginal means (i.e., least squares means in SAS terminology) differ from the arithmetic means of T and R if the sequences are unbalanced.
The calculation of the estimated marginal means involves some modification of the "standard" marginal means. I'm too lazy to go through the matrix algebra (which is used by most software), so simply: the marginal means are "corrected" to estimated marginal means by using the difference between the mean of all data and the mean of the marginal means of T and R (when the sequences are balanced, this difference is zero).

Personally, I would not report these least squares means at all. If you are calculating an ANOVA (e.g., by GLM), you get the point estimate directly, i.e., without calculating least squares means.

For unbalanced sequences there might be more questions ... together with misleading terminology, e.g., "geometric means ratio": for unbalanced sequences, the ratio of geometric means reported in the descriptive statistics differs from the geometric means ratio reported with the 90% confidence interval, which is also very, very strange. As these ratios are different, someone wants to have the "correct" geometric means of T and R for which the ratio T/R equals the point estimate (i.e., someone wants "geometric least squares means"). Nevertheless, geometric means are geometric means! Behind geometric least squares means I see only values of "T" and "R" which give the point estimate when divided. But as was pointed out, the least squares mean of T can be affected by R, and vice versa.
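
To illustrate the unbalanced case, here is a minimal base-R sketch on simulated data (the sample sizes and effect sizes are all made up). It assumes a complete 2×2 crossover with all effects fixed, where the least squares means reduce to the unweighted average of the sequence-wise treatment means, and compares the ratio of geometric means from the descriptive statistics with the ratio of the "geometric least squares means" and with the point estimate of the model.

```r
# Simulated 2x2 crossover with unbalanced sequences (all numbers are made up)
set.seed(7)
n_TR <- 4; n_RT <- 8
d <- data.frame(
  subject  = factor(rep(seq_len(n_TR + n_RT), each = 2)),
  sequence = rep(c("TR", "RT"), times = 2 * c(n_TR, n_RT)),
  period   = rep(1:2, times = n_TR + n_RT)
)
# treatment given in a period = character of the sequence at that position
d$treatment <- factor(substr(d$sequence, d$period, d$period), levels = c("R", "T"))
subj_eff <- rnorm(n_TR + n_RT, sd = 0.3)
d$logPK <- 6 + subj_eff[as.integer(d$subject)] +
  0.30 * (d$period == 2) +         # period effect
  0.05 * (d$treatment == "T") +    # treatment effect
  rnorm(nrow(d), sd = 0.05)
d$period <- factor(d$period)

# ANOVA with all effects fixed; the treatment coefficient is the point estimate (log scale)
fit <- lm(logPK ~ subject + period + treatment, data = d)
pe  <- unname(coef(fit)["treatmentT"])

arith <- tapply(d$logPK, d$treatment, mean)                    # arithmetic means of the logs
cells <- tapply(d$logPK, list(d$treatment, d$sequence), mean)  # treatment-by-sequence means
lsm   <- rowMeans(cells)                                       # equal weight per sequence

exp(arith["T"] - arith["R"])   # ratio of geometric means (descriptive statistics)
exp(lsm["T"] - lsm["R"])       # ratio of "geometric least squares means"
exp(pe)                        # point estimate from the model
```

Under these assumptions the last two ratios coincide, whereas the first one differs whenever the sequences are unbalanced and there is a period effect.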

Best regards,
zizou

mittyri
★★

Russia,
2019-11-02 03:26
(2071 d 05:06 ago)

@ zizou
Posting: # 20745
Views: 6,582

trying to understand emmeans

Dear Zizou, dear all,

❝ the reason of the difference is that sequences were unbalanced (not equal number of subjects in each of the sequences).

It looks like that is not the dominant reason.

❝ For unbalanced sequences there might be more questions ...

when the party comes to incomplete cases, everyone has more fun!

I am a lazy man too, but here is some code for your pleasure:

library(dplyr)
library(nlme)         # loaded in the original post; not used below
library(replicateBE)  # reference data sets rds11, rds14, rds16, rds26
library(emmeans)

# Multiply all logPK values of one treatment by a single random factor
messupdataset <- function(dataset, treatment = 'R'){
  dataset$logPK <- ifelse(dataset$treatment == treatment, dataset$logPK * runif(1), dataset$logPK)
  return(dataset)
}

emmeanscomparison <- function(dataset, messuprefdata = FALSE){
  dataset$logPK <- log(dataset$PK)
  if (isTRUE(messuprefdata)) {
    dataset <- messupdataset(dataset, "R")
  }

  # ANOVA with all effects fixed
  M <- lm(logPK ~ sequence + subject + period + treatment, data = dataset)
  cat(paste0("\nLSMeans ratio by lm(): \nT/R = ", exp(coef(M)[["treatmentT"]])))

  # predict for every factor combination, then keep only the observed ones
  newdat <- expand.grid(treatment = levels(dataset$treatment),
                        sequence = levels(dataset$sequence),
                        subject = levels(dataset$subject),
                        period = levels(dataset$period))
  preddata <- cbind(newdat, pred = predict(M, newdat))
  preddatawoNA <- na.omit(left_join(preddata, dataset, by = c("treatment", "sequence", "subject", "period")))

  # average the predictions within treatment/sequence/period cells, then over cells
  lsmeansbyhand <-
    preddatawoNA %>%
    group_by(treatment, sequence, period) %>%
    summarize(subjectmean = mean(pred)) %>%
    group_by(treatment) %>%
    summarize(lsmean = exp(mean(subjectmean))) %>%
    mutate(lsmeansratio = lsmean[treatment == 'T'] / lsmean[treatment == 'R']) %>%
    as.data.frame()

  cat(paste0("\nLSMeans by hand: \nT = ", lsmeansbyhand$lsmean[lsmeansbyhand$treatment == 'T']))
  cat(paste0("\nLSMeans by hand: \nR = ", lsmeansbyhand$lsmean[lsmeansbyhand$treatment == 'R']))
  cat(paste0("\nLSMeans ratio by hand: \nT/R = ", lsmeansbyhand$lsmeansratio[1], "\n"))

  # the same quantities from emmeans
  emmeansdf <- data.frame(emmeans(M, 'treatment'))
  cat(paste0("\nLSMeans by emmeans: \nT = ", exp(emmeansdf$emmean[emmeansdf$treatment == 'T'])))
  cat(paste0("\nLSMeans by emmeans: \nR = ", exp(emmeansdf$emmean[emmeansdf$treatment == 'R'])))
  cat(paste0("\nLSMeans ratio by emmeans: \nT/R = ", exp(emmeansdf[emmeansdf$treatment == 'T', ]$emmean - emmeansdf[emmeansdf$treatment == 'R', ]$emmean), "\n"))
}

# balanced complete
emmeanscomparison(rds11, messuprefdata = FALSE)
set.seed(123)  # the original 'seed = 123' only assigned a variable
emmeanscomparison(rds11, messuprefdata = TRUE)

# unbalanced complete
emmeanscomparison(rds16, messuprefdata = FALSE)
set.seed(123)
emmeanscomparison(rds16, messuprefdata = TRUE)

# balanced incomplete
emmeanscomparison(rds26, messuprefdata = FALSE)
set.seed(123)
emmeanscomparison(rds26, messuprefdata = TRUE)

# unbalanced incomplete
emmeanscomparison(rds14, messuprefdata = FALSE)
set.seed(123)
emmeanscomparison(rds14, messuprefdata = TRUE)

So I can get the same results for complete cases, but not for incomplete ones. Please also take a look at the LSMeans for T when R is messed up, for the unbalanced complete and the unbalanced incomplete data sets. It looks like completeness plays the predominant role.
So my understanding of emmeans is also INcomplete and Unbalanced :-D
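
For what it's worth, the role of completeness can be reproduced without any package. The following is only a sketch under stated assumptions: simulated TRTR/RTRT data (not the EMA data sets), all effects fixed, and LSMs computed by hand with equal weights for sequences and periods, which I assume mimics what LSMEANS does in SAS. The function names make_data(), lsm(), and change_T() are made up for this illustration.

```r
# Simulated full replicate design (TRTR / RTRT); all numbers are made up
set.seed(42)
make_data <- function(n1, n2) {
  seqs <- c(rep("TRTR", n1), rep("RTRT", n2))
  d <- expand.grid(period = 1:4, subject = seq_along(seqs))  # period varies fastest
  d$sequence  <- seqs[d$subject]
  d$treatment <- substr(d$sequence, d$period, d$period)
  d$logPK <- 6 + rep(rnorm(length(seqs), sd = 0.3), each = 4) + rnorm(nrow(d), sd = 0.2)
  d
}

# LSM of one treatment by hand: predict every subject in every period under
# that treatment, then average within subject, within sequence, across sequences
lsm <- function(d, trt) {
  d$subject   <- factor(d$subject)
  d$period    <- factor(d$period)
  d$treatment <- factor(d$treatment, levels = c("R", "T"))
  fit  <- lm(logPK ~ subject + period + treatment, data = d)  # sequence is aliased with subject
  grid <- expand.grid(subject = levels(d$subject), period = levels(d$period))
  grid$treatment <- factor(trt, levels = c("R", "T"))
  grid$pred      <- predict(fit, newdata = grid)
  grid$sequence  <- d$sequence[match(grid$subject, d$subject)]
  by_subj <- aggregate(pred ~ subject + sequence, data = grid, FUN = mean)
  by_seq  <- aggregate(pred ~ sequence, data = by_subj, FUN = mean)
  mean(by_seq$pred)
}

# Change a single T value (subject 1, period 1): PK multiplied by 10
change_T <- function(d) {
  i <- which(d$subject == 1 & d$period == 1)
  d$logPK[i] <- d$logPK[i] + log(10)
  d
}

complete   <- make_data(n1 = 4, n2 = 6)   # unbalanced but complete
incomplete <- complete[!(complete$subject == 1 & complete$period == 4), ]  # one R value missing

shift_complete   <- lsm(change_T(complete), "R")   - lsm(complete, "R")
shift_incomplete <- lsm(change_T(incomplete), "R") - lsm(incomplete, "R")
c(complete = shift_complete, incomplete = shift_incomplete)
```

With complete data the shift of the LSM of R is zero up to rounding, even though the sequences are unbalanced (4 vs. 6 subjects); after dropping a single R value, the same change in T moves the LSM of R. This sketch does not cover the mixed model used for the FDA.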

—
Kind regards,
Mittyri

PharmCat
★

Russia,
2019-11-03 01:46
(2070 d 06:45 ago)

@ Jaimik Patel
Posting: # 20747
Views: 6,483

Least square mean calculation for the fully replicate design

Hi Jaimik, Hi all!

❝ One question for the least square mean calculation for the fully replicate design as per USFDA in SAS.

❝ ....

❝ Please share your thoughts … :-)

I think that changing one value did not change the formulation effect only; it also changed the sequence and period effects. If you look at the model coefficients, you will probably find a change in the sequence coefficient. An estimate is calculated as L*β, where L is a vector of known constants. For example, with 2 sequences, 2 periods, and 2 formulations, the β vector has length 4; for one formulation L = [1; 1/2; 1/2; 0] and for the other L = [1; 1/2; 1/2; 1]. When a value changes, it leads to a change in the sequence part of β and hence in the marginal value of each formulation.

I imagine it like this: part of the change goes to the mean of the changed formulation, and part goes to sequence and period (because it is one model); since sequence is crossed with the other formulation, it can affect the level of the other formulation as well.
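
The L*β calculation can be written out for the 2×2 example. This is only a sketch on simulated data; it assumes treatment (dummy) coding with β = (intercept, sequence, period, formulation) and the reference formulation as baseline, so that L = [1; 1/2; 1/2; 0] belongs to R. The helper name lsm_from_beta() is hypothetical.

```r
# Simulated 2x2 crossover, sequences TR and RT (all numbers are made up)
set.seed(1)
n <- 6                                        # subjects per sequence
d <- data.frame(
  subject  = rep(1:(2 * n), each = 2),
  sequence = rep(c("TR", "RT"), each = 2 * n),
  period   = rep(1:2, times = 2 * n)
)
d$treatment <- substr(d$sequence, d$period, d$period)
d$logPK     <- 6 + 0.05 * (d$treatment == "T") + rnorm(nrow(d), sd = 0.2)
d$sequence  <- factor(d$sequence, levels = c("TR", "RT"))
d$period    <- factor(d$period)
d$treatment <- factor(d$treatment, levels = c("R", "T"))  # R is the baseline

L_R <- c(1, 1/2, 1/2, 0)   # intercept, sequence, period, formulation
L_T <- c(1, 1/2, 1/2, 1)

# Fit the four-parameter model and return beta together with both LSMs
lsm_from_beta <- function(d) {
  beta <- coef(lm(logPK ~ sequence + period + treatment, data = d))
  c(beta, LSM_R = sum(L_R * beta), LSM_T = sum(L_T * beta))
}

before <- lsm_from_beta(d)
d2 <- d
d2$logPK[1] <- d2$logPK[1] + log(10)   # row 1: subject 1, period 1, treatment T
after <- lsm_from_beta(d2)

round(after - before, 6)   # shift in each coefficient and in both LSMs
```

In this run every coefficient moves, the sequence coefficient included. However, in the complete 2×2 design the four cell means are fitted exactly, so the shifts cancel in L*β for R and only the LSM of T changes. Whether they cancel in other designs depends on the model and on the data being complete.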

Edit: Unnecessary quote removed. Please delete everything from the text of the original poster which is not necessary in understanding your answer; see also this post #5! [Mittyri]
