Bioequivalence and Bioavailability Forum • Imbalance + Type III SS = Tricky for the sequence evaluation

ElMaestro
★★★

Denmark,
2009-09-07 23:16

Posting: # 4161
Views: 10,591

Imbalance + Type III SS = Tricky for the sequence evaluation [🇷 for BE/BA]

Dear all,

Following some previous discussion...

drop1 in R gives an SS of roughly zero for the sequence effect (plus or minus something smaller than the numerical tolerance of lm).
With sequence imbalance, type I SS and the like will not save us when we want output that resembles the type III output from SAS*. If we really want a meaningful evaluation of the sequence effect and still want to use type III SS, we need to play around a little. The following is an elmaestrolophystic attempt at getting it right. drop1 for Sequence makes little sense in itself because the Subject factor is still in the model. A meaningful type III SS for Sequence is therefore obtained by comparing the residual SS from a model with Per and Trt against the residual SS from a model with Per, Trt and Seq.

So here's my ugly proposal:
Lm1=lm(lnAuc~Per+Trt+Subj+Seq) ## standard model, right?
T3A=drop1(Lm1, test="F") ## this is our type III anova which gives a dumb Seq SS

T3A[5,2] = anova(lm(lnAuc~Per+Trt))$Sum[3] - anova(lm(lnAuc~Per+Trt+Seq))$Sum[4]
## manually corrects the Seq SS according to the text above

The rest is then plain sailing with conversion of the newly acquired Seq SS to the mean square, followed by evaluation against the between-Subj error.
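
That last step can be sketched as follows. This is only an illustration under assumptions: it reuses the small toy data set posted later in this thread, the object names (ss_seq, f_seq and so on) are made up here, and the Subj row of the drop1 table is taken as the between-subject error.

```r
# Toy data reused from the small example posted later in this thread
Subj  <- factor(c(1,2,3,4,5,6,7,1,2,3,4,5,6,7))
Seq   <- factor(c(1,1,1,1,2,2,2,1,1,1,1,2,2,2))
Per   <- factor(c(1,1,1,1,1,1,1,2,2,2,2,2,2,2))
Trt   <- factor(c(1,1,1,1,2,2,2,2,2,2,2,1,1,1))
lnAuc <- c(10,11,12,8,7,8,9,10,11,12,9,10,12,10)

Lm1 <- lm(lnAuc ~ Per + Trt + Subj + Seq)
T3A <- drop1(Lm1, test = "F")

# Seq SS: residual SS without Seq minus residual SS with Seq (no Subj in either model)
ss_seq <- deviance(lm(lnAuc ~ Per + Trt)) - deviance(lm(lnAuc ~ Per + Trt + Seq))
df_seq <- nlevels(Seq) - 1

# Between-subject error: the Subj row of the drop1 table
ss_subj <- T3A["Subj", "Sum of Sq"]
df_subj <- T3A["Subj", "Df"]

# Mean squares, F ratio and p-value for Seq tested against Subj(Seq)
f_seq <- (ss_seq / df_seq) / (ss_subj / df_subj)
p_seq <- pf(f_seq, df_seq, df_subj, lower.tail = FALSE)
print(c(F = f_seq, p = p_seq))
```

Rows and columns are addressed by name here rather than by position, so the sketch does not depend on the order of the terms in the drop1 table.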

Best regards
EM.

*: This is just a reflection over the fact that some people want to be able to reproduce the SAS type III output. I am not claiming type III SS are superior; that aspect is better dealt with by others.

yjlee168
★★★

Kaohsiung, Taiwan,
2009-09-08 00:23

@ ElMaestro
Posting: # 4162
Views: 8,815

Imbalance + Type III SS = Tricky for the sequence evaluation

Dear Elmaestro,

I've tested your code and got the following.

  Dependent Variable: Cmax

Type I SS
Analysis of Variance Table

Response: Cmax
          Df Sum Sq Mean Sq F value Pr(>F)
prd        1  37889   37889  0.8844 0.3656
drug       1  88706   88706  2.0705 0.1757
subj(seq) 13 720367   55413  1.2934 0.3313
Residuals 12 514123   42844

Type III SS
Single term deletions

Model:
Cmax ~ prd + drug + subj + seq
          Df Sum of Sq     RSS     AIC F value  Pr(F)
<none>                  514123     307
prd        1     37889  552013     307  0.8844 0.3656
drug       1     88706  602830     309  2.0705 0.1757
subj(seq) 12    719310 1233434     307  1.3991 0.2849
seq        0      1057  514123     307

Tests of Hypothesis for SUBJECT(SEQUENCE) as an error term

Error: subj
          Df Sum Sq Mean Sq F value Pr(>F)
prd:drug   1   1057    1057  0.0176 0.8966
Residuals 12 719310   59943

Error: Within
          Df Sum Sq Mean Sq F value Pr(>F)
prd        1  37889   37889  0.8844 0.3656
drug       1  88706   88706  2.0705 0.1757
Residuals 12 514123   42844

Do you find anything different from the previous thread? Something may be wrong there... this is just a quick response.

❝ So here's my ugly proposal:
❝ Lm1=lm(lnAuc~Per+Trt+Subj+Seq) ## standard model, right?
❝ T3A=drop1(Lm1, test="F") ## this is our type III anova which gives a dumb Seq SS
❝ T3A[5,2] = anova(lm(lnAuc~Per+Trt))$Sum[3] - anova(lm(lnAuc~Per+Trt+Seq))$Sum[4]
❝ ## manually corrects the Seq SS accordingly to the text above

I translated your code as follows.

cat("  Dependent Variable: Cmax\n")
cat("\n")
cat("Type I SS\n")
## note: the fitted model object is named Cmax, like the response column in TotalData
Cmax<- lm(Cmax ~ prd + drug + subj + seq, data=TotalData)
BearAnova=anova(Cmax)
row.names(BearAnova)[3]="subj(seq)"
show(BearAnova)
cat("\n")
cat("Type III SS\n")
Cmax_drop1 <- drop1(Cmax, test="F")
row.names(Cmax_drop1)[4]="subj(seq)"
## replace the Seq SS in row 5 with the difference of the two residual SS
Cmax_drop1[5,2]= anova(lm(Cmax ~ prd + drug, data=TotalData))$Sum[3] -
  anova(lm(Cmax ~ prd + drug + seq, data=TotalData))$Sum[4]
show(Cmax_drop1)
cat("\n")

Am I doing anything stupid here? Thanks.

—
All the best,
-- Yung-jin Lee
bear v2.9.2:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan https://www.pkpd168.com/bear
Download link (updated) -> here

ElMaestro
★★★

Denmark,
2009-09-08 18:49

@ yjlee168
Posting: # 4170
Views: 8,721

Imbalance + Type III SS = Tricky for the sequence evaluation

Dear yjlee,

❝ Am I doing anything stupid here? Thanks.

I am sure you are not. But I lost it slightly here, could you explain to me what your concern is?

Thanks and best regards,
EM.

yjlee168
★★★

Kaohsiung, Taiwan,
2009-09-08 22:02

@ ElMaestro
Posting: # 4171
Views: 8,801

Imbalance + Type III SS = Tricky for the sequence evaluation

Dear Elmaestro,

Firstly, if you look at this previous post and compare it with what I got here (the type I SS part), you will find that R calculates type I SS differently for the model (Cmax ~ seq + prd + drug + subj) and the model (Cmax ~ prd + drug + subj + seq). Apparently, the order in which the fixed variables are listed can change the type I SS. The seq term disappeared! Amazing, another finding about lm() in R. Secondly, because of the different type I SS calculation, anova() listed these fixed variables in a different order. That's why it took me a lot of time to locate "subj(seq)". That's weird. I checked the R on-line help (chm), which says

"quoted...The models fit by, e.g., the lm and glm functions are specified in a compact symbolic form. The ~ operator is basic in the formation of such models. An expression of the form y ~ model is interpreted as a specification that the response y is modelled by a linear predictor specified symbolically by model. Such a model consists of a series of terms separated by + operators. The terms themselves consist of variable and factor names separated by : operators. Such a term is interpreted as the interaction of all the variables and factors appearing in the term..."

in "formula". And also

"quote...Models for lm are specified symbolically. A typical model has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response. A terms specification of the form first + second indicates all the terms in first together with all the terms in second with duplicates removed. A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second..."

in the Details section of lm(stats). The model that you proposed was written as (Cmax ~ prd + drug + subj + seq). With this model R automatically drops the variable (or factor) seq when calculating the type I SS, but not with the previous model (Cmax ~ seq + prd + drug + subj) or others. I'm playing with lm() right now, listing the fixed variables in different orders, to see what I can get. Interesting, huh?

❝ I am sure you are not. But I lost it slightly here, could you explain me what your concern is?


ElMaestro
★★★

Denmark,
2009-09-08 22:28

@ yjlee168
Posting: # 4172
Views: 8,819

Sequential phenomena and some R code

Dear yjlee,

❝ Apparently, the list "sequence" of fixed variables can result in differences for type I SS. The seq was disappeared! Amazing thing, another finding in lm() of R.

R's default method of doing the anova in conjunction with the model is the type I approach, which means it fits the factors one on top of the other.
So when you have a model like lm(Y~A+B+C ... etc), R first notes the total variation in the raw data (the sum of squares about the mean). This is the null residual. Then it fits a model with only A as a factor and notes the new residual. The difference between the null residual and the new residual is ascribed to factor A. Then it fits a model with A and B as factors and notes the new residual. The difference between this and the previous residual is ascribed to factor B, and so forth. This is what type I SS are about. They are also called sequential, because the magnitude of the SS for a given factor may depend on the order in which it is included in the model, so an anova on lm(Y~A+B+C) may not be the same as the anova on lm(Y~C+A+B).
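
The sequential bookkeeping described above can be checked by hand. The following is a minimal sketch that reuses part of the toy data set posted further down; the helper name seq_ss is made up for illustration. Each type I SS is computed as the drop in residual SS when one more term is added, and then compared with what anova() reports.

```r
Per   <- factor(c(1,1,1,1,1,1,1,2,2,2,2,2,2,2))
Trt   <- factor(c(1,1,1,1,2,2,2,2,2,2,2,1,1,1))
lnAuc <- c(10,11,12,8,7,8,9,10,11,12,9,10,12,10)

# Hypothetical helper: drop in residual SS when going from a smaller to a larger model
seq_ss <- function(smaller, larger) deviance(smaller) - deviance(larger)

m0  <- lm(lnAuc ~ 1)          # null model: residual SS = total SS about the mean
mP  <- lm(lnAuc ~ Per)
mPT <- lm(lnAuc ~ Per + Trt)
mT  <- lm(lnAuc ~ Trt)
mTP <- lm(lnAuc ~ Trt + Per)

# Sequential SS by hand, for both orders of the two factors
by_hand_PT <- c(Per = seq_ss(m0, mP), Trt = seq_ss(mP, mPT))
by_hand_TP <- c(Trt = seq_ss(m0, mT), Per = seq_ss(mT, mTP))

# The same quantities as reported by anova()
from_anova_PT <- anova(mPT)[c("Per", "Trt"), "Sum Sq"]
from_anova_TP <- anova(mTP)[c("Trt", "Per"), "Sum Sq"]

print(rbind(by_hand_PT, from_anova_PT))
print(rbind(by_hand_TP, from_anova_TP))
```

In this unbalanced example the SS ascribed to Per depends on whether Per enters before or after Trt, while the residual SS of the full model is the same for both orders.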

❝ R automatically drops the variable (or factor) seq out of its included with this model in calculation type I SS, but not with the previous of (Cmax ~ seq + prd + drug + subj) or others. I'm playing with lm() right now with different list sequences of fixed variables to see what I can get. Interesting, uh?

Given the nature of type I SS you will see that when Seq is included after Subj there is no (addition to) the SS, because of the good old "Subjects nested in Sequence". Thus the neglect. Hence you can try and include Seq before Subj - and then suddenly you have a Seq SS using type I SS. Think about it, it actually makes sense.

The following elmaestrolophystic (probably bugged!) code illustrates it.

Subj=as.factor(c(1,2,3,4,5,6,7,1,2,3,4,5,6,7))
Seq=as.factor (c(1,1,1,1,2,2,2,1,1,1,1,2,2,2))
Per=as.factor (c(1,1,1,1,1,1,1,2,2,2,2,2,2,2))
Trt=as.factor (c(1,1,1,1,2,2,2,2,2,2,2,1,1,1))
lnAuc= c(10,11,12,8,7,8,9,10,11,12,9,10,12,10)
## a VERY small BE study!

Lm1=lm(lnAuc~Per+Trt+Subj+Seq)
Lm2=lm(lnAuc~Per+Trt+Seq+Subj)
anova(Lm1)
anova(Lm2)
## no Seq SS because Seq comes after Subj for Lm1
## all of a sudden there is a Seq SS for Lm2 - note the actual value!

## but we want type III SS because we worship the holy power!
## and since this is an unbalanced study type I is not same as type III SS
## therefore we try drop1
drop1(Lm1, test="F")
## ouch! The Seq SS is zero because when Seq is dropped Subj is already in
## so dropping Seq is the same as dropping nothing
## therefore zero SS for Seq
## panic panic!

## Solution: We make a drop of Seq from a model which does not include Subj
## from this we extract the Seq SS
T3A=drop1(Lm1, test="F")
T3A[5,2] = anova(lm(lnAuc~Per+Trt+Seq))$Sum[3]
## This is a smart way of doing it - it works because of the type I sequence!
T3A
## and there we go!
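
As a cross-check, the difference of residual SS proposed in the first post and the sequential Seq SS used just above should give the same number. A minimal sketch on the same toy data, assuming only base R; the names route1 and route2 are made up here, and the full column name "Sum Sq" is used instead of the partial match $Sum.

```r
Seq   <- factor(c(1,1,1,1,2,2,2,1,1,1,1,2,2,2))
Per   <- factor(c(1,1,1,1,1,1,1,2,2,2,2,2,2,2))
Trt   <- factor(c(1,1,1,1,2,2,2,2,2,2,2,1,1,1))
lnAuc <- c(10,11,12,8,7,8,9,10,11,12,9,10,12,10)

with_seq    <- anova(lm(lnAuc ~ Per + Trt + Seq))
without_seq <- anova(lm(lnAuc ~ Per + Trt))

# Route 1 (first post): residual SS without Seq minus residual SS with Seq
route1 <- without_seq["Residuals", "Sum Sq"] - with_seq["Residuals", "Sum Sq"]

# Route 2 (this post): sequential SS of Seq when it enters last
route2 <- with_seq["Seq", "Sum Sq"]

print(all.equal(route1, route2))
```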

—
Pass or fail!
ElMaestro

yjlee168
★★★

Kaohsiung, Taiwan,
2009-09-08 23:36
(5761 d 15:55 ago)

@ ElMaestro
Posting: # 4173
Views: 8,846

Sequential phenomena and some R code

Dear Elmaestro,

❝ [...]

❝ Given the nature of type I SS you will see that when Seq is included after Subj there is no (addition to) the SS, because of the good old "Subjects nested in Sequence". Thus the neglect. Hence you can try and include Seq before Subj - and then suddenly you have a Seq SS using type I SS. Think about it, it actually makes sense.

O.k., does this sequential phenomenon also occur in SAS when it calculates type I SS? The suggested SAS code for a 2x2x2 crossover is

PROC GLM DATA = KK;
CLASS SUBJ SEQ PER TRT;
MODEL LAUCT LAUCI LCMAX = SEQ SUBJ(SEQ) PER TRT; /* <-- */
TEST H = SEQ E = SUBJ(SEQ) / HTYPE=3 ETYPE=3;
ESTIMATE "A vs. B" TRT 1 -1;
LSMEANS TRT;
RUN;

Looks like we need to put seq before subj...

❝ The following elmaestrolophystic (probably bugged!) code illustrates it.

The code works great and is quite illustrative. Many thanks. :clap:


ElMaestro
★★★

Denmark,
2009-09-08 23:43

@ yjlee168
Posting: # 4174
Views: 8,740

Default in SAS

Dear yjlee,

❝ O.k., does this sequential phenomenon occur with SAS for type I SS calculation? The suggested SAS codes for 2x2x2 crossover is

(...)

❝ Looks like that we need to put seq before subj...

Yes to the first question. Type I is defined that way, so you will see the same phenomenon with SAS type I (unless SAS does something clever, as it does for the single-term deletion stuff; that I don't know, as I do not use SAS).

The SAS default I think includes type I and type III in the output. You can put Seq before Subj in type I analyses if you prefer. As explained previously that will ensure you get 'something' for the Seq SS.
