Just forget ANOVA. What next? [RSABE / ABEL]

posted by ElMaestro  – Denmark, 2012-10-15 12:04 (4632 d 19:31 ago) – Posting: # 9416
Views: 11,221

Dear d_labes,

❝ 2. First you told me: Forget type III. Now you tell me: Forget ANOVA. What next to forget?

Erm... this is a matter of personal preference. The parking ticket you got last week? The doctor's order to trash the cigarettes? Wife's birthday?

❝ 3. I personally have not and will not trust in any model parameters (aka effect estimates) from a model which doesn't fit. And not getting one ore more coefficients of the model (getting '.' in SAS or 'NA' in R-speak) is for me an indication of no fit. A strong indication.


No fit isn't in question. All the reasonable models you can think of in this regard will fit.
I understand your scepticism (scepticalism? sceptolophysm? Dammit my English is getting worse with time, sorry). Anyways, let me show you an example in R to illustrate what I mean. This is just a made up dataset for a 2-seq, 2-treat, 2-per BE study with three subjects in each sequence:
lnPKdata=c(18,19,23,22,27,18,16,15,20,20, 26,17)
Per=as.factor(c(1,2,1,2,1,2,1,2,1,2,1,2))
Seq=as.factor(c("TR", "TR", "TR", "TR", "TR", "TR", "RT", "RT", "RT","RT", "RT", "RT"))
Trt=as.factor(c("T", "R","T", "R","T", "R","R", "T","R", "T","R", "T"))
Subj=as.factor(c(1,1,2,2,3,3,4,4,5,5,6,6))


Let's say you do a classical thing like:
M=lm(lnPKdata~Trt+Seq+Subj+Per)

Now, in this model there is no effect for Treatnment T, but this doesn't mean it doesn't exist or didn't fit: It just happens that we wanted an intercept, so we cannot have both a column for treatment T and treatment R in the model matrix. We just get one of them (in this case R makes the choice, it will be the effect for TrtT). So what about TrtR? Well, an observation that isn't associated with TrtT mst be TrtR. And since the intercept is a column of 1's, we can subtract the column for TrtT from the column of the intercept and voila there we have a column that would indicate TrtR. So, hokus pokus and rabbits out of a hat, try and subtract the effect for TrtT from the intercept (effect).
You can fit the same model without intercept and then you can extract the effects for TrtT and TrtR more directly.

A more complicated example to illustrate the logical beauty of model matrices, estimability and matrix algebra:

From the fit above we get:
Coefficients:
(Intercept)         TrtT        SeqTR        Subj2        Subj3        Subj4 
    23.1667      -0.1667      -3.0000       4.0000       4.0000      -6.0000 
      Subj5        Subj6         Per2 
    -1.5000           NA      -3.1667


Eh.... So the effect for Subject 6 doesn't exist? Is not estimable??
Well, it does exist but from the way the model is specified and the way R constructs a model matrix, it is not fee to vary. Try and look at the internal model matrix:

model.matrix(M)
   (Intercept) TrtT SeqTR Subj2 Subj3 Subj4 Subj5 Subj6 Per2
1            1    1     1     0     0     0     0     0    0
2            1    0     1     0     0     0     0     0    1
3            1    1     1     1     0     0     0     0    0
4            1    0     1     1     0     0     0     0    1
5            1    1     1     0     1     0     0     0    0
6            1    0     1     0     1     0     0     0    1
7            1    0     0     0     0     1     0     0    0
8            1    1     0     0     0     1     0     0    1
9            1    0     0     0     0     0     1     0    0
10           1    1     0     0     0     0     1     0    1
11           1    0     0     0     0     0     0     1    0
12           1    1     0     0     0     0     0     1    1


Look at the column for subject 6. Then mentally, subtract SeqTR from Intercept, and subtract Subj5 and Subj4 from that. What do you get? Subject 6! Therefore, there is collinearity in the above model matrix and the rank (the df's) cannot be 9 (=the number of columns; you can try and qr the model matrix to confirm that the rank is 8 rather than 9). The model matrix displayed here is thus not the full rank matrix the linear model requires so we need to decrease by a df which in this case is subject 6. Thus to get the effect for Subj 6, even though the model could not "fit" it, we just add and subtract a little of the other effects to get it.

Neat, eh? Just really stupid that R displays a model matrix that is not full rank. It is not very pedagogical.

I guess all in all we can say:
1. The model specification determines the model matrix.
2. The model matrix in a linear model must be full column rank.
3. Fitting with an intercept steals exactly one df from some other factor. Usually the first factor mentioned but this depends on the language implementation.
4. The order of factors may determine the number of df's associated with each of them in a standard fit.
5. We can get effects from treatments that appear gone and lost, by playing around with the effects until we get a match of model latrix columns for the effect we are interested in.

Now back to the seven seas.
:pirate:








:pirate:

Pass or fail!
ElMaestro

Complete thread:

UA Flag
Activity
 Admin contact
23,424 posts in 4,927 threads, 1,671 registered users;
28 visitors (0 registered, 28 guests [including 8 identified bots]).
Forum time: 07:35 CEST (Europe/Vienna)

Medical researches can be divided into two sorts:
those who think that meta is better and those
who believe that pooling is fooling.    Stephen Senn

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz
HTML5