Let's skip the fancy nesting syntax! [🇷 for BE/BA]

posted by ElMaestro  – Denmark, 2009-08-26 22:54 (5718 d 03:15 ago) – Posting: # 4118

Ahoy all,

I have read and I have learned. Therefore, I hereby declare war on syntactical misnomers like "Subject %in% Sequence" or "Subject:Sequence".
In case you want to know why, then read on, otherwise please continue drinking your coffee or reading your newspaper. The following is just another elmaestrolophystic opinion, which means it is written by someone with a brain the size of a walnut, most of it is probably wrong, based on misunderstandings and miscalculations, or just reflecting Belgian seamanship.

Most people doing BE know that in crossover studies a given subject is uniquely assigned to a specific sequence. This is a type of nesting, and there is a certain schwung about speaking of 'subjects nested in sequence'. It sounds good. It sounds fancy. It might even impress the cute blonde two offices down the corridor if I speak bull like that loud enough. Certainly there are some reason...
Let's look at it from R's perspective, and let me discuss two ways of specifying subjects:
  1. We can assign each subject in the trial a unique identifier (like 1, 2...n where n is the total number of subjects).
  2. For each sequence we can give each subject a unique identifier (like 1, 2...n where n is the total number of subjects in that sequence). This implies that there could be a subject in sequence 1 that is called "1" and there could at the same time be one in sequence 2 called "1". And so forth.
Both are fully valid ways of working with subjects, and R provides functionality for both ways.
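To make the two labelling schemes concrete, here is a minimal sketch in R. The design (a 2x2 crossover with 6 subjects, 3 per sequence) and all column names (`SubjUniq`, `SubjInSeq`, and so on) are made up for illustration; they are not from any real study.

```r
# Hypothetical 2x2 crossover: 6 subjects, 3 per sequence (all names are assumptions)
n_per_seq <- 3
design <- data.frame(
  Seq       = factor(rep(c("RT", "TR"), each = 2 * n_per_seq)),
  Per       = factor(rep(1:2, times = 2 * n_per_seq)),
  SubjUniq  = factor(rep(1:(2 * n_per_seq), each = 2)),            # option 1: 1..6 across the trial
  SubjInSeq = factor(rep(rep(1:n_per_seq, each = 2), times = 2))   # option 2: 1..3 within each sequence
)
# Treatment follows from sequence and period: RT gets R first, TR gets T first
design$Trt <- factor(ifelse((design$Seq == "RT") == (design$Per == "1"), "R", "T"))

# Under option 2 the same label turns up in both sequences
table(design$SubjInSeq, design$Seq)

# Only the combination of label and sequence identifies a person under option 2
nlevels(design$SubjUniq)
nlevels(interaction(design$SubjInSeq, design$Seq, drop = TRUE))
```

The last two lines should report the same count: option 1 carries the identity in one column, option 2 needs two.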

When we use option 2 above we need to make sure R understands that subject 1 in sequence 1 and subject 1 in sequence 2 are not identical. Not identical in this regard means we need separate model coefficients for each of them. And this is exactly where the nesting syntax is useful. We would specify a model using muddle lingo like "lnAuc ~ Subj %in% Seq + Seq + Per + Trt". If we have 30 subjects in 2 sequences, then with that kind of syntax R will in principle (it does reduce some redundancies away, though) try to fit 2x30=60 coefficients for the interaction of Subject and Sequence (subject 1 in sequence 1, subject 1 in sequence 2, subject 2 in sequence 1 ... subject 30 in sequence 2).
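A small sketch of the nested fit under option 2, again on a made-up 2x2 crossover with 3 subjects per sequence. The response is simulated noise, so only the structure of the fit is of interest, not the estimates.

```r
# Made-up 2x2 crossover: 6 subjects, labelled 1..3 within each sequence (option 2)
set.seed(1)
d <- data.frame(
  Seq  = factor(rep(c("RT", "TR"), each = 6)),
  Per  = factor(rep(1:2, times = 6)),
  Subj = factor(rep(rep(1:3, each = 2), times = 2))   # labels repeat across sequences
)
d$Trt   <- factor(ifelse((d$Seq == "RT") == (d$Per == "1"), "R", "T"))
d$lnAuc <- rnorm(nrow(d), mean = 4, sd = 0.3)         # simulated response, not real data

fit_nested <- lm(lnAuc ~ Subj %in% Seq + Seq + Per + Trt, data = d)

colnames(model.matrix(fit_nested))  # the columns R built behind the scenes
fit_nested$rank                     # number of coefficients that can be estimated
df.residual(fit_nested)             # what is left over for the error term
```

With non-unique labels, the nesting term is what lets R tell "1" in one sequence from "1" in the other; without it the two would share a coefficient.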
And hey, yes, I deliberately used the horrible word "interaction" in the previous sentence. When R fits a model it creates a model matrix for us, although we do not see it unless we ask for it (model.matrix(MyFunkyFit)). Nesting in a model matrix is the same as interactions. This is why we get the same results if we use "lnAuc ~ Subj:Seq + Seq + Per + Trt". However, interactions do not care about the order of factors. That's why, when we finally do the ANOVA and we have used "Subj:Seq", R reserves its right to give us the result expressed as a p-value for "Seq:Subj" rather than "Subj:Seq" (example here). Subjects nested in Sequences is just the same as Sequences nested in Subjects to a number cruncher working on bits and bytes. However, to me in practice, speaking of Subjects in Sequence makes some sense whereas speaking of Sequences in Subjects is meaningless.
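The equivalence of the two spellings can be checked directly. The sketch below reuses the same made-up design (option 2 labels, simulated response) and compares the `%in%` fit with the `:` fit; the object names are arbitrary.

```r
# Made-up 2x2 crossover with option 2 labels; response is simulated noise
set.seed(1)
d <- data.frame(
  Seq  = factor(rep(c("RT", "TR"), each = 6)),
  Per  = factor(rep(1:2, times = 6)),
  Subj = factor(rep(rep(1:3, each = 2), times = 2))
)
d$Trt   <- factor(ifelse((d$Seq == "RT") == (d$Per == "1"), "R", "T"))
d$lnAuc <- rnorm(nrow(d), mean = 4, sd = 0.3)

fit_in    <- lm(lnAuc ~ Subj %in% Seq + Seq + Per + Trt, data = d)
fit_colon <- lm(lnAuc ~ Subj:Seq + Seq + Per + Trt, data = d)

# Same model matrix dimensions, same fitted values
dim(model.matrix(fit_in))
dim(model.matrix(fit_colon))
all.equal(fitted(fit_in), fitted(fit_colon))

# The row labels in the ANOVA tables show how R chose to name the term
rownames(anova(fit_in))
rownames(anova(fit_colon))
```

Whatever label R prints for the term, the numbers behind it should be identical between the two fits.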

Everything is so much easier when we specify subjects uniquely as under option 1 above. Now we can just ask R to fit the model specified as "lnAuc ~ Subj + Seq + Per + Trt" or whatever order of factors we prefer (if we have a preference at all). That way we do not need to specify the nesting at all. And what happens if we do? Well, although in the end the ANOVA table is right, poor little R is forced to do serious overtime, because as mentioned above it will try to fit a model matrix with a potentially high number of columns that correspond to coefficients that cannot be calculated because the subjects do not exist (you will see around n annoying "NA"s in the model summary when n unique subjects are evaluated that way).
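Here is a rough illustration of that overtime, assuming the same made-up 2x2 crossover but now with unique subject identifiers (option 1). It fits the plain model and the nested one and compares the size of the model matrix, the number of NA coefficients and the residual sum of squares.

```r
# Made-up 2x2 crossover: 6 subjects with unique IDs 1..6 (option 1); simulated response
set.seed(1)
d <- data.frame(
  Seq  = factor(rep(c("RT", "TR"), each = 6)),
  Per  = factor(rep(1:2, times = 6)),
  Subj = factor(rep(1:6, each = 2))
)
d$Trt   <- factor(ifelse((d$Seq == "RT") == (d$Per == "1"), "R", "T"))
d$lnAuc <- rnorm(nrow(d), mean = 4, sd = 0.3)

fit_plain  <- lm(lnAuc ~ Subj + Seq + Per + Trt, data = d)
fit_nested <- lm(lnAuc ~ Subj %in% Seq + Seq + Per + Trt, data = d)

# The nested fit carries columns for subject/sequence pairs that do not exist
ncol(model.matrix(fit_plain))
ncol(model.matrix(fit_nested))

# ... and those show up as NA coefficients
sum(is.na(coef(fit_plain)))
sum(is.na(coef(fit_nested)))

# Both fits leave the same residual sum of squares
all.equal(deviance(fit_plain), deviance(fit_nested))
```

Even the plain fit is likely to show one NA in this sketch, because a subject's sequence is fully determined by who the subject is; the nested fit just piles more of them on top.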

So, next time you try to impress the blonde down the corridor with mumbojumbo like "subjects nested in sequence", make sure you have a pretty good excuse ready in case she asks you why the heck you are using an approach that is intended for non-unique subject identifiers.


OK, now back to the seven seas.
EM.


Wow! Giving me something to chew upon. Some translations for the German word "Schwung" in the leading section (just before the cute blondes): swing, kick, impetus. If you are not happy with that, have a look here for other translations. [Helmut]

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz