Bioequivalence and Bioavailability Forum

ElMaestro
★★★

Denmark,
2014-02-25 09:43
(4153 d 21:20 ago)

(edited on 2014-02-25 14:15)
Posting: # 12488
Views: 8,509

Normal linear model 101 [General Statistics]

Hi all,

follow a recent exchange with Angus McLean, here's how I understand the normal linear model.

Let's consider the case of a 222-BE study.

I will write it a little differently than C&L do, and I do that because I understand it better this way - there i no functional difference, just a matter of notation. An explanation why this is will follow.

We write
y_ijkl=intercept+Seq_i+Per_j+Trt_k+Subj_l+error_ijkl

which means: "The observation y [can be log(AUC) or log(Cmax] is the intercept plus the sequence constant of the i'th sequence which was assigned to subject l, plus the j'th period constant corresponding to the period in which y was sampled, plus a treatment constant for the k'th treatment corresponding to the treatment plus the l'th subject constant corresponding to the subject from which y was sampled plus an error".

So in a 222-BE trial with N completers we have 2 sequence constants, 2 period constants, 2 treatment constants, and N subject constants.

In the following I will look at a dataset like this

[image]

—
Pass or fail!
ElMaestro

ElMaestro
★★★

Denmark,
2014-02-25 09:45
(4153 d 21:18 ago)

@ ElMaestro
Posting: # 12489
Views: 7,542

Ch1

Post reply

In matrix notation will write

y=Xb+e

y is a vector of observations.
X is the design matrix.
b is the effects vector.
e is the error or residual vector.

Let's somehow try to get the dataset above transmogriffed into some kind of matrix notation.

I will not go through matrix notation or mutiplication here. Google it, or use Wikipedia. Along the same lines, I will not go through anything that leads to the least squares solution of the normal linear model, but I will mention without proof that:

If the error is normal with mean zero and some variance then the most likely set of constants in b is those for which the sum of squared residuals e^te is minimal.
That solution is analytically
b=(X^tX)^)-1X^ty
This solution implies inversion of the matrix X^tX. If this matrix cannot be inverted then we have screwed up.
For all practical purposes, if X^tX is invertible, then we have a solution. This is the case when the rank of X equals the number of columns in X.
If any columns of X add up to give another column (or sum of columns) in X then the rank is not equal to the number of columns, and we cannot get a solution.

We can write the equations like this (focus for now only on sequence and observations):

17.11=1*intercept + 1*Seq_TR + 0*Seq_RT + …
17.76=1*intercept + 1*Seq_TR + 0*Seq_RT + …
…
…
…
22.87=1*intercept + 0*Seq_TR+ 1*Seq_RT + …
20.80=1*intercept + 0*Seq_TR + 1*Seq_RT + …

Let's transmogriph this into a model matrix. In X we have a column for each constant we wish to determine. Above we just worked with three constants (intercept, and two sequence constants).
In the rows of the intercept put a one if the observations is associated with an intercept. All observations are so this is a column of pure 1's.
In the rows of Sequence TR put a 1 when the observation comes from a subject in sequence TR otherwise a 0. And so forth for Sequence RT.

We can write the model matrix (shown right in reddish color) like:
[image]

Note the bold constants just above: They correspond 1:1 to how they appear in X.

—
Pass or fail!
ElMaestro

ElMaestro
★★★

Denmark,
2014-02-25 10:15
(4153 d 20:48 ago)

(edited on 2014-02-25 10:33)
@ ElMaestro
Posting: # 12490
Views: 7,529

Ch2

Post reply

But there's trouble ahead.

Look at column 2 and 3 of X (for sequence TR and RT, respectively). If you add them, then you get a column of pure 1's which is the same as column 1 for the intercept.

In other words, X writen just above does not satisfy the rules for finding a solution, cf. rules above. Take away e.g. the column for SeqRT and then you can see that addition of one column to some other column(s) does not result in some other column(s). When we've taken away the SeqRT column, then the rank is therefore two, and we have two columns, and therefore X^tX is invertible, and therefore we have a solution.
In other words, We have two sequences but we can only fit one constant for sequence (a constant for either SeqTR or SeqRT, assuming we insist on having an intercept). In other words: We just lost a degree of freedom. And we are not getting it back.

So, the unfinished X now looks like this:

[image]

Now let's add two columns for period just using the same principles (X to the right):

[image]

And again, this X is sick, ill and syphilitic because the two period columns add up to the intercept. We need to get rid of one of these period columns and then all is good again.
[image]

Now X^tX is invertible and we're in the clear.

We do the same for treatment (can't have two columns for treatment since they add up to the intercept) and get:
[image]

—
Pass or fail!
ElMaestro

ElMaestro
★★★

Denmark,
2014-02-25 10:51
(4153 d 20:12 ago)

@ ElMaestro
Posting: # 12491
Views: 7,438

Ch3

Post reply

Now all that is left is to add columns for subject:

[image]

(first two subject columns shown in X)

If we have N subjects then we'd initially try to add N columns to X, but it turns out that we need to eliminate 2 of the subject columns in order to keep X^tX invertible (corresponding to a full rank X) so that we can find the maximum likelihood solution.

The status is

We wanted a model with an intercept and four fixed factors called Sequence, Period, Treatment, Subject.
Intercept has 1 df.
We lost one DF for sequence. Sequence has 1 df.
We lost one DF for period. Period has 1 df.
We lost one DF for treatment. Treatment has 1 df.
We lost two DFs for subject. Subject has N-2 df's.

The vector b will thus have one row for each column in X, in total there will be 1+1+1+1+(N-2) rows in b.

Nifty stuff: If you know the subject, the period, the treatment for any observation then you know tha subject's sequence. And so forth. This is why the way I wrote the model in the beginning of this thread can be smartified like C&L did it. ~ A loss of a df means something can be figured out if we just have access to auxiliary info.

But come on...
We tried to get two columns for Sequence but we had to remove one due to the intercept. Same for Period and Treatment. For Subject it got even worse. We can just fit the model without intercept, right? Then all trouble is gone!

Sort of. Then we can add two columns for Sequence (if this is the first term we deal with), now the two sequence columns add up to pure 1's, so we cannot add two period columns, and we cannot add two treatment columns, but just one of each again.

—
Pass or fail!
ElMaestro

ElMaestro
★★★

Denmark,
2014-02-25 11:14
(4153 d 19:49 ago)

@ ElMaestro
Posting: # 12492
Views: 7,617

Ch4 - the good, the bad and the ugly

Post reply

All is good so far.

But here's the bad: The dark forces are upon us. The lurk on every corner and we cannot escape the consequences of their hideous actions. They talk about... ... oh, forgive the typos, for my hands are shaking the very moment I think of it... they talk about "Subject nested in Sequence" and feel smug about it because it sounds fancy.

AAAAAAAAAAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaarggggh. Gimme some garlic and some holy water. Make them go away, send them to work in the salt mines for 50 years, anything. ANYTHING!

When we talk about "Subject nested in Sequence" or "Subject in Sequence" or "Subject(Sequence)", which is all the same, then it means we try to introduce columns in X corresponding to any combo of Subject (N levels) AND Sequence (2 levels).
Let's just look at a few columns in X for subject 1 in sequence TR and subject 1 in sequence RT:
[image]

See? This is the ugly. There was only one subject called "Subject 1", she was assigned to sequence TR, so it makes absolutely no sense to speak of subject 1 in sequence RT; the corresponding column will be one of pure zero's which does not convey any info of any sort (if 0*a = 0 then what can you tell me about a?).

Does it means that people who talk about "Subject nested in Sequence" are full of sh!t?
If you ask me, then by and large yes. In some cases it makes a little sense, though.
Imagine we do not call our subjects 1,2,3 ... N like in the examples above.

In stead we assign number 1,2,3... to each subject in sequence TR.
And we assign number 1,2,3... to each subject in sequence RT.
And we make sure to keep those numbers apart. Talking about "subject 1" is now no longer sufficent for identification, so to identify the subject we need to know which sequence he/she was assigned to.

If you number your subjects this way and you construct your X following the rules above then it will make a little sense to do it, and you will arrive at EXACTLY the same columns in X as if you do it with unquely numbered subjects.

In my years of working with clinical documentation I have never ever seen a CRO or clinic use anything but uniquely numbered subject identifiers. Actually these are always the randomsation codes and these are always unique, right? This is why it really doesn't make practical sense to speak of "Subject nested in Sequence". It isn't wrong per se but it involves a lot ridiculous and meaningless matrix algebra which doesn't give any info at all.
So stay good: if you work with unique randomisation numbers assigned to the subjects (and you do, I know you do!) then just forget about the nesting stuff.

—
Pass or fail!
ElMaestro

ElMaestro
★★★

Denmark,
2014-02-25 11:23
(4153 d 19:40 ago)

@ ElMaestro
Posting: # 12493
Views: 7,503

Ch5

Post reply

So we had a dataset, we got X constructed so that it was full rank and we didn't waste any time on meaningless babble about subjects nested in sequence. The number of columns of X is the rank, which is the number of rows in b, which is the number of constants we have freedom to determine. We're doing this by minimising e^te. The story ends here, submit, get approval, celebrate.

Almost. I often do not fit with the intercept, and then I specify Treatment as the first factor. This means we do not have an interceept column in X, and the first two columns will be the two treatment columns. This is very handy, because then the first two estimates or constants in b are the treatment effects: One for Test and one for Reference.
So we just extract these to estimates from the b-vector and use them directly for the construction of the confidence interval. Note that the residual sum of squares is not affected by this. It doesn't matter which factors you specify first or whether you want an intercept. If you have one way or another Period, Subject, Sequence, Treatment in your model, then regardless of the permutation of the order of them the mean squared error is unaffected.

—
Pass or fail!
ElMaestro

AngusMcLean
★★

USA,
2014-03-01 18:24
(4149 d 12:39 ago)

@ ElMaestro
Posting: # 12540
Views: 7,433

Normal linear model 101

Post reply

The question arises if the English expression and comprehension is more difficult than the MATHEMATICS. Now for the 12 year old.

My understanding is the mixed model (in my case from Phoenix WinNonlin) predicts the value of the PK parameter by a regression model in the terms that you describe. You then can compare with the observed value of the parameter.

Now I think this is correct, but by the way why do we call it a mixed model? I am trying to shape my understanding of this activity better

ANGUS

ElMaestro
★★★

Denmark,
2014-03-01 21:53
(4149 d 09:10 ago)

@ AngusMcLean
Posting: # 12542
Views: 7,424

Mixed Muddle

Post reply

Hi Angus,

❝ My understanding is the mixed model (in my case from Phoenix WinNonlin) predicts the value of the PK parameter by a regression model in the terms that you describe. You then can compare with the observed value of the parameter.

In a normal linear model you have a whole bunch of fixed effects (things that have discrete levels, but do not vary continously: gender (Male vs female), treatment (T vs R), Hair color (blond vs red vs blue etc), Season (winter vs summer) and so forth. And then you have an uncertainty or error that follows a normal distribution with mean zero.

In a mixed model you...well... mix things up. You introduce additional random effects. So suddenly our model expression does not just encompass one thing (like the error in a normal linear model) which follows a normal dist. but more than one. In that case, minimisation of sums of squares does not give the max. likelihood. In stead the max. likelihood (actually 'restricted') must now be calculated by an optimiser.

❝ Now I think this is correct, but by the way why do we call it a mixed model? I am trying to shape my understanding of this activity better

Normal linear model: Simple stuff, a single normal distribution as well as several constants explain the variation in our data.
Linear mixed model: Complex stuff, several constants plus two or more normal distributions must be used to account for the data scatter.

Subject is sometimes, actually quite often, seen appropriately as a random effect. We are not interest in knowing how Schützomycin performs in just 20 subjects. We are interested in extending the knowledge we have about Schützomycin to the entire population (of which our 20 test subjects are hopefully representative).
In ordinary 222-BE Subject can be modeled as fixed (giving a normal linear model) or as random (giving a mixed effects model), both having the exact same result (one exception: a missing period value; the entire subject is thrown out in least squares minimisation but if you apply a mized model you can find a max likelihood even with such subjects taken into account).
I am not a WinNonlin user, but I think Phoenix will perform the calculation as a mixed model by default unless you ask it specifically to do it with all effects as fixed.

—
Pass or fail!
ElMaestro

AngusMcLean ★★ USA, 2014-03-02 18:42 (4148 d 12:21 ago) @ ElMaestro Posting: # 12545 Views: 7,332	Mixed Muddle Post reply
	Thanks My guess is you are with SAS and you have a statistical background. I find the descriptions in books are often poor. The way WinNonlin model is set up is as follow: Fixed effects are sequence +treatment +period and the Subject(Sequ) is a random effect. I think this ismOK, Angus

ElMaestro
★★★

Denmark,
2014-03-02 19:10
(4148 d 11:53 ago)

@ AngusMcLean
Posting: # 12547
Views: 7,302

Mixed Muddle

Post reply

Hi Angus,

❝ My guess is you are with SAS and you have a statistical background.

Hehe, that's actually as wrong as it can be. I had a basic course in statistics about 100 years ago. Graphical estimation of standard deviations from observed data and such. Never had WinNonlin or SAS on my system and I doubt I will ever have :ok:

. Only R, which I like. Except for the syntax :-D

(what the h%ll does "~1|blah" really mean in R???)

❝ I find the descriptions in books are often poor.

True... ever successfully tried to read and understand the little instruction books that come with Ikea furniture?
"Congratulations on buying a new Spatzwoolah. To assemble it use the fnoofbygthwat to turn screw A 70 degrees counterclockwise with your left hand while holding the Grabowski-pin with your right hand. Make sure not to turn the screw more than 40 degrees. When you hear a click, firmly grab the handles on the upper jaweenmazonkie, but do not let go of the Grabowski-pin or the fnoofbygthwat since these hold the zwix in place behind the lower left plinaque which should be supported by your free hand so as not to crack" and so on.

❝ The way WinNonlin model is set up is as follow:Fixed effects are sequence +treatment +period and the Subject(Sequ) is a random effect.

By default. I think that's correct although I do not know.
By the way: Subject(Sequ) gets my systolic blood pressure up to 520 mmHg. Nesting syntax isn't relevant because you're assigning random numbers to trial subjects, and no two subjects can have the same number.

❝ I think this ismOK.

Well... yes, often it is, but strictly is might not be... At EMA we have: "The terms to be used in the ANOVA model are usually sequence, subject within sequence, period and formulation. Fixed effects, rather than random effects, should be used for all terms."

In the US they will generally use PROC GLM / SAS or R's lm to reproduce the results for 222-BE studies, as far as I know. That means that if your dataset from such a trial has missing values (as in: some subjects did not have data for both periods but ony for one) then you might get a question about, since FDA's analysis result will be different from yours.
I'd only really gamble on a mixed model if we talk non-EMA and replicated or semi-replicated designs.

—
Pass or fail!
ElMaestro

Normal linear model 101 [General Sta­tis­tics]

Ch1

Ch2

Ch3

Ch4 - the good, the bad and the ugly

Ch5

Normal linear model 101

Mixed Muddle

Mixed Muddle

Mixed Muddle

Normal linear model 101 [General Statistics]