Bioequivalence and Bioavailability Forum • lm and the model matrix

lm and the model matrix [🇷 for BE/BA]

posted by ElMaestro – Denmark, 2024-02-19 10:51 (484 d 04:45 ago) – Posting: # 23868
Views: 4,674

Hi all,

I continue to be a little baffled by R's lm and related functions.

Let's say we have a typical BE-related model like
M=lm(log(Cmax) ~ factor(Seq)+factor(Subj)+factor(Trt)+factor(Per))

I want a full rank model matrix, for example because further downstream I need access to the projection matrix (or hat matrix, H). The matrix, not just the diagonal entries. Should be easy, right?


H=X(X'X)^-1X'

, where X is the full rank model matrix.
So I just make a call like

X=model.matrix(M)

and we plug that into the equation for H.

But no.... Unfortunately, R doesn't let me get away with it, because the X I extract from the call above is not full rank. Neither is does it contain a column for every level of every factor. It is somehwere in between the two extremes and I do no understand how that X offers any advantage at all.

Next, ok, there's probably an easy way to make X full rank, right? A simple call in R's stats library must definitely exist, right?
Well, perhaps you guys know it, but I don't. I find myself having to write a stupid function like

MakeFullRank=function(X)

{

  Q=X[,1]


  for (i in 2:ncol(X))

  {

    rnk=qr(Q)$rank

    Q2=cbind(Q, X[,i])

    rnk2=qr(Q2)$rank

    if (rnk2>rnk) Q=Q2

  }

  return(Q)

}

This will convert the X into a full rank X.

It all seems so odd to me... R is such a functional language and all sorts of functions are optimised, and offer all sorts of handy functionality, so:

Can anyone tell if R's model.matrix (not full rank and not containing a column for every factor level for every factor) offers any advantage within an lm?
Can anyone tell if there are easier or direct ways to extract a full rank model matrix from an lm object?
Can anyone tell if there are easier ways to extract the hat matrix (not just its diagonals) than the clumsy outline above?

Finally, let me emphasize I do not wish for a discussion of Moore-Primrose pseudoinverses or the opportunities those beasts offer. In essence this is plainly and simply about lm and inverses, not pseudoinverses. Should anyone care to answer then give pt 3 lowest priority, please. The hat matrix deal is only one example showing why I need a full rank X. There are plenty other reasons as well. But if I understand why creators of lm prefer to work on deficienct-rank model matrices then perhaps I can apply this understanding in my own setting as well :-)

Many thanks for any input.

—
Pass or fail!
ElMaestro

Complete thread:

lm and the model matrixElMaestro 2024-02-19 09:51
- the model matrix and the contrasts mittyri 2024-02-22 20:21
  - Yes, but.... ElMaestro 2024-02-23 17:41
    - Yes, but.... mittyri 2024-02-23 19:39
      - Yes, but.... ElMaestro 2024-02-23 22:15