lm and the model matrix [🇷 for BE/BA]

posted by ElMaestro  – Denmark, 2024-02-19 10:51 (429 d 02:37 ago) – Posting: # 23868
Views: 3,830

Hi all,


I continue to be a little baffled by R's lm and related functions.

Let's say we have a typical BE-related model like
M=lm(log(Cmax) ~ factor(Seq)+factor(Subj)+factor(Trt)+factor(Per))

I want a full rank model matrix, for example because further downstream I need access to the projection matrix (or hat matrix, H). The matrix, not just the diagonal entries. Should be easy, right?
H=X(X'X)-1X'
, where X is the full rank model matrix.
So I just make a call like
X=model.matrix(M)

and we plug that into the equation for H.

But no.... Unfortunately, R doesn't let me get away with it, because the X I extract from the call above is not full rank. Neither is does it contain a column for every level of every factor. It is somehwere in between the two extremes and I do no understand how that X offers any advantage at all.

Next, ok, there's probably an easy way to make X full rank, right? A simple call in R's stats library must definitely exist, right?
Well, perhaps you guys know it, but I don't. I find myself having to write a stupid function like
MakeFullRank=function(X)
{
  Q=X[,1]
  for (i in 2:ncol(X))
  {
    rnk=qr(Q)$rank
    Q2=cbind(Q, X[,i])
    rnk2=qr(Q2)$rank
    if (rnk2>rnk) Q=Q2
  }
  return(Q)
}

This will convert the X into a full rank X.

It all seems so odd to me... R is such a functional language and all sorts of functions are optimised, and offer all sorts of handy functionality, so:
  1. Can anyone tell if R's model.matrix (not full rank and not containing a column for every factor level for every factor) offers any advantage within an lm?
  2. Can anyone tell if there are easier or direct ways to extract a full rank model matrix from an lm object?
  3. Can anyone tell if there are easier ways to extract the hat matrix (not just its diagonals) than the clumsy outline above?
Finally, let me emphasize I do not wish for a discussion of Moore-Primrose pseudoinverses or the opportunities those beasts offer. In essence this is plainly and simply about lm and inverses, not pseudoinverses. Should anyone care to answer then give pt 3 lowest priority, please. The hat matrix deal is only one example showing why I need a full rank X. There are plenty other reasons as well. But if I understand why creators of lm prefer to work on deficienct-rank model matrices then perhaps I can apply this understanding in my own setting as well :-)

Many thanks for any input.

Pass or fail!
ElMaestro

Complete thread:

UA Flag
Activity
 Admin contact
23,424 posts in 4,927 threads, 1,671 registered users;
27 visitors (0 registered, 27 guests [including 0 identified bots]).
Forum time: 14:28 CEST (Europe/Vienna)

The difference between a surrogate and a true endpoint
is like the difference between a cheque and cash.
You can get the cheque earlier but then,
of course, it might bounce.    Stephen Senn

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz
HTML5