lm and the model matrix [🇷 for BE/BA]

posted by ElMaestro  – Denmark, 2024-02-19 10:51 (91 d 00:23 ago) – Posting: # 23868
Views: 1,123

Hi all,

I continue to be a little baffled by R's lm and related functions.

Let's say we have a typical BE-related model like
M=lm(log(Cmax) ~ factor(Seq)+factor(Subj)+factor(Trt)+factor(Per))

I want a full rank model matrix, for example because further downstream I need access to the projection matrix (or hat matrix, H). The matrix, not just the diagonal entries. Should be easy, right?
, where X is the full rank model matrix.
So I just make a call like

and we plug that into the equation for H.

But no.... Unfortunately, R doesn't let me get away with it, because the X I extract from the call above is not full rank. Neither is does it contain a column for every level of every factor. It is somehwere in between the two extremes and I do no understand how that X offers any advantage at all.

Next, ok, there's probably an easy way to make X full rank, right? A simple call in R's stats library must definitely exist, right?
Well, perhaps you guys know it, but I don't. I find myself having to write a stupid function like
  for (i in 2:ncol(X))
    Q2=cbind(Q, X[,i])
    if (rnk2>rnk) Q=Q2

This will convert the X into a full rank X.

It all seems so odd to me... R is such a functional language and all sorts of functions are optimised, and offer all sorts of handy functionality, so:
  1. Can anyone tell if R's model.matrix (not full rank and not containing a column for every factor level for every factor) offers any advantage within an lm?
  2. Can anyone tell if there are easier or direct ways to extract a full rank model matrix from an lm object?
  3. Can anyone tell if there are easier ways to extract the hat matrix (not just its diagonals) than the clumsy outline above?
Finally, let me emphasize I do not wish for a discussion of Moore-Primrose pseudoinverses or the opportunities those beasts offer. In essence this is plainly and simply about lm and inverses, not pseudoinverses. Should anyone care to answer then give pt 3 lowest priority, please. The hat matrix deal is only one example showing why I need a full rank X. There are plenty other reasons as well. But if I understand why creators of lm prefer to work on deficienct-rank model matrices then perhaps I can apply this understanding in my own setting as well :-)

Many thanks for any input.

Pass or fail!

Complete thread:

UA Flag
 Admin contact
23,033 posts in 4,835 threads, 1,642 registered users;
50 visitors (0 registered, 50 guests [including 10 identified bots]).
Forum time: 12:15 CEST (Europe/Vienna)

Competence, like truth, beauty and contact lenses,
is in the eye of the beholder.    Laurence J. Peter

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz