ElMaestro ★★★ Denmark, 20240219 10:51 (53 d 03:53 ago) Posting: # 23868 Views: 851 

Hi all, I continue to be a little baffled by R's lm and related functions. Let's say we have a typical BErelated model like M=lm(log(Cmax) ~ factor(Seq)+factor(Subj)+factor(Trt)+factor(Per)) I want a full rank model matrix, for example because further downstream I need access to the projection matrix (or hat matrix, H). The matrix, not just the diagonal entries. Should be easy, right?
H=X(X'X)^{1}X' , where X is the full rank model matrix. So I just make a call like X=model.matrix(M) and we plug that into the equation for H. But no.... Unfortunately, R doesn't let me get away with it, because the X I extract from the call above is not full rank. Neither is does it contain a column for every level of every factor. It is somehwere in between the two extremes and I do no understand how that X offers any advantage at all. Next, ok, there's probably an easy way to make X full rank, right? A simple call in R's stats library must definitely exist, right? Well, perhaps you guys know it, but I don't. I find myself having to write a stupid function like MakeFullRank=function(X) This will convert the X into a full rank X. It all seems so odd to me... R is such a functional language and all sorts of functions are optimised, and offer all sorts of handy functionality, so:
Many thanks for any input. — Pass or fail! ElMaestro 
mittyri ★★ Russia, 20240222 21:21 (49 d 17:22 ago) @ ElMaestro Posting: # 23873 Views: 638 

Dear ElMaestro! I (and many others) understand your degree of discontent. Yes, this is something unexpected. I suppose you already read this answer I have nothing to add to this: that's a price for identifiability. The only solution I see besides you already gave is to turn off the contrasts as shown there: f < sample(gl(3, 4, labels = letters[1:3])) — Kind regards, Mittyri 
ElMaestro ★★★ Denmark, 20240223 18:41 (48 d 20:03 ago) @ mittyri Posting: # 23878 Views: 607 

Hi Mittyri, ❝ I have nothing to add to this: that's a price for identifiability. The only solution I see besides you already gave is to turn off the contrasts as shown there: ❝
❝ f = contr.treatment(n = 3, contrasts = FALSE), Nice, but are you aware that this model matrix is also not full rank, but rankdeficient? A cardinal question was how to get the (or a) full rank matrix. The rank of your X0 is 5 but there are 7 columns. Seems you went to the opposite extreme of full rank, i.e. one column for every level of factors, plus intercept. Deficiency here = 2 redundant columns. Anyways, I guess I will just live with it and do my stuff in a dumb but necessary fashion. Keep me posted if you find a nifty way of generating a full rank matrix model, please. — Pass or fail! ElMaestro 
mittyri ★★ Russia, 20240223 20:39 (48 d 18:04 ago) @ ElMaestro Posting: # 23879 Views: 605 

Hi ElMaestro, may be I misunderstood your ❝ is not full rank. Neither is does it contain a column for every level of every factor. > (MakeFullRank(model.matrix(~ f + g))) — Kind regards, Mittyri 
ElMaestro ★★★ Denmark, 20240223 23:15 (48 d 15:29 ago) (edited by on 20240224 00:00) @ mittyri Posting: # 23880 Views: 603 

Hi Mittyri, ❝ may be I misunderstood your ❝ ❝ ❝ is not full rank. Neither is does it contain a column for every level of every factor. ❝ The solution you proposed with qr() does not give a column for every level of every factor ❝
❝ Q
A full rank model matrix means there are not more columns than the rank. Then X^{t}X becomes invertible. A model matrix with a column for every level of every factor (plus intercept) is not the same. Rather that's redundancy. X^{t}X now becomes trouble My function creates a full rank model matrix. — Pass or fail! ElMaestro 