Speed improvement [🇷 for BE/BA]
Hi again,
Now it gets really interesting.
I made two observations:
1. There is this 30% improvement if I switch to the profile log-likelihood instead of "the big one", for lack of better wording.
2. I can achieve an improvement of several thousand percent if I stay with the big X and V; however, I must then use a smart tactic:
I generate V only once, and on each pass of the optimiser I update V rather than rebuilding it from scratch.
Instead of iterating across all rows and columns and checking in the data list where to put the individual components, I initially generate a list of coordinates for each variance component. Something like this:
V.Coordinates = function(Data)
{
  Nobs = nrow(Data)
  xy.varT = NULL; xy.varBR = NULL; xy.varWR = NULL; xy.varBRT = NULL
  for (iRow in 1:Nobs)
    for (iCol in 1:Nobs)
    {
      ## the diagonal
      if (iCol == iRow)
      {
        if (Data$Trt[iRow] == "T") xy.varT  = rbind(xy.varT,  c(iRow, iCol))
        if (Data$Trt[iRow] == "R") xy.varWR = rbind(xy.varWR, c(iRow, iCol))
      }
      ## off diagonal
      if (iCol != iRow)
        if (Data$Subj[iRow] == Data$Subj[iCol])
        {
          if (Data$Trt[iCol] == Data$Trt[iRow]) xy.varBR  = rbind(xy.varBR,  c(iRow, iCol))
          if (Data$Trt[iCol] != Data$Trt[iRow]) xy.varBRT = rbind(xy.varBRT, c(iRow, iCol))
        }
    }
  return(list(xy.varT=xy.varT, xy.varBR=xy.varBR, xy.varWR=xy.varWR, xy.varBRT=xy.varBRT))
}
This returns a list with four components. The first component is a set of coordinates in V for varT, the second is a set of coordinates for varBR. And so forth.
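To illustrate the kind of thing that comes back, here is a made-up toy dataset (two subjects, six observations in a semireplicated pattern; not real data):

Data = data.frame(Subj = c(1, 1, 1, 2, 2, 2),
                  Trt  = c("T", "R", "R", "R", "T", "R"))
VC = V.Coordinates(Data)
VC$xy.varT    ## diagonal cells of the T observations
VC$xy.varWR   ## diagonal cells of the R observations
VC$xy.varBR   ## off-diagonal cells: same subject, same treatment
VC$xy.varBRT  ## off-diagonal cells: same subject, different treatments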
Then, in the REML function, I have a block like this:
for (i in 1:nrow(VC$xy.varT))
  V[VC$xy.varT[i,1], VC$xy.varT[i,2]] = Pars[1]
for (i in 1:nrow(VC$xy.varBR))
  if (VC$xy.varBR[i,1] != VC$xy.varBR[i,2])
    V[VC$xy.varBR[i,1], VC$xy.varBR[i,2]] = Pars[2]
for (i in 1:nrow(VC$xy.varWR))
  V[VC$xy.varWR[i,1], VC$xy.varWR[i,2]] = Pars[3]+Pars[2]
for (i in 1:nrow(VC$xy.varBRT))
  V[VC$xy.varBRT[i,1], VC$xy.varBRT[i,2]] = Pars[4]
And now we are simply talking REAL BUSINESS.
The optimiser returns blazingly fast with an REML estimate and the corresponding variance components. Depending on the tolerance required it is almost instantaneous (<0.5 secs) on my laptop, and that machine is not high-tech equipment at all. The speed with "big V" far outruns the profile approach.
I believe this implies that matrix inversion is done in a clever fashion in R's solve function; as you pointed out, PharmCat, inversion can be slow (and unstable) as matrices get large, but perhaps the LAPACK routine used by R reduces the task intelligently, possibly through some of the same tricks described for e.g. the ALGLIB routines.
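Schematically (not my verbatim code; y, X and the pre-allocated zero matrix V are placeholders here, and the objective is simply minus twice the standard restricted log-likelihood with additive constants dropped), the big-V function handed to the optimiser could look roughly like this:

REML.obj = function(Pars, y, X, V, VC)
{
  ## overwrite the structured cells of V via the precomputed coordinates
  for (i in 1:nrow(VC$xy.varT))
    V[VC$xy.varT[i,1],  VC$xy.varT[i,2]]  = Pars[1]
  for (i in 1:nrow(VC$xy.varBR))
    V[VC$xy.varBR[i,1], VC$xy.varBR[i,2]] = Pars[2]
  for (i in 1:nrow(VC$xy.varWR))
    V[VC$xy.varWR[i,1], VC$xy.varWR[i,2]] = Pars[3]+Pars[2]
  for (i in 1:nrow(VC$xy.varBRT))
    V[VC$xy.varBRT[i,1], VC$xy.varBRT[i,2]] = Pars[4]
  Vinv = solve(V)
  XtVX = t(X) %*% Vinv %*% X
  b    = solve(XtVX, t(X) %*% Vinv %*% y)   ## GLS estimate of the fixed effects
  r    = y - X %*% b
  ## minus twice the restricted log-likelihood, constants dropped; optim minimises this
  as.numeric(determinant(V,    logarithm=TRUE)$modulus +
             determinant(XtVX, logarithm=TRUE)$modulus +
             t(r) %*% Vinv %*% r)
}

V only needs to be created once (a zero matrix of dimension Nobs by Nobs); every pass of the optimiser merely overwrites the structured cells and leaves the rest untouched.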
However, I am also seeing another little thing:
If I optimise with the profile approach I can make BFGS converge, but this does not work if I try it with the "big V" approach. Lindstrom and Bates noticed something similar, in the sense that they wrote that quasi-Newton methods perform better when the profile approach is used:
"We recommend optimizing the profile log- likelihood, since it will usually require fewer iterations, the derivatives are somewhat simpler, and the convergence is more consistent. We have also encountered examples where the NR algorithm failed to converge when optimizing the likelihood including a but was able to optimize the profile likelihood with ease."
So, lots of options, and I am not going to be late for supper.
—
Pass or fail!
ElMaestro