ElMaestro ★★★ Denmark, 2020-07-10 15:19 (1558 d 15:08 ago) Posting: # 21669 Views: 21,901 |
Hi all, I am writing a little optimizer based on REML in R for use with semi-replicated designs. I am starting from the bottom and working upwards, since the various packages do not really help when R is replicated and T is not. Does anyone have a dataset with well-established REML reference values for the variance components for the TRR/RRT/RTR design? A little more than the CVref of EMA dataset II would be appreciated. Many thanks. — Pass or fail! ElMaestro
Helmut ★★★ Vienna, Austria, 2020-07-10 21:03 (1558 d 09:24 ago) @ ElMaestro Posting: # 21670 Views: 20,504 |
Hi ElMaestro,

❝ I am writing a little optimizer based on REML in R for use with semi-replicated designs. I am starting from the bottom and working upwards, since the various packages do not really help when R is replicated and T is not.

Wow! Nobody has mastered that in R so far. I don't want to dampen your enthusiasm, but I think that is not possible at all.

❝ Does anyone have a dataset with well-established REML reference values for the variance components for the TRR/RRT/RTR design? A little more than the CVref of EMA dataset II would be appreciated

Here you are (for DS II, of course). Results in Phoenix for the FDA's covariance structure „Banded No-Diagonal Factor Analytic (f=2)”, which is FA0(2) in SAS-lingo. After 7 iterations (-2REML -48.2, AIC -26.2):

lambda(1,1)  0.19030501
G Matrix*    0.03621600  0.04563730

As usual:

WARNING: Newton's algorithm converged with modified Hessian. Output is suspect.

s2wT is nonsense, since T is not replicated and the model is over-specified. The lambdas are called FA(x,y) in SAS-lingo. After 7 iterations (-2REML -30.7, AIC -20.7):

Cov Parm   Subject   Group   Estimate

Also:

NOTE: Convergence criteria met but final hessian is not positive definite.

No problems with convergence with f=1 (or FA0(1) in SAS). FA0(1) helps in general but not always.

lambda(1,1)  0.19030501

Different s2wT but crap as well. You can get anything you like: the same data set throwing warnings in one software but not the other; same with convergence. In Certara's forum a user posted a data set which failed to converge both in SAS and Phoenix, even with FA0(1). Then one is in deep shit. Even if a data set converges without warnings, you might get different estimates (except the first one and s2wR) in SAS and Phoenix. Not surprising, because the optimizer walks around in parameter space trying to find the maximum of a surface which is essentially flat. Increasing the number of iterations and/or decreasing the tolerance rarely helps. I still wait for someone explaining why he/she is using such a wacky design. Beyond me.
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
ElMaestro ★★★ Denmark, 2020-07-10 21:24 (1558 d 09:03 ago) @ Helmut Posting: # 21671 Views: 20,365 |
Hi Hötzi,

❝ Wow! Nobody has mastered that in R so far. I don't want to dampen your enthusiasm, but I think that is not possible at all.

Challenge accepted. Thanks for the results.
— Pass or fail! ElMaestro
Helmut ★★★ Vienna, Austria, 2020-07-11 13:57 (1557 d 16:30 ago) @ ElMaestro Posting: # 21672 Views: 20,072 |
Hi ElMaestro, ❝ ❝ Wow! Nobody mastered that in R so far. I don’t want to dampen your enthusiasm but I think that is not possible at all. ❝ ❝ Challenge accepted Good luck! I added the variance-covariance matrices and the complete warnings above. From this SASians’ document:
NOTE: Convergence criteria met but final hessian is not positive definite. In my understanding:
You can’t fix by analysis — Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
ElMaestro ★★★ Denmark, 2020-07-11 16:43 (1557 d 13:45 ago) @ Helmut Posting: # 21673 Views: 19,868 |
Hi again, it is rather experimental. I do not have a level of understanding that allows me to use V = ZGZt + R, so I will be constructing V directly from the data listing, without the intermediary Z and G. I do not wish to get bogged down by the various aspects of covariance matrices; if CSH does not pan out well in SAS then it is an implementation or convergence issue only, not because CSH is irrelevant. CSH is for all practical purposes the covariance structure that makes sense, which means we have:
A variance for T (combined within and between)
A variance for within-R
A variance for between-R
Hessian, positive definite, etc.: all this comes down to the details of SAS's or WinNonlin's optimiser. I am playing around with optim in R (BFGS, L-BFGS-B bounded, and Nelder-Mead) and with one of my own making.
— Pass or fail! ElMaestro
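A minimal sketch (not the actual code, which follows later in the thread) of how such a per-subject block can be written down directly for the three components listed above; the function name, arguments and treatment coding are illustrative only, and a between-subject T/R covariance, if wanted, would go into the remaining off-diagonal cells in the same way.

# Sketch: 3x3 covariance block for one subject, built directly from the
# treatment codes with only varT (total variance of T), varWR (within-R)
# and varBR (between-R). The T-R cells are left at zero in this sketch.
subject.block <- function(trts, varT, varWR, varBR) {
  n <- length(trts)                     # 3 periods
  V <- matrix(0, n, n)
  for (i in 1:n) for (j in 1:n) {
    if (i == j) {
      V[i, j] <- if (trts[i] == "T") varT else varWR + varBR
    } else if (trts[i] == "R" && trts[j] == "R") {
      V[i, j] <- varBR                  # the two R administrations share varBR
    }
  }
  V
}
subject.block(c("R", "T", "R"), varT = 0.04, varWR = 0.013, varBR = 0.045)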
Helmut ★★★ Vienna, Austria, 2020-07-11 17:38 (1557 d 12:49 ago) @ ElMaestro Posting: # 21674 Views: 19,852 |
Hi ElMaestro,

❝ it is rather experimental.
❝ […] which means we have:
❝ A variance for T (combined within and between)
❝ A variance for within-R
❝ A variance for between-R

Correct. Will be interesting to see what you get, because \(\sigma_{wT}^2+\sigma_{bT}^2\) was 0.04367 (Phoenix) and 0.03771 (SAS).

❝ Hessian, positive definite, etc.: all this comes down to the details of SAS's or WinNonlin's optimiser. I am playing around with optim in R […]

If you succeed, the community will love you. Might win you the Fields Medal. As long as a genius (you?) is not able to come up with a solution working in all cases, I can only repeat: Avoid partial replicate designs! It might not be possible to evaluate the study for ABE (though RSABE works in any case).
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes
ElMaestro ★★★ Denmark, 2020-07-12 14:28 (1556 d 16:00 ago) @ Helmut Posting: # 21675 Views: 19,794 |
Hi all, some help needed. Specifically, I am asking for help to make the code work, not to make it smarter (achieving the same with fewer keystrokes, using built-in alternatives, etc.), although there is generally a preferred focus on the latter here. Hopefully we will discuss later the need for additional or other variance components once we have made the optimiser work somehow (or while we are making it work, if such VCs are totally necessary). There are probably a ton of errors to iron out. There is at least one big one whose effects I am suffering from. The idea is to construct the model from the bottom up.
Another source for the objective function is https://www.stt.msu.edu/users/pszhong/Lecture_16_Spring_2017.pdf (pages 13+14), but this also did not quite get me to convergence. So, this is what I have so far. I don't understand matrix algebra totally well, and I am at a loss if you say Hermitian, expectation-maximisation, Jacobian, etc. What bothers me is that some years ago I wrote a working optimiser in C on the basis of slides by some Danish guy called Waagepetersen, who walked the reader slowly through the process of optimising REML. I can't find my code now, it was on another computer, and I can't find his slides any more (there are some slides with his name out there, which I can find with Google etc., but they do not have that old recipe).
rm(list=ls(all=TRUE))
— Pass or fail! ElMaestro
ElMaestro ★★★ Denmark, 2020-07-12 15:26 (1556 d 15:01 ago) @ ElMaestro Posting: # 21676 Views: 19,735 |
Important update: I forgot vBR on the diagonal. Now the singularity is less of a headache:
rm(list=ls(all=TRUE)) — Pass or fail! ElMaestro |
ElMaestro ★★★ Denmark, 2020-07-12 23:36 (1556 d 06:52 ago) @ ElMaestro Posting: # 21677 Views: 19,646 |
After re-re-re-re-reading the documentation for optim, I think it can be forced to look for either a minimum (default) or a maximum through the control argument; or, more simply, the objective function can be handled for a minimum by putting a minus sign on the return of Obj.F12 … but even so the estimates are still somewhat off.
— Pass or fail! ElMaestro
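Both routes exist in optim; a tiny self-contained illustration with a toy objective (standing in for Obj.F12):

f <- function(x) -sum((x - c(1, 2))^2)            # toy objective, maximum at (1, 2)

optim(par = c(0, 0), fn = f,
      control = list(fnscale = -1))$par           # option 1: maximise via the control argument

optim(par = c(0, 0), fn = function(x) -f(x))$par  # option 2: minimise the negated objective

# tighter tolerance (cf. the reltol remark further down): control = list(reltol = 1e-12)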
ElMaestro ★★★ Denmark, 2020-07-12 23:46 (1556 d 06:42 ago) @ ElMaestro Posting: # 21678 Views: 19,561 |
Look at this baby:
rm(list=ls(all=TRUE))
On my machine it gives:
$par
— Pass or fail! ElMaestro
ElMaestro ★★★ Denmark, 2020-07-13 00:27 (1556 d 06:01 ago) @ ElMaestro Posting: # 21679 Views: 19,503 |
which gives
Var.Component   Value   Ini.value
By the way: you can adjust how many decimals you get on e.g. varWR (~s2WR) through the reltol setting. Use e.g. 1e-12 to get 0.01324652 with just a few more iterations.
Job done?
— Pass or fail! ElMaestro
Helmut ★★★ Vienna, Austria, 2020-07-13 15:52 (1555 d 14:36 ago) @ ElMaestro Posting: # 21684 Views: 19,293 |
Hi ElMaestro, ❝ Job done? I prefer medium rare over well done (aka quick and dirty or clean and never)! Below my more R-ish code. Changes:
###############################
Gives:
Maximum : 75.98722
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes
ElMaestro ★★★ Denmark, 2020-07-13 16:18 (1555 d 14:09 ago) @ Helmut Posting: # 21686 Views: 19,267 |
Hi Hötzi, I think expressing the G matrix after this fit is a little backward; the whole point is not to decompose V into ZGZt+R, because this design makes that decomposition pointless even when the design is otherwise relevant. That SAS and WinNonlin and more do not have a way to make provisions for a solution outside those involving this decomposition is merely a testament to their development capabilities, not to the design itself. If you start defining the stats model on the basis of G and R rather than V, then obviously you must create two variance components which are not uniquely estimable, but whose sum is. Therefore, just screw R and G, go directly for V, and present the components without in any way calling them members of G or R - they really aren't when the model is constructed my way. The 2-trt, 3-seq, 3-per design is good, healthy and appropriate; until now it was just the statistical work on it that has been lagging. The post is an attempt at providing a solution. Note also: the PE will be accessible via the est.b vector. You should be able to extract it (difference between the first two effects, one good reason to do X without intercept) after the model has converged. If you have a dataset that does not readily converge with SAS then I'd be eager to learn if it does in some flavour of my code.
— Pass or fail! ElMaestro
Helmut ★★★ Vienna, Austria, 2020-07-13 17:34 (1555 d 12:53 ago) @ ElMaestro Posting: # 21688 Views: 19,361 |
Hi ElMaestro, ❝ I think expressing the G matrix after this fit is a little backward;… OK, true. ❝ That SAS and WinNonlin and more do not have a way to make provisions for a solution outside those involving this decomposition, it merely a testament to their development capabilities than to the design itself. If you start defining the the stats model on basis of G and R rather than V then obviously you must create two variance components which are not uniquely estimable, but whose sum is. Therefore, just screw R and G, go directly for V, and present the components without in any way calling them members of G or R - they really aren't when the model is constructed my way. Yep. ❝ The 2-trt, 3-seq, 3-per design is good, healthy and appropriate, until now it was just the statistical work on it that has been lagging. The post is an attempt at providing a solution. Kudos to your work! I still see no reason to perform a study in one of the partial replicates. Sorry. What is given as ‘justifications’:
Hence, everybody has to believe in the software. As a starter I would expect the same level of error handling and documentation as for the package replicateBE. Otherwise, assessors would be skeptical.

❝ The PE will be accessible via the est.b vector. You should be able to extract it (difference between the first two effects, one good reason to do X without intercept) after the model has converged.

In Section 2:
Final <- function(Pars) {
At the end:
PE <- Final(res$Estimate)
Gives:
PE: 1.372138
In PHX 1.372138 (Patterson/Jones got 1.37 in SAS). Great!
TODO: Get the Satterthwaite DFs for the CI.

❝ If you have a dataset that does not readily converge with SAS then I'd be eager to learn if it does in some flavour of my code.

See this post with links to John's datasets. Maybe there are others.
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes
ElMaestro ★★★ Denmark, 2020-07-14 05:22 (1555 d 01:06 ago) (edited on 2020-07-14 12:43) @ Helmut Posting: # 21691 Views: 19,225 |
Hi Hötzi,

❝ […] is the reason for the warnings because […]

I looked at this one by testing is.nan(B) and printing the components if/when it is. It happens right at the start for this data. The problem is not that X changes dimensions; X is the design matrix and this is generally never a square matrix, just full column rank with as many rows as observations. X does not change at all during the optimisation. All is in fact hunky dory when it happens, the variance estimates being:
[1] 0.06841615 0.03576077 0.01240137 0.05893007
(on my machine). That's nothing special, just a wee change from initial. The issue here is that the determinant suddenly becomes apparently negative. This is mysterious to me at this point, I am not much into matrices. I take it as a case of NM taking some unfortunate decision, which is then corrected afterwards. However this does give food for thought: if the parameter vector that gives a negative determinant is chosen as a starting value then NM will stop immediately and throw an error - you can try it with the vector above. Therefore, it may be desirable to ascertain positiveness of that determinant (and the others?) when starting the algo, and to slightly modify individual starter values in case of such suboptimally chosen starter values until Obj.F12 actually returns something meaningful. I believe the issue cannot be remedied by fiddling with X. I will be happy to hear a hardcore matrix algebraist's opinion here.
Update 8 hours later: I think CovM becomes noninvertible at select levels of a variance component, given the others, when/if a certain value of the variance component means that one or more columns (rows) add up to one or more other columns. At exactly that point CovM will drop in its rank and it becomes non-invertible; beyond the value that creates this phenomenon CovM is invertible but the determinant may switch sign. It is a phenomenon with a singularity in the determinant easily explained in the numbers, but I cannot interpret the higher perspective. I am grateful that NM in the implementation of optim apparently safeguards against this phenomenon as long as the first call to the objective is evaluable. Whodathunkit?
— Pass or fail! ElMaestro
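A rough sketch of such a safeguard, assuming an objective Obj.F12 that returns NaN (or throws) when the determinant goes negative; the helper name and the nudging rule are made up:

# Make sure the objective is evaluable at the chosen starting values before
# handing them to optim(); otherwise nudge them a little and retry.
safe.start <- function(ini, obj, max.tries = 50) {
  for (i in 1:max.tries) {
    val <- tryCatch(obj(ini), error = function(e) NaN)
    if (is.finite(val)) return(ini)
    ini <- ini * runif(length(ini), 0.9, 1.1)  # small random nudge
  }
  stop("no evaluable starting values found")
}
# usage (hypothetical): ini <- safe.start(Some.Initial.Guesses(D), Obj.F12)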
Helmut ★★★ Vienna, Austria, 2020-07-14 15:58 (1554 d 14:30 ago) @ ElMaestro Posting: # 21692 Views: 19,167 |
Hi ElMaestro, ❝ The problem is not that X changes dimensions, X is the design matrix and this is generally never a square matrix, just full column rank with as many rows as observations. X does not change at all during the optimisation. Ah, yes! ❝ The issue here is that the determinant suddenly becomes apparently negative. Might point towards a similar issue in PHX complaining about negative variance components and SAS forcing them to zero. ❝ This is mysterious to me at this point, I am not much into matrices. So am I. Sorry. ❝ I take it as a case of NM taking some unfortunate decision, which is then corrected afterwards. Luckily. ❝ However this does give food for thought: if the parameter vector that gives a negative determinant is chosen as a starting value then NM will stop immediately and throw an error - you can try it with the vector above. ❝ Therefore, it may be desirable to ascertain positiveness of that determinant (and the others?) when starting the algo, and to slightly modify individual starter values in case of such suboptimally chosen starter values until Obj.F12 actually returns something meaningful. But how? ❝ I believe the issue can not be remedied by fiddling with X. Agree. ❝ Update 8 hours later: … Sounds reasonable. But I think that we are going in circles. This story is all about RSABE where we have to go for ABE if swR <0.294 (or the mixed-model for ABE in general). The scaling part is easily doable in R. Since you found a solution for estimating swR and the p
For the imbalanced dataset of this post PHX gives me:
Δ   0.03084859
SEΔ 0.03256626
df  18.24154
and therefore, \(\small{CI=\exp (\Delta\mp t_{0.05,df}\times SE)=\left \{ 0.9747, 1.0912 \right \}}\).
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes
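With these three numbers the 90% CI is a one-liner in R (values taken from the PHX output above):

Delta <- 0.03084859   # T-R difference on the log scale
SE    <- 0.03256626   # its standard error
df    <- 18.24154     # Satterthwaite degrees of freedom
round(exp(Delta + c(-1, +1) * qt(0.95, df) * SE), 4)
# [1] 0.9747 1.0912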
ElMaestro ★★★ Denmark, 2020-07-14 17:07 (1554 d 13:21 ago) @ Helmut Posting: # 21693 Views: 19,036 |
Hi again Hötzi, still at a loss, how? From my own perspective, this was not about creating a package (roof, walls) but just to find a way to use an optimiser to find the solid estimate of swR for the TRR/RTR/RRT design (base), and this is what was deemed rather difficult at the outset. It succeeded when I expressed V directly and not via ZGZt+R. Did not take so much time as I had feared. Did you mean to say that now it was shown possible, there is a need for other components (and that these are the tricky ones)? I am not sure I can do it, not sure I can't, either. Did not look into it at all; for me this part is a non-issue, as is generalising the idea here to make it work for all other designs. Will be happy to take a look at the other stuff, too, if I have something to contribute with. For me, all this was just about the s2wR. For you ("we") maybe it is seen differently? Eager to learn of it, will see without being certain if I can produce something.
— Pass or fail! ElMaestro
Helmut ★★★ Vienna, Austria, 2020-07-14 17:32 (1554 d 12:56 ago) @ ElMaestro Posting: # 21695 Views: 19,190 |
Hi ElMaestro, ❝ still at a loss, how? Amazingly you solved an old riddle – which is only part of the game in RSABE. ❝ […] Did not take so much time as I had feared. Well, 60+ hours is a lot for a spare time project. Welcome to the club. ❝ Did you mean to say that now it was shown possible, there is a need for other components (and that these are the tricky ones?)? Exactly. I still don’t see how we could mimic the FDA’s mixed model for ABE in R. ❝ I am not sure I can do it, not sure I can't, either. Heads up. You could be the first succeeding in R. PharmCat produced something in Julia (see there). Duno the state of affairs. ❝ Will be happy to take a look at the other stuff, too, if I have something to contribute with. ❝ For me, all this was just about the s2wr. For you ("we") maybe it is seen differently? Eager to learn of it, will see without being certain if I can produce something. By “we” I’ve meant the community hoping for an open source solution. Sorry for the huge amount of time you’ve spent on this. swR is only needed as a decision point whether to go for RSABE or ABE. Great that we can get it now in R but the evaluation for ABE remains a mystery. — Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
ElMaestro ★★★ Denmark, 2020-07-13 12:13 (1555 d 18:14 ago) @ Helmut Posting: # 21681 Views: 19,448 |
Hi all, The solution I posted is a total winner. Where do I collect my medal? Will be happy to learn of performance on other datasets, in particular of differences between results or non-convergence (which is often something related to instability caused by the initial guess being too far from the optimum). Obviously, with this implementation the input format has to meet the expectation of the algo. For example, T and R coded as T and R, not A and B, in this implementation. All this is unrelated to the task of this thread and can be rather easily worked out as well. A slight change this morning, for the initial guesses, less clumsy and more likely (in the nonrestricted fashion ) to reflect the optimal solution:
Some.Initial.Guesses=function(foo) — Pass or fail! ElMaestro |
Helmut ★★★ Vienna, Austria, 2020-07-13 16:16 (1555 d 14:11 ago) @ ElMaestro Posting: # 21685 Views: 19,588 |
|
Hi ElMaestro,

❝ The solution I posted is a total winner. Where do I collect my medal?

Hey, it looks great indeed! However, that's only the foundation of the house. Walls, a door, windows, and a roof would be nice.

❝ Will be happy to learn of performance on other datasets, …

Try my code with
path <- "https://raw.githubusercontent.com/Helmut01/replicateBE/master/inst/extdata/DS04.csv"
That's an […]. PHX (convergence after 8 iterations without warnings, -2REML 292.14, AIC 314.14):
lambda(1,1) 0.43651691
Manual: lambda(1,1)^2 = var_bR = 0.1905470
My master's code:
Component   Estimate   Initial
Not that bad.

❝ … in particular of differences between results or non-convergence (which is often something related to instability caused by the initial guess being too far from the optimum).

See this post with links to John's nasty datasets. Will see whether I still have the one which failed completely both in SAS and PHX.

❝ A slight change this morning, for the initial guesses, less clumsy and more likely (in the nonrestricted fashion) to reflect the optimal solution:

Will check that later.
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
PharmCat ★ Russia, 2020-07-15 16:19 (1553 d 14:09 ago) @ ElMaestro Posting: # 21699 Views: 18,884 |
Hello all! As I understand, some code for a mixed-model solution was done? Great!

❝ Will be happy to learn of performance on other datasets, in particular of differences between results or non-convergence (which is often something related to instability caused by the initial guess being too far from the optimum).

Here you can find some self-generated (and some public) datasets for package validation, and results from SPSS here. Also you can make any random dataset with any design with Julia, docs here. Is it possible to make an easy performance comparison on my machine?

Edit: Full quote removed. Please delete everything from the text of the original poster which is not necessary in understanding your answer; see also this post #5! [Helmut]
ElMaestro ★★★ Denmark, 2020-08-02 19:39 (1535 d 10:48 ago) @ PharmCat Posting: # 21817 Views: 16,421 |
Hi PharmCat, ❝ Is is possible to make easy performance comparison on my machine? the code on this page would suffice for the comparison, however I am not going in the direction of estimating denominator DF's and calculating CI's. I am solely interested in redefining our specification of the covariance matrix so that the model that is being fit contains exactly the variance components we can realistically ask about for a 3-period semireplicated study. I have reached that goal. Also, I am writing an optimiser and it works bloody great! Not terribly fast but seems to get there in a very orderly fashion — Pass or fail! ElMaestro |
Helmut ★★★ Vienna, Austria, 2020-07-15 16:27 (1553 d 14:01 ago) @ ElMaestro Posting: # 21700 Views: 18,941 |
Hi ElMaestro, we were all blind! Just to make clear what I mean by "we": Myself, you, the EMA's PKWP (and the BSWP fabricating the example datasets and the evaluation by 'Method C'), and Patterson & Jones. Detlew read the guidance's Step 1 thoroughly enough to realize that \(\small{s_\textrm{wR}^2}\) is not an REML-estimate (see this post). Fortunately the templates for Phoenix are correct. Quick & dirty R-code at the end. Results:
file   descr   s2wR   swR   CVwR (%)   PHX match
In the Q&A we find for DSII the REML-based CVwR 11.5% (PHX 11.43%). Patterson & Jones reported for their dataset 61% (SAS); I got in PHX 61.96%. For DSI the Q&A gives 47.3% (PHX 46.96%). Therefore, congratulations to all of us answering a question which was not asked by the FDA.
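For the record, a sketch of that intra-subject-contrast calculation (not the quick & dirty code mentioned above, which is not reproduced here): per subject the difference of the two ln-transformed R administrations is regressed on sequence and half the residual mean square is taken as \(\small{s_\textrm{wR}^2}\). Column names are assumptions.

# dta: one row per subject with the two ln-transformed R observations
# (lnR1, lnR2) and the sequence; column names are made up for this sketch.
s2wR.isc <- function(dta) {
  dta$d <- dta$lnR1 - dta$lnR2              # intra-subject contrast of R
  fit   <- lm(d ~ sequence, data = dta)     # within-sequence Var(d) = 2 * s2wR
  s2wR  <- summary(fit)$sigma^2 / 2
  c(s2wR = s2wR, swR = sqrt(s2wR), CVwR.pct = 100 * sqrt(exp(s2wR) - 1))
}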
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
ElMaestro ★★★ Denmark, 2020-07-15 18:25 (1553 d 12:03 ago) @ Helmut Posting: # 21701 Views: 18,926 |
Hi again, to me it is exactly the opposite way around: a model with more than one variance component is a mixed model. I am of the impression that FDA and EMA have both been in a slight predicament with their guidances because they have both been "somewhat" aware of the specification/convergence issues with REML; hence both went in a direction with all effects fixed, as done explicitly by EMA and implicitly by FDA (when they use GLM for swR) in the guidance. My contribution here is solely to open someone's eyes to the fact that we may not need to rely on a normal linear model for s2wR in a TRR/RTR/RRT design. I would much rather like to think of this thread as a contribution to regulators defining the next iteration of guidelines/guidances. Whether they will see it as an advantage to use REML to derive s2wR now it is shown to be doable (not causing convergence issues, because it is not involving the silly and unestimable s2wT component) is of course their decision solely. I have done my part. For consideration further down the line, for those who enter discussions about estimator bias being catastrophic: which estimator is more biased when handling RRT/RTR/TRR, the s2wR you get from the linear model or the one you get via REML? What would your answer imply for any future recommendations for guidelines/guidances? (I am not a hardliner myself: I am usually thinking biased estimators are still useful estimators.) This thread already caught the attention of a number cruncher at an agency across the Atlantic unofficially, so I am OK with it all.
— Pass or fail! ElMaestro
Helmut ★★★ Vienna, Austria, 2020-07-15 19:33 (1553 d 10:54 ago) @ ElMaestro Posting: # 21702 Views: 18,915 |
Hi ElMaestro,

❝ to me it is exactly the opposite way around:
❝ A model with more than one variance component is a mixed model.

Not necessarily. It depends on what you believe [sic] is a random effect. Treatment, sequence, and period are fixed effects, right? IMHO, subject is random. When we think about interaction(s) we enter the gray zone, of course.

❝ I am of the impression that FDA and EMA have both been in a slight predicament with their guidances because they have both been "somewhat" aware of the specification/convergence issues with REML; hence both went in a direction with all effects fixed, as done explicitly by EMA and implicitly by FDA (when they use GLM for swR) in the guidance.

Well, the EMA's models are wacky because they assume equal intra-subject variances of T and R. An assumption which was shown to be false in many cases. Essentially most of the information obtainable in a replicate design is thrown away. I don't know why the FDA does not directly estimate \(s_{wR}^2\) from the mixed model but work with the intra-subject contrasts of R instead. More below.

❝ My contribution here is solely to open someone's eyes to the fact that we may not need to rely on a normal linear model for s2wR in a TRR/RTR/RRT design.

OK.

❝ I would much rather like to think of this thread as a contribution to regulators defining the next iteration of guidelines/guidances. Whether they will see it as an advantage to use REML to derive s2wR now it is shown to be doable (not causing convergence issues, because it is not involving the silly and unestimable s2wT component) is of course their decision solely. I have done my part.

With the current guidance one gets always an estimate of \(s_{wR}^2\). No problems in the RSABE part if swR ≥ 0.294. Only in the mixed model for ABE and the partial replicate designs one may run into the convergence issues. That's not resolved yet, unless one uses a simpler variance-covariance structure, which is against the guidance. But – again and again – there are cases where nothing helped, neither in SAS nor in Phoenix. Duno whether it is possible to specify Nelder-Mead in SAS or Phoenix. That's why I still hold that the partial replicates are lousy designs, because one might end up with no result for ABE unless one opts for the EMA's models (which are wrong, IMHO). OK, one step back. Since the mixed model for ABE has been recommended by the FDA for ages, I wonder what people have done when they faced convergence problems…

❝ For consideration further down the line, for those who enter discussions about estimator bias being catastrophic: which estimator is more biased when handling RRT/RTR/TRR, the s2wR you get from the linear model or the one you get via REML?

Good question, next question. Maybe REML…

❝ What would your answer imply for any future recommendations for guidelines/guidances?

Oh dear! I participated in all GBHI workshops. The discussions about reference-scaling (Rockville 2016, Amsterdam 2018) were disastrous. What do we have?
❝ (I am not a hardliner myself: I am usually thinking biased estimators are still useful estimators) Agree.
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
d_labes ★★★ Berlin, Germany, 2020-07-15 20:13 (1553 d 10:15 ago) @ Helmut Posting: # 21703 Views: 18,923 |
Hi Helmut, Hi ElMaestro,

❝ ...
❝ Well, the EMA's models are wacky because they assume equal intra-subject variances of T and R. An assumption which was shown to be false in many cases. Essentially most of the information obtainable in a replicate design is thrown away.

Full ACK!

❝ I don't know why the FDA does not directly estimate \(\small{s_\textrm{wR}^2}\) from the mixed model but work with the intra-subject contrasts of R instead. ...

I was always wondering about the different methodologies in the code of the progesterone guidance:
Politics? Nostalgia (the Proc MIXED code has been recommended for years)? Or the other way round: why not use the components of the RSABE criterion (PE and 90% CI, s2wR) from the mixed-model approach? I would opt for the full ISC approach because it may be unambiguously implemented in SAS, R, Phoenix and so on for every replicate design, full or partial.
— Regards, Detlew
Helmut ★★★ Vienna, Austria, 2020-07-16 13:11 (1552 d 17:17 ago) @ d_labes Posting: # 21709 Views: 18,628 |
Hi Detlew,

❝ Why not use the ISC estimate of T-R in the ABE decision in case of swR < 0.29356 (CVwR < 30%)?
❝ Politics? Nostalgia (the Proc MIXED code has been recommended for years)?

The guidance “Statistical Approaches to Establishing Bioequivalence” of January 2001 (where the same SAS code is given in APPENDIX E) is still in force. Hence, it is only consistent to give it in the progesterone guidance.

❝ I would opt for the full ISC approach because it may be unambiguously implemented in SAS, R, Phoenix and so on for every replicate design, full or partial.

Cannot agree more. All problems would vanish.
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes
ElMaestro ★★★ Denmark, 2020-07-16 01:06 (1553 d 05:22 ago) @ Helmut Posting: # 21705 Views: 18,734 |
Hi Hötzi,

❝ ❝ to me it is exactly the opposite way around:
❝ ❝ A model with more than one variance component is a mixed model.
❝ Not necessarily. It depends on what you believe [sic] is a random effect. Treatment, sequence, and period are fixed effects, right? IMHO, subject is random. When we think about interaction(s) we enter the gray zone, of course.

May I offer the completely opposite view? Search the forum for "bogus statement", check the SAS documentation for what the statement actually does to Proc GLM: when Proc GLM is used for 222BE with or without the bogus statement it still means a normal linear model is fit (even when the theory in e.g. C&L calls it "random"). There is a single variance component in such fits. If you wish to verify it: there is a model matrix with columns for subjects; check the dfs for such models, they would be (much) higher if subjects were fit as random. Conversely, if subject were random, subject would not appear as a factor in the ANOVA.
— Pass or fail! ElMaestro
Helmut ★★★ Vienna, Austria, 2020-07-16 12:59 (1552 d 17:28 ago) @ ElMaestro Posting: # 21708 Views: 18,646 |
Hi ElMaestro,

❝ May I offer the completely opposite view? Search the forum for "bogus statement", check the SAS documentation for what the statement actually does to Proc GLM: […]

I know, I know. I was talking about a mixed model and not the way SAS uses the RANDOM statement in PROC GLM in order to get the DFs right. A true all-fixed-effects model – with subjects instead of the crazy subjects(sequence) – requires unique coding of subjects. In practice always the case (OK, maybe not in multi-center studies, but then you have center as another fixed effect). I recently reviewed a manuscript where subjects in each sequence had the same codes. The authors obviously had no clue about how things work. BTW, some agencies require that in crossovers CVintra and CVinter are reported. The latter is not possible when subject is a fixed effect.
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes
mittyri ★★ Russia, 2020-07-19 02:42 (1550 d 03:45 ago) @ Helmut Posting: # 21745 Views: 18,137 |
Hi Helmut, ❝ Detlew read the guidance’s Step 1 thoroughly enough to realize that \(\small{s_\textrm{wR}^2}\) is not an REML-estimate (see this post). Fortunately the templates for Phoenix are correct. are you referring to that template for RSABE? — Kind regards, Mittyri |
Helmut ★★★ Vienna, Austria, 2020-07-19 03:45 (1550 d 02:43 ago) @ mittyri Posting: # 21747 Views: 18,570 |
Hi mittyri, ❝ are you referring to that template for RSABE? Nope. I was using “FDA RSABE Project template_ v1.4.phxproj” of 2014-11-11 (should be accessible at https://www.certarauniversity.com/store/637337-106-fl-free-phoenix-templates-for-bioequivalence-regulatory-guidances but my credentials were not accepted (why the heck?) and when I tried to reset the password I didn’t receive the confirmation e-mail…). Relevant part:
The main difference to v1.3 is that v1.4 uses the function chiinv(0.95,df) further down, which was not available in PHX 6.3 and a table of values had to be used instead. However, with v1.3 you get the same result as in this post. PS: Ref. 35 of doi:10.1208/s12248-020-0427-6 leads to limbo. Not very professional. I asked Certara's support to establish a permanent redirect on the server.
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes
ElMaestro ★★★ Denmark, 2020-07-24 12:07 (1544 d 18:21 ago) @ ElMaestro Posting: # 21783 Views: 17,330 |
Hi all, to make the objective function return the likelihood, use e.g.:
Obj.F12=function(Pars)
The optimised object (F in my original code) will hold the optimised likelihood as F$value. Opportunity: will run a tiny bit faster if k is public, and if solve(CovM) is done a single time. For example, at reltol=1e-8 I get with EMA dataset II a value of 15.33728. This matches exactly what SAS reports, see post # 21670 above.
— Pass or fail! ElMaestro
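For reference, the textbook form of such a REML objective, written so that solve(CovM) is indeed called only once per evaluation; a sketch assuming a design matrix X, the response y and a builder Build.CovM(Pars) are in scope, not a drop-in replacement for the code above:

REML.LL <- function(Pars) {
  V    <- Build.CovM(Pars)                       # assumed helper: big covariance matrix
  Vinv <- solve(V)                               # invert once, reuse below
  XtVX <- t(X) %*% Vinv %*% X
  beta <- solve(XtVX, t(X) %*% Vinv %*% y)       # GLS estimate given V
  r    <- y - X %*% beta
  k    <- (nrow(X) - ncol(X)) * log(2 * pi)      # constant; convention-dependent
  as.numeric(-0.5 * (determinant(V, logarithm = TRUE)$modulus +
                     determinant(XtVX, logarithm = TRUE)$modulus +
                     t(r) %*% Vinv %*% r + k))
}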
ElMaestro ★★★ Denmark, 2020-07-24 14:52 (1544 d 15:36 ago) @ ElMaestro Posting: # 21785 Views: 17,319 |
Obj.F12=function(Pars)
… this is exactly also the REML likelihood function that SAS defines.
— Pass or fail! ElMaestro
PharmCat ★ Russia, 2020-08-03 16:24 (1534 d 14:04 ago) @ ElMaestro Posting: # 21820 Views: 16,429 |
❝ ....this is exactly also the REML likelihood function that SAS defines. Hello! Do you calculate REML for all data at once or calculate REML subject by subject? Is it possible to use function caching in R? |
ElMaestro ★★★ Denmark, 2020-08-04 00:55 (1534 d 05:32 ago) @ PharmCat Posting: # 21821 Views: 16,323 |
Hi Pharmcat, ❝ Do you calculate REML for all data at once or calculate REML subject by subject? REML is not generally done on a per-subject basis. Each model is assumed to have an optimal fit, given the data and the variance/covariance structure. The optimal fit, when putatively found, corresponds to the fixed effects and the variance components, and the log likelihood serves as objective function. I am sure I am missing something clever in your question. Can you describe what you meant? ❝ Is it possible to use function caching in R? You can always write some public members to that effect where necessary and appropriate. I don't think R has a built-in optimizer which takes care of it as the language is interpreted, but other users of this forum should confirm; I am not expert on the inner workings of R. I do not see how this would affect the REML process - can you describe with more words what your perspective on caching was, then we'll possibly figure out something? — Pass or fail! ElMaestro |
PharmCat ★ Russia, 2020-08-05 03:41 (1533 d 02:47 ago) @ ElMaestro Posting: # 21823 Views: 16,226 |
❝ Can you describe what you meant?

Hello ElMaestro! The logREML of all data is just the sum of the individual subjects' logREML contributions. See Mary J. Lindstrom and Douglas M. Bates, Newton-Raphson and EM Algorithms for Linear Mixed-Effects Models for Repeated-Measures Data. You can make a big sparse V matrix for all data and then invert it (solve(CovM) in your code above), or you can take the individual Xi and Vi and calculate the logREML for each subject and sum. Problem 1: when you have a big matrix, inverting is slow as hell, but if you invert by blocks (you can do that because it is block-diagonal) it can be done faster. Problem 2: in the BE case you have many equal Zi matrices, so in the logREML calculation (if you calculate by subject) you invert one and the same matrix many times (because V = ZGZ'+R; Z can be different, but G and R do not change from subject to subject). If you cache the result, you can seriously increase performance. So, I didn't read all the R code (sorry) and I didn't understand which approach is used.
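Base R has no automatic memoisation, but the caching described here can be done by hand, e.g. keyed by sequence, so that each distinct block is inverted only once per objective-function call; a sketch with an assumed helper Build.Vi(seq, Pars):

# Invert each distinct 3x3 block once and re-use it for every subject with
# that sequence (recompute this list at the start of every REML evaluation,
# because the blocks change with the parameters).
seqs   <- c("RTR", "RRT", "TRR")
Vi.inv <- lapply(setNames(seqs, seqs), function(s) solve(Build.Vi(s, Pars)))
# inside the per-subject loop the inverse is then only a lookup, e.g.:
# Vinv.i <- Vi.inv[[seq.of.subject.i]]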
ElMaestro ★★★ Denmark, 2020-08-05 10:13 (1532 d 20:14 ago) @ PharmCat Posting: # 21824 Views: 16,257 |
Thanks PharmCat, funny, I read the ref. you mention (Lindstrom & Bates) yesterday and I did not understand much of it. Will try again.

❝ Problem 1: when you have a big matrix, inverting is slow as hell, but if you invert by blocks (you can do that because it is block-diagonal) it can be done faster.

Sounds neat, will check if I can somehow do such a thing.

❝ Problem 2: in the BE case you have many equal Zi matrices, so in the logREML calculation (if you calculate by subject) you invert one and the same matrix many times (because V = ZGZ'+R; Z can be different, but G and R do not change from subject to subject). If you cache the result, you can seriously increase performance.

My approach is very simple: I make a case for saying that when T is not replicated while R is replicated, then it is not possible to make a meaningful model when V = ZGZt+R (as long as we count on R having the withins and G having the betweens). If we insist on ZGZt+R then we get irrelevant variance components associated with T, whichever way we parameterise the covariance matrix. Therefore, I am working in V directly, meaning I am writing the blocks into it directly. This may still be compatible with your proposal, though.
— Pass or fail! ElMaestro
PharmCat ★ Russia, 2020-08-05 18:37 (1532 d 11:51 ago) (edited on 2020-08-06 00:34) @ ElMaestro Posting: # 21825 Views: 16,237 |
Hello!

❝ Sounds neat, will check if I can somehow do such a thing.

I found it: Lindstrom & Bates, equation 3.4, or J. Gurka, 2006, equation 5. Some experiment:
m = matrix(c(5, 1, 0, 0, 1, 2, 0, 0, 0, 0, 5, 3, 0, 0, 3, 4), nrow = 4, ncol = 4)
So Vi can be the result of a function ƒ(Zi, θ); Zi is known for all subjects, θ is the vector of optimisable parameters.

❝ My approach is very simple: I make a case for saying that when T is not replicated while R is replicated, then it is not possible to make a meaningful model when V = ZGZt+R (as long as we count on R having the withins and G having the betweens).

Mmm... I can't say that ZGZ'+R is meaningful. If you construct V manually it can still be represented as ZGZ'+R if ρ is not equal to 0. If a random model with covariation (ρ≠0) is used, the non-diagonal elements of G depend on the diagonal, and when you construct V some non-diagonal elements depend on both variance parameters. Then we add the diagonal R matrix and have the combinations (parts of V): σ²BT+σ²WT, σ²BR+σ²WR, σ²BR, σBT·σBR·ρ (we have no σ²BT on its own), so we can estimate σ²WT from σ²BT+σ²WT, and σ²BT is stable because it is part of σBT·σBR·ρ (yes, the estimate of σ²WT is indirect). If ρ = 0 we have σBT·σBR·ρ = 0, and we can estimate only σ²BT+σ²WT. And I think that V can be decomposed to ZGZ'+R if you know Z, ρ (ρ≠0) and σ²BR in a model with covariance and diagonal R.
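Spelled out for a subject with sequence TRR, the per-subject block under this parametrisation (random T and R subject effects with correlation ρ plus a diagonal within-subject part) would be

\[
V_i=\begin{pmatrix}
\sigma_{BT}^{2}+\sigma_{WT}^{2} & \rho\,\sigma_{BT}\sigma_{BR} & \rho\,\sigma_{BT}\sigma_{BR}\\
\rho\,\sigma_{BT}\sigma_{BR} & \sigma_{BR}^{2}+\sigma_{WR}^{2} & \sigma_{BR}^{2}\\
\rho\,\sigma_{BT}\sigma_{BR} & \sigma_{BR}^{2} & \sigma_{BR}^{2}+\sigma_{WR}^{2}
\end{pmatrix}
\]

so σ²BT never appears on its own: it enters only through the sum on the T diagonal and through the ρ·σBT·σBR terms, which vanish when ρ = 0.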
ElMaestro ★★★ Denmark, 2020-08-06 23:11 (1531 d 07:17 ago) @ PharmCat Posting: # 21827 Views: 15,951 |
Hi PharmCat, I really like your idea. I don't have a good understanding of this as a whole, but I know that when I work a little with these things they become easier for me to grasp, so I am giving it a try. Let us turn to Gurka 2006, equations 3 and 5, remembering that the rank of our overall design matrix X is 6: Gurka requires me to derive Xi, the design matrix for fixed effects of subject i. They mention that "Xi is a (ni × p) known, constant design matrix for the i-th subject with rank p". At just three observations per subject we cannot have a rank of p=6 for Xi. The maximum rank is naturally 3. I don't see a way to handle Xi on a subject-by-subject basis, but perhaps you have some insight. Possibly I could do it by lumping subjects together so that the "smaller" X of sorts still has more rows than columns and thus a rank of p=6. Do you see my issue with Xi? Could you indicate your view on Xi here? Is lumping subjects together a way to go to ensure the right rank of the reduced design matrices? Many thanks in advance.
— Pass or fail! ElMaestro
PharmCat ★ Russia, 2020-08-07 02:02 (1531 d 04:26 ago) @ ElMaestro Posting: # 21828 Views: 15,953 |
Hi ElMaestro! ❝ Gurka requires me to derive Xi, the design matrix for fixed effects of subject i. ❝ They mention that "Xi is a (ni × p) known, constant design matrix for the i' th subject ❝ with rank p" I think it means rank of whole X, because after: "and β is a (p × 1) vector of unknown, constant population parameters". Xi is a part of X and have same count of columns, rank Xi can be lower. Here is no contradictions I think. If look at Xiuming Zhang, 2015* p is just length of β and rank should be the same. If not, we will have rank-deficient matrix. And if look generally - it really doesn't matter because -(N-p)*log(2π)/2 is a constant and it can be excluded from optimization. If you want find AIC ets - this will matter. And as I know some softaware using different N for calculation (some use all observations, some only statistically independent (m)), and some include second constant part: logdet(Σim(X'X))/2 So if you want find only θ, you can minimize REML without constants at all. And add constants when you need to have real value of REML or AIC, BIC, ets. *A Tutorial on Restricted Maximum Likelihood Estimation in Linear Regression and Linear Mixed-Effects Model Example N=3, n=9, p=6 X: β1 β2 β3 β4 β5 β6 X1 β1 β2 β3 β4 β5 β6 X2 β1 β2 β3 β4 β5 β6 X3 β1 β2 β3 β4 β5 β6 |
ElMaestro ★★★ Denmark, 2020-08-07 09:49 (1530 d 20:39 ago) @ PharmCat Posting: # 21829 Views: 15,972 |
Hi PharmCat,

❝ And looking at it generally, it really doesn't matter, because -(N-p)*log(2π)/2 is a constant and it can be excluded from the optimization. If you want to find AIC etc., this will matter.

I am afraid you missed my point. The issue is not related to constants. As you say, they are not of importance for finding the maximum, which will just be vertically displaced. Xi is the design matrix for fixed effects for the i-th subject. Thus it has a rank of at most 3 and never more. This means we can at most use it to work with three parameter estimates. For example an intercept, a treatment effect, a period effect (within any given subject). Thus we cannot use Xi - if i is a single subject - during the optimization for the RTR/TRR/RRT design. Applying Gurka's equations will not work, and I am looking for a way to work on derived versions of X (like for a few subjects at a time?) so that X retains its rank of 6 (in this design).

❝ So if you want to find only θ, you can minimize the REML without constants at all, and add the constants when you need the real value of the REML or AIC, BIC, etc.

Don't worry, the constants are of no importance for finding the variance components.

❝ Example: N=3, n=9, p=6

These examples of X are possibly not very illustrative; as you can see, columns 2+3 add up to column 1, the intercept, columns 4+5+6 add up to that as well, and so forth. X is column-wise not of full rank.
— Pass or fail! ElMaestro
PharmCat ★ Russia, 2020-08-07 13:29 (1530 d 16:59 ago) @ ElMaestro Posting: # 21830 Views: 16,036 |
Hi ElMaestro! I thought you asked about p, sorry :)

❝ Xi is the design matrix for fixed effects for the i-th subject.
❝ Thus it has a rank of at most 3 and never more. This means we can at most use it to work with three parameter estimates. For example an intercept, a treatment effect, a period effect (within any given subject).

I think Xi does not need to be a full-rank design matrix. It is just the part of X for subject i. Moreover, all Xi should have the same width (number of columns) for Σ(Xi'Vi⁻¹Xi). Also, with (yi - Xi*β) you get the residuals ri (the number of columns of Xi should equal the length of β). The description in Gurka's paper is not very accurate, but it provides a good concept. If you check, it should work.
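In R the accumulation could be sketched like this (lists Xi.list, Vi.inv.list and yi.list holding the per-subject pieces are assumed; the names are made up):

# Gurka's equation 3: beta = (sum Xi' Vi^-1 Xi)^-1 (sum Xi' Vi^-1 yi).
# Each Xi may be rank-deficient on its own; only the accumulated M1 has to
# be invertible.
p  <- ncol(Xi.list[[1]])
M1 <- matrix(0, p, p)
M2 <- matrix(0, p, 1)
for (i in seq_along(Xi.list)) {
  XtVinv <- t(Xi.list[[i]]) %*% Vi.inv.list[[i]]
  M1 <- M1 + XtVinv %*% Xi.list[[i]]
  M2 <- M2 + XtVinv %*% yi.list[[i]]
}
beta <- solve(M1, M2)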
ElMaestro ★★★ Denmark, 2020-08-07 15:08 (1530 d 15:20 ago) @ PharmCat Posting: # 21831 Views: 15,883 |
Hi PharmCat, thanks for trying to assist on this. It is kind of you.

❝ I think Xi does not need to be a full-rank design matrix. It is just the part of X for subject i. Moreover, all Xi should have the same width (number of columns) for Σ(Xi'Vi⁻¹Xi). Also, with (yi - Xi*β) you get the residuals ri (the number of columns of Xi should equal the length of β).
❝ The description in Gurka's paper is not very accurate, but it provides a good concept.
❝ If you check, it should work.

I can't make it work. In Gurka's equation 3, where we can write the shorthand beta = M1⁻¹ * M2: when using the little 3-by-6 matrices for Xi, M1 becomes non-invertible. I tried to lump 6 subjects together at a time, and now I am getting an estimate of beta, but it appears wrong, in the sense that it is not corresponding to what I get for the full model. I wonder if Gurka's equation only works under assumptions not mentioned in the paper or if I am just a very lousy programmer. My money is clearly on the latter.
— Pass or fail! ElMaestro
PharmCat ★ Russia, 2020-08-07 17:42 (1530 d 12:46 ago) @ ElMaestro Posting: # 21833 Views: 16,182 |
Hi ElMaestro!

❝ I tried to lump 6 subjects together at a time, and now I am getting an estimate of beta, but it appears wrong, in the sense that it is not corresponding to what I get for the full model.

If you calculate β with the initial V, it will be "wrong": it is only the estimate for the current V; then you calculate the REML and optimize θ... and then again... recalculate β, etc. You can get an initial β as β = inv(qr(X).R)*qr(X).Q'*y. But if you have a wrong β at the final iteration, that is a bad case. You can find the same equation in Lindstrom & Bates, 1.2 (where X is not a design matrix but the vector of the Xi, IMHO). If you provide me data from 6 subjects I can make the calculations in Julia. Magical linear algebra in Julia:
using LinearAlgebra |
ElMaestro ★★★ Denmark, 2020-08-07 18:44 (1530 d 11:43 ago) @ PharmCat Posting: # 21834 Views: 15,845 |
Hi PharmCat.

❝ If you calculate β with the initial V, it will be "wrong".

Correct, and that is not what I am doing. For every iteration through the optimizer I am deriving a new beta vector; this is beta given V. The loglikelihood is calculated on the basis of V and beta, where beta is derived from V via X. Therefore LL can be said to be a function of V (or of the variance components in it). There are two alternative ways I can get that beta vector for any given V:
1. I can do it the simple way with the very big X. It works well.
2. I can perhaps (= I should be able to?) do it by partitioning X into small bits, Xi, like in Gurka equation 3, or Lindstrom & Bates. But I cannot get the same beta vector for a given set of variance components in V. The - or a - reason is that if I reduce X to Xi (individual subjects) and do all the summation, then the M1 written above is not invertible. This, in turn, appears to be because the individual Xi's are not of the same rank as X.
If I lump Xi's together like the first 6 subjects, then the next 6 subjects, and so forth, and use Gurka's equation 3, then I can generate a beta estimate (from 4 chunks of 6 subjects), but it is still not the same as in pt 1, for any given set of components in V. I am not even at a final iteration, I am just at any iteration where I am comparing beta via the two methods for a set of starting values, or for that matter, if I plug in the values that come out as optimal values from pt 1 above after everything has converged.

❝ It is only the estimate for the current V; then you calculate the REML and optimize θ... and then again... recalculate β, etc. You can get an initial β as β = inv(qr(X).R)*qr(X).Q'*y

You do not need this, since you do not need starting values for beta, cf. the LL above. You only need starter values for theta (the variance components in V). Every pass through the REML LL equation generates a beta estimate from the variance components, given X. Once the algo has converged, the estimate of beta is final. All this is immaterial to my challenge at hand.

❝ But if you have a wrong β at the final iteration, that is a bad case.

This is not about the "final iteration". It is about any iteration. For any iteration, the two ways to derive beta result in different values. Therefore something is wrong.

❝ You can find the same equation in Lindstrom & Bates, 1.2 (where X is not a design matrix but the vector of the Xi, IMHO)
❝ If you provide me data from 6 subjects I can make the calculations in Julia.
Here all subjects are (y is logarithmised): Subj Seq Trt Per y 1 RTR R 1 8.307361 1 RTR T 2 8.286622 1 RTR R 3 8.229191 2 RTR R 1 8.001757 2 RTR T 2 7.774351 2 RTR R 3 7.939016 3 RRT R 1 8.150295 3 RRT R 2 8.113786 3 RRT T 3 8.301224 4 TRR T 1 8.319961 4 TRR R 2 8.068152 4 TRR R 3 8.243703 5 TRR T 1 8.469640 5 TRR R 2 8.421255 5 TRR R 3 8.278936 6 RRT R 1 8.023159 6 RRT R 2 8.015393 6 RRT T 3 7.791358 7 RTR R 1 7.836054 7 RTR T 2 8.030084 7 RTR R 3 7.993823 8 TRR T 1 7.698483 8 TRR R 2 7.621391 8 TRR R 3 7.609862 9 RTR R 1 8.444106 9 RTR T 2 8.333174 9 RTR R 3 8.131531 10 TRR T 1 7.708949 10 TRR R 2 7.766586 10 TRR R 3 7.705803 11 RRT R 1 7.530373 11 RRT R 2 7.701833 11 RRT T 3 7.780888 12 RRT R 1 7.731229 12 RRT R 2 8.061613 12 RRT T 3 8.275682 13 RRT R 1 7.878686 13 RRT R 2 7.795811 13 RRT T 3 7.961789 14 RTR R 1 8.016582 14 RTR T 2 7.807835 14 RTR R 3 7.996452 15 RTR R 1 7.720639 15 RTR T 2 7.598299 15 RTR R 3 7.910003 16 TRR T 1 7.992809 16 TRR R 2 8.143808 16 TRR R 3 8.114504 17 TRR T 1 7.781890 17 TRR R 2 7.885856 17 TRR R 3 7.683404 18 RRT R 1 7.910224 18 RRT R 2 7.939373 18 RRT T 3 8.054078 19 RTR R 1 7.791027 19 RTR T 2 7.919283 19 RTR R 3 7.825645 20 RRT R 1 7.886983 20 RRT R 2 7.982689 20 RRT T 3 8.018691 21 RRT R 1 7.961928 21 RRT R 2 7.888485 21 RRT T 3 8.029107 22 TRR T 1 7.989221 22 TRR R 2 7.981665 22 TRR R 3 7.956967 23 TRR T 1 8.056680 23 TRR R 2 8.066396 23 TRR R 3 8.174308 24 RTR R 1 7.536257 24 RTR T 2 7.500419 24 RTR R 3 7.912350 — Pass or fail! ElMaestro |
PharmCat ★ Russia, 2020-08-07 20:14 (1530 d 10:13 ago) (edited on 2020-08-08 01:12) @ ElMaestro Posting: # 21835 Views: 15,882 |
|
❝ Here all subjects are (y is logarithmised)

# Theta estimate θ was taken by REML
using CSV, DataFrames, StatsModels, StatsBase

I get beta:
7.904258681084915
0.0547761264037151
0.05092362547466389
0.0012959346740553102
0.048118895192829976
0.02239133333333365
ElMaestro ★★★ Denmark, 2020-08-07 20:23 (1530 d 10:05 ago) @ PharmCat Posting: # 21836 Views: 15,868 |
Hi PharmCat, it is kind of you, but this looks like apples compared to pears. The whole point is that G and R are meaningless, so I cannot compare to an output generated with a model involving G and R. I am doing V directly, because we do not work with withins and betweens for T. That is why you have five elements in theta (T is not replicated, but you are trying to fit both a within and a between variance for it). I am saying we do not work on them both; instead we replace their sum by a single variance estimate. This is the whole point of the thread here. Note also, I am not after correlations per se. Therefore I am writing the variances directly into the covariance matrix (perhaps like you are used to if you work with compound symmetry of the heterogeneous type). If you are looking at a final log-likelihood around 15.337 then I think we are at least talking about the same model.
— Pass or fail! ElMaestro
ElMaestro ★★★ Denmark, 2020-08-07 23:31 (1530 d 06:56 ago) @ ElMaestro Posting: # 21837 Views: 15,915 |
I think I made beta work now, thanks a lot, PharmCat. It appears that beta can be constructed even if the local Xi is not of rank p; I guess it must be a little typo in Gurka's paper. Anyways, I reprogrammed the lines, and now I get conformable matrices, and beta is the same whichever way I calculate it. May try and see if I can do some comparisons about the performance; I still need a lot of code to put it all into a function. Thanks PharmCat, pointing me towards Gurka and insisting on Xi was a great help.
— Pass or fail! ElMaestro
PharmCat ★ Russia, 2020-08-08 03:08 (1530 d 03:20 ago) @ ElMaestro Posting: # 21838 Views: 15,741 |
Hi ElMaestro!

❝ That is why you have five elements in theta (T is not replicated, but you are trying to fit both a within and a between variance for it). I am saying we do not work on them both; instead we replace their sum by a single variance estimate. This is the whole point of the thread here.

I understand your point. And generally your variance-component function is better because it is not redundant. On the other hand, if we have different functions V = ƒ1(θ1) and V = ƒ2(θ2), it can be any of them, with any θ, as long as we get an equal V (so it could be CSH, FA0(2), or a manual construct).

❝ If you are looking at a final log-likelihood around 15.337 then I think we are at least talking about the same model.

I have 15.337299690629834; the model is the same. It is good proof: different V construction methods lead to the same results.

❝ I guess it must be a little typo in Gurka's paper.

From time to time I find inaccuracies even in good articles, and sometimes it gives a strong headache when you try to reproduce some methods directly. log(Σ|X|) instead of log(|ΣX|) can give hours of frustration.

❝ Thanks PharmCat, pointing me towards Gurka and insisting on Xi was a great help.

I am very glad if my thoughts were useful
ElMaestro ★★★ Denmark, 2020-08-08 14:25 (1529 d 16:03 ago) @ PharmCat Posting: # 21839 Views: 15,751 |
I am seeing about a 30% speed improvement on my machine when using arrays of matrices. I am using lists for that purpose; that's the only way I can think of. Further improvement may be possible, because I may only need to store principally three different matrices in these arrays (per array) and keep track of which subjects use which matrices (that would be defined by their sequences), but this, I think, is for now not a priority for me. A further option is to use only one kind of matrix and then reorder all periods in the global data list, so that all subjects have e.g. R-T-R; then we'd need to work with only a single matrix type in the profiles. This may improve speed further. Will think about it... 30%... It ain't much, but it's honest work.
— Pass or fail! ElMaestro
PharmCat ★ Russia, 2020-08-08 19:27 (1529 d 11:01 ago) @ ElMaestro Posting: # 21840 Views: 15,730 |
❝ 30%... It ain't much, but it's honest work.

It is a very good boost. I don't know how matrix multiplication works in R, but some improvements can be made with a function A = B'*C*B or A += B'*C*B. At each step of this function there are many intermediate products that can be avoided with direct computation. (Or make it in C.) Example incremental function:
""" julia> @time mulsimple!(θ, β, A, B, C); 0.000007 seconds (10 allocations: 1.172 KiB) julia> @time mulθβinc!(θ, β, A, B, C); 0.000004 seconds (2 allocations: 144 bytes) eightfold memory savings |
ElMaestro ★★★ Denmark, 2020-08-08 20:10 (1529 d 10:18 ago) @ ElMaestro Posting: # 21841 Views: 15,671 |
Hi again, now it gets really interesting. I made two observations:
1. There is this 30% improvement if I switch to the profile log-likelihood instead of "the big one", for lack of better wording.
2. I can achieve an improvement of several thousand percent if I stay with the big X and V; however, I must then use a smart tactic: I am only generating V once, but through each pass of the optimiser I am updating V rather than re-writing it. Instead of iterating across all rows and columns and checking in the data list where to put the individual components, I am initially generating a list of coordinates for each variance component. Something like this:
V.Coordinates=function(Data)
This returns a list with four components. The first component is a set of coordinates in V for varT, the second is a set of coordinates for varBR. And so forth. Then, in the REML function, I am having a block like this:
for (i in 1:nrow(VC$xy.varT))
And now we are simply talking REAL BUSINESS. The optimiser is returning blazingly fast with an REML estimate and corresponding variance components. Depending on the tolerance required it is almost instantaneous (<0.5 secs) on my laptop, and that machine isn't high-tech equipment at all. The speed with the "big V" far outruns the profile approach. I believe this implies that matrix inversion is done in a clever fashion in R's solve function; as you pointed out, PharmCat, inversion can be slow (and unstable) as matrices get large, but perhaps the LAPACK routine used by R reduces the task intelligently, possibly through some of the same tricks described for e.g. the Alglib routines. However, I am also seeing a different little thing: if I optimise with the profile approach I can make BFGS converge, but this will not work if I try it in the "big V" approach. Lindstrom and Bates noticed something similar, in the sense that they wrote that quasi-Newton methods perform better when the profile approach is used: "We recommend optimizing the profile log-likelihood, since it will usually require fewer iterations, the derivatives are somewhat simpler, and the convergence is more consistent. We have also encountered examples where the NR algorithm failed to converge when optimizing the likelihood including a but was able to optimize the profile likelihood with ease." So, lots of options, and I am not going to be late for supper.
— Pass or fail! ElMaestro
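A rough sketch of that coordinate trick with assumed names (the actual V.Coordinates and the loop above are not reproduced here): record once where each component lives in the big V, then overwrite only those cells on every pass of the optimiser.

# 'is.varT', 'is.varBR', 'is.varWR' are assumed logical masks of the same
# dimension as V, marking where each component sits.
VC <- list(xy.varT  = which(is.varT,  arr.ind = TRUE),
           xy.varBR = which(is.varBR, arr.ind = TRUE),
           xy.varWR = which(is.varWR, arr.ind = TRUE))
update.V <- function(V, Pars, VC) {
  V[VC$xy.varT]  <- Pars[1]            # total variance of T on the T diagonal
  V[VC$xy.varBR] <- Pars[2]            # between-R variance off the diagonal
  V[VC$xy.varWR] <- Pars[2] + Pars[3]  # R diagonal: between-R plus within-R
  V                                    # further components get their own masks
}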
PharmCat ★ Russia, 2020-08-09 20:22 (1528 d 10:06 ago) @ ElMaestro Posting: # 21844 Views: 15,435 |
❝ I believe this implies that matrix inversion is done in a clever fashion in R's solve function; as you pointed out, PharmCat, inversion can be slow (and unstable) as matrices get large, but perhaps the LAPACK routine used by R reduces the task intelligently, possibly through some of the same tricks described for e.g. the Alglib routines.

It is a very good question to explore. If we suppose that some routine knows something about the structure, it would take a lot of work to recognize a block-diagonal and/or symmetric matrix structure. I think the main cause is the overhead of each function call. Another problem: many intermediate products for %*%.
PharmCat ★ Russia, 2020-08-10 13:48 (1527 d 16:40 ago) @ ElMaestro Posting: # 21849 Views: 15,378 |
Hi ElMaestro,

❝ now it gets really interesting. I made two observations:

I made some tests, and the results are surprising. First code:
using BlockDiagonals
b1 1.109 ms vs b2 3.751 ms, so R solves the block-diagonal matrix more than twice as fast. In Julia b3 takes 2.707 ms: solving a random matrix in Julia is faster than in R, but slower than b1 when the matrix is block-diagonal. Second code:
using RCall
Solving by blocks in R (b5, 3.275 ms) is really slower (only a little faster than solving a random matrix). But solving a block-diagonal matrix in Julia when the structure is described takes 56.484 μs (b3 in the second code); this is 20 times faster than in R. I'm trying to discuss this in the Julia forum here. I'm not so experienced in R, but I think that Solve.block can increase performance for block-diagonal matrices, docs here.