Helmut
★★★

Vienna, Austria,
2021-08-27 15:51
(20 d 05:38 ago)

Posting: # 22550
Views: 643

Dear all (and esp. ElMaestro),

with great interest I read two recent papers.1,2

On the author posted:
A short time back I had a paper published showing that a major issue with current statistical software is that the model is mis-specified when one uses a mixed model to evaluate a semi-replicate BE trials. […]

One of the conclusions drawn is that if we want to evaluate the data with a mixed model without running into model/convergence issues then we need to specify the covariance matrix $$\small{V }$$ without using $$\small{V = ZGZ^t + R}$$.

Even so there is still no recipe for deriving the confidence interval. Hence, the options with the mixed model are:
1. Work with the wrong model and the software may not converge.
2. Work with the right model but major stats packages do not support it, and
they do not derive a CI for T/R.
That is an agonizing choice. This has huge practical implications.

I absolutely agree with this statement.
However, what can we do? We need the bloody confidence interval. At the end of the day we are  left out in the rain if we are dealing with a partial replicate design and the study has to be evaluated for average bioequivalence (according to the ’s wishful thinking if $$\small{s_\textrm{wR}<0.294}$$).
To obtain the variance components which are estimable and the point estimate is just not enough. As long as we can’t get the standard error of the treatment difference and its associated degrees of freedom, we are at a loss. Hence, I repeat my mantra:

Avoid the partial replicate design
if intended for RSABE and if
there is even the slightest chance that you have to assess it for ABE.

Of note: I used the supplementary script and got for

res <- Run.REML() PE  <- 100 * exp(res$Fixed.eff[["TrtT", 1]] - res$Fixed.eff[["TrtR", 1]]) print(res); cat("PE", PE, "\n") $Outcome [1] "The model converged."$LogLikelihood [1] 15.33728 $Fixed.eff [,1] TrtT 8.02569184 TrtR 8.00330042 Per1 -0.04811877 Per2 -0.04682231 SeqRTR 0.00385480 SeqRRT -0.05092437$Random.eff   Var.Component  Ini.guess REML.Estimate 1          varT 0.06841615    0.07049570 2         varBR 0.03576077    0.03621596 3         varWR 0.01240137    0.01324664 4        varBRT 0.03576077    0.04563785 PE 102.2644

Pretty close to the modified script we discussed last year (given at the end). Much faster than mine.

Nelder-Mead optimization (31.6 seconds)   LogLikelihood 75.98722   Evaluations   239   Successful completion.    Component    Initial   Estimate    sigma²(T) 0.06841615 0.07049451   sigma²(bR) 0.03576077 0.03621602   sigma²(wR) 0.01240137 0.01324648  covar(bTbR) 0.05208846 0.04563728  Parameter  Estimate   sigma(T) 0.2655080      CV(T)   27.026%  sigma(bR) 0.1903051     CV(bR)   19.204%  sigma(wR) 0.1150934     CV(wR)   11.548%         PE 102.2644%

1. Fuglsang A. Professor Endrenyi’s Legacy: An Evaluation of the Regulatory Requirement “Fixed Effects, Rather Than Random Effects, Should Be Used for All Terms”. J Pharm Pharm Sci. 2021; 24: 413–20. doi:10.18433/jpps31872. Open access.
2. Fuglsang A. Mitigation of the convergence issues associated with semi-replicated bioequivalence data. Pharm Stat. 2021. Early View. doi:10.1002/pst.2142.

########################### # Section 1: Housekeeping # ########################### library(PowerTOST) library(TeachingDemos) GetData <- function(path, sep, dec) {   # read data from a CSV-file   if (missing(sep)) sep <- ","   if (!sep %in% c(",", ";", "t"))     stop("Column separator must be any of ',', ';', 't'.")   if (sep == "t") sep <- "\t"   if (missing(dec)) dec <- "."   if (!dec %in% c(".", ","))     stop("Decimal separator must be '.', or ','.")   # strip quotes and skip eventual commentary lines   D        <- read.csv(path, sep = sep, dec = dec,                        quote = "", comment.char = "#")   names(D) <- c("Subj", "Per", "Seq", "Trt", "Y")   cols     <- c("Subj", "Per", "Seq", "Trt")   D[cols]  <- lapply(D[cols], factor)   return(invisible(D)) } CreateX <- function(D) {   # first the two treatments   TrtT <- as.numeric(as.character(D$Trt) == "T") TrtR <- as.numeric(as.character(D$Trt) == "R")   X <- cbind(TrtT, TrtR)   R <- qr(X)$rank # now the periods for (p in unique(D$Per)) {     v  <- as.numeric(D$Per == p) XX <- data.frame(X, v) names(XX)[ncol(XX)] <- paste0("P", p) rnk <- qr(XX)$rank     if (rnk > R) {       X <- XX       R <- rnk     }   }   for (q in unique(D$Seq)) { v <- as.numeric(D$Seq == q)     XX <- data.frame(X, v)     names(XX)[ncol(XX)] <- paste0("Q", q)     rnk <- qr(XX)$rank if (rnk > R) { X <- XX R <- rnk } } return(as.matrix(X)) } Create.CovM <- function(Params) { # block diagonal covariance matrix varT <- Params[1] varBR <- Params[2] varWR <- Params[3] varRT <- Params[4] Nobs <- length(D$Y)   V     <- matrix(0, ncol = Nobs, nrow = Nobs)   for (iRow in 1:Nobs) {     for (iCol in 1:Nobs) {       if (iCol == iRow) { # diagonal elements         if (D$Trt[iRow] == "T") { V[iRow, iCol] <- V[iRow, iCol] + varT } else { V[iRow, iCol] <- V[iRow, iCol] + varWR + varBR } } else { # off diagonal elements if (D$Subj[iRow] == D$Subj[iCol]) { if (D$Trt[iCol] == D$Trt[iRow]) { V[iRow, iCol] <- V[iRow, iCol] + varBR } else { V[iRow, iCol] <- V[iRow, iCol] + varRT } } } } } return(V) } ############################ # Section 2: Matrix things # ############################ Obj.F12 <- function(Pars) { j <<- j + 1 int[[j]] <<- Pars # this is line 3 of page 10 of: # http://people.csail.mit.edu/xiuming/docs/tutorials/reml.pdf CovM <- Create.CovM(Pars) A <- -0.5*log(det(CovM)) # Note: NaNs can be proceded in the next line. B <- suppressWarnings(-0.5*log(det(t(X) %*% solve(CovM) %*% X))) est.b <- solve(t(X) %*% solve(CovM) %*% X) %*% t(X) %*% solve(CovM) %*% y tmp <- y - X %*% est.b C <- -0.5 *(t(tmp) %*% solve(CovM) %*% tmp) C <- as.numeric(C) # make a scalar for other solvers return(A + B + C) } Final <- function(Pars) { CovM <- Create.CovM(Pars) tmp <- solve(t(X) %*% solve(CovM) %*% X) %*% t(X) %*% solve(CovM) %*% y PE <- exp(as.numeric(tmp["TrtT", ] - tmp["TrtR", ])) return(PE) } ######################## # Section 3: Execution # ######################## Some.Initial.Guesses <- function(check = FALSE) { # design (all possible ones, error handling, full or partial) subjs <- unique(as.numeric(D$Subj))   trts  <- sort(unique(as.character(D$Trt)), decreasing = TRUE) if (!length(trts) == 2) stop("Only two treatments supported.") if (sum(!trts %in% c("T", "R")) !=0) stop("treatments must be coded as 'T' and 'R'.") seqs <- sort(unique(as.character(D$Seq)), decreasing = TRUE) # T first   type  <- paste(seqs, collapse = "|") # identifier   nseq  <- length(unique(as.character(D$Seq))) nper <- length(unique(as.numeric(D$Per)))   nsubj <- length(subjs)   if (nchar(type) == 19) {  # 4-period 4-sequence full replicate designs     if (nper != 4) stop("4 periods required in this full replicate design.")     if (nseq != 4) stop("4 sequences required in this full replicate design.")     des <- "full"   }   if (nchar(type) == 9) {  # 4-period full replicate designs     if (nper != 4) stop("4 periods required in this full replicate design.")     if (nseq != 2) stop("2 sequences required in this full replicate design.")     des <- "full"   }   if (nchar(type) == 7) {  # 3-period replicates     if (type %in% c("TRT|RTR", "TRR|RTT")) {       if (nper != 3) stop("3 periods required in this full replicate design.")       if (nseq != 2) stop("2 sequences required in this full replicate design.")       des <- "full"     }     if (type == "TRR|RTR") {       if (nper != 3) stop("3 periods required in the extra-reference design.")       if (nseq != 2) stop("2 sequences required in the extra-reference design.")       des <- "partial"     }   }   if (nchar(type) == 11) { # Balaam's design or 3-sequence partial     if (!type == "TR|RT|TT|RR") { # Balamm's       if (nper != 3) stop("3 periods required in this partial replicate design.")       if (nseq != 3) stop("3 sequences required in this partial replicate design.")       des <- "partial"     } else {                      # TRT|RTR|RRT       if (nper != 2) stop("2 periods required in Balaam's design.")       if (nseq != 4) stop("4 sequences required in Balaam's design.")       des <- "full"     }   }   if (type == "TR|RT") stop("TR|RT design not supported yet.")   gm.T <- gm.R <- numeric(nsubj) # vector of geometric means   for (subj in seq_along(subjs)) {     gm.T[subj] <- mean(log(D$Y[which(D$Subj == subj & D$Trt == "T")])) gm.R[subj] <- mean(log(D$Y[which(D$Subj == subj & D$Trt == "R")]))   }   # remove NAs   gm.T <- gm.T[!is.na(gm.T)]     gm.R <- gm.R[!is.na(gm.R)]     if (des == "partial") { # guess varT (between + within)     m    <- lm(log(Y) ~ Seq + Per, data = D[D$Trt == "T", ]) varT <- anova(m)["Residuals", "Mean Sq"] } else { # guess VarWT m <- lm(log(Y) ~ Seq + Per + Subj, data = D[D$Trt == "T", ])     varWT <- anova(m)["Residuals", "Mean Sq"]   }   # guess VarWR   m     <- lm(log(Y) ~ Seq + Per + Subj, data = D[D$Trt == "R", ]) varWR <- anova(m)["Residuals", "Mean Sq"] # guess varBR m <- lm(log(Y) ~ Seq + Per, data = D[D$Trt == "R", ])   varBR <- anova(m)["Residuals", "Mean Sq"] - varWR   # TODO: full replicates   varRT <- 0.5 * (varT + varBR)   rslt  <- c(varT, varBR, varWR, varRT)   return(rslt) } MyREML <- function(maxit = 1000, method = "Nelder-Mead") {   Ini <- Some.Initial.Guesses()   ptm <- proc.time()[[3]]   F   <- optim(par = Ini, fn = Obj.F12, method = method,                control = list(reltol = 1E-12, maxit = maxit,                fnscale = -1))   et  <- proc.time()[[3]] - ptm   Nms <- c("var_T", "var_bR", "var_wR", "covar_bTbR")   X   <- data.frame(Nms, Ini, F$par) names(X) <- c("Component", "Initial", "Estimate") if (F$convergence == 0)  msg <- "Successful completion."   if (F$convergence == 1) msg <- paste("Iteration limit", maxit, "reached.") if (F$convergence == 10) msg <- "Degeneracy of the Nelder–Mead simplex."   cat(paste0("\nNelder-Mead optimization (", signif(et, 3), " seconds)"),              "\n  LogLikelihood", F$value, "\n Evaluations ", F$counts[["function"]], "\n ", msg, "\n")   return(X) } # Data has to have five columns in this order: #   Subject (numeric), Period (numeric), Sequence (character), #   Treatment (character), Response (numeric) # Column separator 'sep' comma (default), semicolon, or t (tab) # Decimal separator 'dec' period (default) or comma path <- "https://bebac.at/downloads/ds02.csv" # EMA partial replicate D    <- res <- PE <- int <- NULL # nullify eventual previous data and results D    <- GetData(path) X    <- CreateX(D) y    <- log(D$Y) # capture intermediate results: # https://stackoverflow.com/questions/23975101/getting-more-details-from-optim-function-from-r j <- 0 int <- list() # the workhorse # optional arguments: # maxit = X to reduce the number of iterations from the default 1000 # plot = TRUE for a plot of the progress of optimization # print = TRUE to print the progress of optimization res <- MyREML() res[, 1] <- c("sigma\u00B2(T)", "sigma\u00B2(bR)", "sigma\u00B2(wR)", "covar(bTbR)") fin <- data.frame(Parameter = c("sigma(T)", "CV(T)", "sigma(bR)", "CV(bR)", "sigma(wR)", "CV(wR)", "PE"), Estimate = NA_real_) for (j in 1:3) { fin$Estimate[j * 2 - 1] <- sprintf("%.7f", sqrt(res[j, 3]))   fin$Estimate[j * 2] <- sprintf("%3.3f%%", 100 * mse2CV(res[j, 3])) } fin$Estimate[7] <- sprintf("%3.4f%%", 100 * Final(res$Estimate)) int <- data.frame(matrix(unlist(int), nrow = length(int), byrow = TRUE)) if (is.null(attr(dev.list(), "names"))) windows(width = 6.5, height = 4) # variance components lbls <- c(expression(sigma[T]^2), expression(sigma[bR]^2), expression(sigma[wR]^2), expression(italic(cov)(bTbR))) cols <- c("darkgreen", "blue", "magenta", "red") op <- par(no.readonly = TRUE) par(mar = c(3.3, 3.1, 1.6, 4.1), mgp = c(2.2, 0.5, 0), lend = 2, ljoin = 1, tcl = 0.25, cex.axis = 0.8) plot(1:nrow(int), int$X1, type = "n", ylim = range(int), log = "x",      xlab = "Iteration", ylab = "Estimate", main = "Progress of optimization",      cex.lab = 0.95, cex.main = 1, font.main = 1, axes = FALSE) grid() axis(1) axis(2, las = 1) box() ax <- axTicks(1, log = TRUE) for (j in 1:4) {   lines(1:nrow(int), int[, j], type="s", lwd = 2, col = cols[j])   shadowtext(sqrt(2), int[1, j], cex = 0.65, pos = 3,              labels = signif(int[1, j], 4), col = cols[j],              bg = "white", r = 0.2)   shadowtext(nrow(int)*0.9, tail(int[, j], 1), cex = 0.65, pos = 3,              labels = signif(tail(int[, j], 1), 4), col = cols[j],              bg = "white", r = 0.2)   mtext(lbls[j], 4, at = tail(int[, j], 1), cex = 0.8, line = 0.5, las = 1) } par(op) print(res, row.names = FALSE); cat("\n"); print(fin, row.names = FALSE)

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
ElMaestro
★★★

Denmark,
2021-08-27 22:18
(19 d 23:12 ago)

@ Helmut
Posting: # 22552
Views: 541

Hi Hötzi,

» We need the bloody confidence interval.
» As long as we can’t get the standard error of the treatment difference and its associated degrees of freedom, we are at a loss.

If we look at SAS documentation e.g. here and here, then
1. I do not understand the underlying math (just try and look into the spectral decomposition things mentioned as part of DDFM derivation; it sends my head spinning).
2. Whatever SAS does, it seems to require that we work in V through ZGZt+R in order to estimate CI's for the fixed effects.
It would be interesting to see someone who has solid understanding of matrix theory and stats to work out the equations when the between and within-variability for T cannot be separated.
However, since FDA now threw the mixed model out for semi-replicate designs, interesting may in reality just mean academically attractive but possible not too useful at this moment. But who knows - until someone does it and explores it, it is not known if there is an advantageous property hidden somewhere.

Pass or fail!
ElMaestro
Helmut
★★★

Vienna, Austria,
2021-08-27 23:54
(19 d 21:36 ago)

@ ElMaestro
Posting: # 22553
Views: 542

## Misunderstanding?

Hi ElMaestro,

» If we look at SAS documentation e.g. here and here, then
» 1. I do not understand the underlying math (just try and look into the spectral decomposition things mentioned as part of DDFM derivation; it sends my head spinning).

Wooaah!!

» 2. Whatever SAS does, it seems to require that we work in V through ZGZt+R in order to estimate CI's for the fixed effects.

Seems so.

» It would be interesting to see someone who has solid understanding of matrix theory and stats to work out the equations when the between and within-variability for T cannot be separated.

Definitely not me…

» However, since FDA now threw the mixed model out for semi-replicate designs, …

What on earth gives you this impression? I’m talking about ABE (Stats guidance Appendix E, ANDA guidance page 29):

The following codes are an example of the determination of unscaled average BE for LAUCT with a partially replicate 3-way BE design:

PROC MIXED   data=pk;   CLASSES SEQ SUBJ PER TRT;   MODEL LAUCT = SEQ PER TRT/ DDFM=SATTERTH;   RANDOM TRT/TYPE=FA0(2) SUB=SUBJ G;   REPEATED/GRP=TRT SUB=SUBJ;   ESTIMATE 'T vs. R' TRT 1 -1/CL ALPHA=0.1;   ods output Estimates=unsc1;   title1 'unscaled BE 90% CI - guidance version';   title2 'AUCt'; run; data unsc1;   set unsc1;   unscabe_lower=exp(lower);   unscabe_upper=exp(upper); run;

If reference-scaling is acceptable (right branch), even a spreadsheet (‼) would do.

» … interesting may in reality just mean academically attractive but possible not too useful at this moment.

Nope. Talk to John about his data sets which didn’t converge. Given, FA0(1) did – with a warning – but I think there is a data set posted at Certara’s forum where nothing worked. Study done, no result. THX, .

» But who knows - until someone does it and explores it, it is not known if there is an advantageous property hidden somewhere.

Might be.

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
ElMaestro
★★★

Denmark,
2021-08-28 10:00
(19 d 11:30 ago)

@ Helmut
Posting: # 22554
Views: 505

## Misunderstanding?

Hi Hötzi,

» » However, since FDA now threw the mixed model out for semi-replicate designs, …
»
» What on earth gives you this impression? I’m talking about ABE (Stats guidance Appendix E, ANDA guidance page 29):

Wow, that is actually right.
Page 27: "PROC GLM should be used for partially replicate (3-way) BE studies"
Page 29: "For PK parameters with a sWR < 0.294, use the unscaled average BE approach." (which implies PROC MIXED in their code)

So, we must start to work out sWR using the equations. Once we got it, we know whether we need a mixed model or not for the evaluation of BE. We may need the mixed model for one metric, like AUCt and AUCinf, and the equations (ref scaled ABE) for another, like Cmax.
I guess this is also the gist of the decision tree you gave. I wonder if this is truly their intention. The mixed model therefore seems to have been saved by the bell. This really emphasizes the need for a CI solution that allows V estimated without the existence of Z and R separately.

I am curious now - we should make a little study on the type I error in case of missing ref values.

Pass or fail!
ElMaestro
Helmut
★★★

Vienna, Austria,
2021-08-28 10:38
(19 d 10:52 ago)

@ ElMaestro
Posting: # 22555
Views: 491

## Here we are

Hi ElMaestro,

» » What on earth gives you this impression? I’m talking about ABE (Stats guidance Appendix E, ANDA guidance page 29):
»
» Wow, that is actually right.
» Page 27: "PROC GLM should be used for partially replicate (3-way) BE studies"

And Page 32:
• PROC MIXED should be used for fully replicate (4-way) BE studies
Only the knows why.

» Page 29: "For PK parameters with a sWR < 0.294, use the unscaled average BE approach." (which implies PROC MIXED in their code)

Now you got it. Some stuff there.

» So, we must start to work out sWR using the equations.

Yep. At least that’s easy.

» Once we got it, we know whether we need a mixed model or not for the evaluation of BE. We may need the mixed model for one metric, like AUCt and AUCinf, and the equations (ref scaled ABE) for another, like Cmax.

Correct.

» I guess this is also the gist of the decision tree you gave. I wonder if this is truly their intention.

That’s the interpretation of all authors who assessed the Type I Error. Sent you Mehl.

» The mixed model therefore seems to have been saved by the bell.

🥊

» This really emphasizes the need for a CI solution that allows V estimated without the existence of Z and R separately.

Happy that finally I conveyed the message.

» I am curious now - we should make a little study on the type I error in case of missing ref values.

Oh dear, reaching for the stars! So far we have no clue how to come up with a solution for complete data. Appetizer.

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes