Bioequivalence and Bioavailability Forum

Validation of nonlinear mixed-effects software [Software]

posted by Helmut – Vienna, Austria, 2015-12-30 20:56 – Posting: # 15785

Dear all,

Inspired by this thread, some comments.
Nonlinear mixed-effects models are complicated. Really! Essentially, the optimizer in the software tries to find the global minimum of the objective function (OF). Once the minimum is found, its coordinates are the “best” parameter estimates. Sounds easy. But: The OF is a hypersurface in an (n+1)-dimensional space (n = number of parameters in the model). To complicate things, the coordinates can be continuous (PK/PD parameters, covariates like weight, age, …), discrete (time to event, …), categorical (disease state, …), or even dichotomous (sex, …). Closed-form solutions are possible only for some basic PK/PD models. In all other cases partial derivatives have to be approximated numerically. Many clever algorithms exist: Parametric (Gauß-Newton, Newton-Raphson, Levenberg-Hartley, …), semiparametric (grid/simplex searches like Nelder-Mead), or nonparametric (Markov chain Monte Carlo, …).
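To make the idea of an OF and its numerically approximated partial derivatives concrete, here is a minimal sketch. Everything in it is an assumption for illustration only: a one-compartment i.v. bolus model, an ordinary least-squares OF, made-up values for dose, CL, and V, and central differences for the gradient. Real NLME software uses a likelihood-based OF and far more refined numerics.

```python
import math

DOSE = 100.0  # hypothetical dose


def conc(t, cl, v):
    """One-compartment i.v. bolus model: C(t) = D/V * exp(-CL/V * t)."""
    return DOSE / v * math.exp(-cl / v * t)


def objective(params, times, observed):
    """Sum of squared residuals (a stand-in for the OF of real software)."""
    cl, v = params
    return sum((c - conc(t, cl, v)) ** 2 for t, c in zip(times, observed))


def numeric_gradient(f, params, rel_step=1e-6):
    """Approximate the partial derivatives of f by central differences."""
    grad = []
    for i, p in enumerate(params):
        h = rel_step * max(abs(p), 1.0)
        up = list(params)
        down = list(params)
        up[i] = p + h
        down[i] = p - h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


times = [0.5, 1, 2, 4, 8, 12, 24]
true_params = [5.0, 50.0]  # hypothetical CL and V
observed = [conc(t, *true_params) for t in times]  # error-free "data"


def of(params):
    return objective(params, times, observed)


grad_at_truth = numeric_gradient(of, true_params)  # essentially zero
grad_off = numeric_gradient(of, [8.0, 50.0])       # points away from the minimum
```

At the true parameters the OF is zero and the gradient vanishes. Away from them the gradient tells the optimizer in which direction to move.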

There are many traps set for our poor optimizer:

The hypersurface is “flat”. There might be even a gradient towards a minimum but it is below the numeric resolution of the machine. The optimizer wanders around for a while (or hours) only to come to the conclusion that a minimum doesn’t exist. That’s the nasty no-convergence situation we all hate.
Several minima exist which cannot be resolved at the numeric precision. Rare, but especially nasty. Which one to pick?
Happened to me once. After months (!) of work my final result was this: Depending on starting values I got three solutions which were practically indistinguishable (p-values close to 0.9, AICs differed at the fourth digit). In technical terms, the model was not robust. The sponsor took it with good humor in the tradition of Edison:
I have not failed 700 times. I have not failed once.
I have succeeded in proving that those 700 ways will not work.
If we are lucky we can rule some solutions out because their estimates are physiologically not plausible.
With the given starting values a local minimum instead of the global one is reached. Our aim is the black hole in the center of the Milky Way, only to find our spaceship sucked into the gravity well of Betelgeuse. See the wonderful example of the NIST.
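A textbook case of two solutions which the data cannot tell apart is the “flip-flop” of the one-compartment model with first-order absorption: swapping ka and ke and rescaling the volume gives exactly the same curve. The sketch below only illustrates that general point (it is not the case described above). The parameter values are invented and ƒ is assumed to be 1.

```python
import math


def bateman(t, ka, ke, v, dose=100.0):
    """One-compartment model, first-order absorption and elimination (f assumed 1)."""
    return dose * ka / (v * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))


times = [0.25, 0.5, 1, 2, 4, 8, 12, 24]

# Hypothetical "true" parameters: fast absorption, slow elimination.
ka, ke, v = 1.5, 0.1, 40.0

# Flip-flop alternative: rate constants swapped, volume rescaled by ke/ka.
flipped = (ke, ka, v * ke / ka)

curve_a = [bateman(t, ka, ke, v) for t in times]
curve_b = [bateman(t, *flipped) for t in times]

# Largest difference between the two predicted profiles (rounding error only).
max_diff = max(abs(a - b) for a, b in zip(curve_a, curve_b))
```

Both parameter sets give the same OF. Only outside knowledge (here: a volume of distribution of less than 3 L would be hard to defend) lets us rule one of them out.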

There are many strategies to overcome these obstacles.

If you write your own models as differential equations (easy), try to solve them (difficult). Ask someone familiar with Laplace transforms for help. An explicit formula is always numerically more stable than the approximation by differences.
Support the optimizer with best-guess estimates as starting values. NCA comes in handy. Don’t plug in V/ƒ and CL/ƒ of an extravascular administration from NCA. That’s meaningless since ƒ is unknown. Check the literature for ƒ.
Start with a simplex search and feed the estimates to a parametric engine.
In PopPK/PD start with naïve pooling or FO. Estimates are biased but give reasonable initial estimates for other algos.
There are many algos in the wild. Start with First Order Conditional Estimates (FOCE), then maybe FO Extended Least Squares (FOELS) or FO Lindstrom-Bates. If you know what you are doing (!) proceed with fancy stuff like Quasi-Random Parametric Expectation Maximization (QRPEM) or Iterated 2-Stage Expectation Maximization (IT2S-EM).
Watch out for local minima. Keep one parameter fixed and try a set of initial estimates of the others. Do you end at the same final estimate(s)?
Pay close attention to residual plots. Any pattern/trend? Funnel shape (a sign of heteroscedasticity)? Play around with weighting schemes. There are a lot. Additive, multiplicative, a mixture of both, power functions. Adds another dimension of complications. You can base the weighting on observed values (C) or on predicted ones (Ĉ). The latter is called iteratively reweighted least squares (IRLS). Useful if you trust the model more than the data. If you are courageous write your own weighting function.
How will you deal with concentrations <LLOQ? No, ignoring them is not a good idea. If your software supports that, include these values in the model. Essentially you are telling the optimizer: “This sample is post-dose so I’m pretty confident that something is in there. I could not measure it. I only know that it is >0 and <LLOQ. Please take that into account. THX.”
How will you deal with missing values? Some software packages distinguish between providing no result for a time point and leaving the entire row out.
Use meaningful parameter constraints. No, volumes of distribution and clearances can’t be negative. Not even zero. If you have adults in the study consider setting the lower constraint for body weight to ~40 kg and the range for age to ~18–100 a.
How will you introduce the random effects (covariates) into the model? Proportional, additive, mixture? Random effects on which fixed effects? Body weight on volume(s) of distribution and age on clearance are safe bets.
Many strategies exist for introducing random effects. At the end of the day, keep it as simple as possible but as complicated as necessary (Einstein).
Does it make sense to center body weight at the x̃ of volunteers in the study? 70 kg for males might give a worse fit, but in the end you want to extrapolate to the population, right?
Compare models by any of: minimum AIC, minimum BIC, or F-tests. Restrict yourself to one criterion and choose it beforehand. Don’t cherry-pick the one which gives the result you like best.
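Why restrict yourself to one criterion? Because they can disagree. A minimal sketch with invented log-likelihoods and parameter counts, using the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. Taking n as the number of observations is an assumption here; in mixed-effects models the proper choice of n is debatable.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


n_obs = 60  # hypothetical number of observations

# Hypothetical fits: (maximized log-likelihood, number of estimated parameters)
models = {
    "simple": (-100.0, 3),
    "complex": (-97.5, 5),
}

# The model with the smallest criterion value is preferred.
best_by_aic = min(models, key=lambda m: aic(*models[m]))
best_by_bic = min(models, key=lambda m: bic(*models[m], n_obs))
```

With these numbers the AIC prefers the complex model (205 vs. 206), whereas the BIC prefers the simple one (≈212.3 vs. ≈215.5). Whoever decides on the criterion only after seeing both has two chances to get the model he wanted.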

After sleepless nights, Schützomycin, and a lot of trial and error you get something you are happy with. But wait! This is not a hobby project; you have to submit the stuff to an agency. Are there some guidelines to observe? Oops.

FDA (1999)
EMA (2007)

While reading these gems you promise yourself that the next time you will read them before you start modeling. So much to do: Visual predictive checks, bootstrapping, robustness, training and validation sets, internal and external validation, … You feel dizzy. Time for another pill.
OK. You slept on it. Learning curve. Next time you will do better. You read the GLs once more. Do they speak about cross-validation (i.e., comparing results obtained by different software)? Not a single word. That’s interesting (and related to the thread I mentioned in the beginning).
In 2005 Pascal Girard and France Mentré had a splendid idea. Generate data sets from a known PK model (one-compartment with first-order absorption), and spice the data with error from a known distribution. Send the data sets to (very!) experienced pharmacometricians who would try to come up with estimates in different software. Starting values for fixed effects were provided. The outcome was surprising. Some participants would have used a different PK model. In this case they were provided with the true one. See the original presentation¹ and have a look at the bias of estimates.
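The principle of such an exercise (simulate from a known model, estimate, compare with the truth) can be sketched in a few lines. All numbers, the log-normal residual error, and the function names are assumptions for illustration; they are not the settings of Girard and Mentré. The estimation step itself is left out since that is the job of the NLME software under test.

```python
import math
import random


def bateman(t, ka, ke, v, dose=100.0):
    """One-compartment model with first-order absorption (f assumed 1)."""
    return dose * ka / (v * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))


def simulate_profile(rng, times, ka, ke, v, sigma):
    """Model prediction times a log-normal residual error with SD sigma."""
    return [bateman(t, ka, ke, v) * math.exp(rng.gauss(0.0, sigma)) for t in times]


def relative_bias(estimates, truth):
    """Bias of the mean estimate in percent of the true value."""
    mean_est = sum(estimates) / len(estimates)
    return 100.0 * (mean_est - truth) / truth


times = [0.5, 1, 2, 4, 8, 12, 24]
rng = random.Random(20051630)  # fixed seed keeps the data set reproducible
profile = simulate_profile(rng, times, ka=1.5, ke=0.1, v=40.0, sigma=0.1)

# Hypothetical estimates of one parameter reported for four data sets
# (true value 10): the mean is 10.5, i.e., a bias of +5%.
bias = relative_bias([9.0, 10.0, 11.0, 12.0], truth=10.0)
```

Since the truth is known by construction, any bias can be attributed to the estimation method (or its user) and not to the data.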

Given that, I’m extremely skeptical whether it is possible to reproduce NLME-results of software A in software B.

The title of the first chapter of Peter Bonate’s excellent textbook² is “The Art of Modeling”. All too true!

1. Girard P, Mentré F. A comparison of estimation methods in nonlinear mixed effects models using a blind analysis. PAGE Meeting: Pamplona, Spain (16–17 June 2005).
2. Bonate PL. Pharmacokinetic-Pharmacodynamic Modeling and Simulation. 2nd ed. New York: Springer; 2011. doi:10.1007/978-1-4419-9485-1.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

