More answers [General Statistics]

posted by Helmut – Vienna, Austria, 2019-11-18 14:09 – Posting: # 20821

Hi Victor,

» » Edit: after a quick experiment (click here to see screenshot), it seems that the “\(\mathcal{A}\)” I used was a UTF-8 character after all? ⊙.☉

Correct, and the mystery is resolved.
bin 11110000 10011101 10010011 10010000
hex F09D9390
When I paste the character, it is shown in the preview but not in the post. The field of the database table uses utf8_general_ci, which supports only characters of up to 3 bytes, whereas yours has 4. That’s it. In the development forum (in German, sorry) we realized that changing it to utf8mb4_general_ci alone (be it for the field, the table, or the entire DB) cannot resolve the issue. It also requires rewriting all parts of the scripts handling the PHP/MySQL connection. Not easy and not my top priority.
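
For the curious, the byte pattern above can be decoded by hand. A minimal sketch in base R, assuming the four bytes shown are the complete UTF-8 sequence of the character; the variable names are mine and have nothing to do with the forum scripts.

```r
# The four bytes from the post (hex F0 9D 93 90)
bytes <- as.integer(as.raw(c(0xF0, 0x9D, 0x93, 0x90)))

# Layout of a 4-byte UTF-8 sequence: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
# Strip the marker bits and concatenate the payload bits (3 + 6 + 6 + 6)
code_point <- bitwShiftL(bitwAnd(bytes[1], 0x07L), 18L) +
              bitwShiftL(bitwAnd(bytes[2], 0x3FL), 12L) +
              bitwShiftL(bitwAnd(bytes[3], 0x3FL),  6L) +
              bitwAnd(bytes[4], 0x3FL)

# Three bytes of UTF-8 can encode at most U+FFFF
needs_four_bytes <- code_point > 0xFFFF

cat(sprintf("U+%X, needs 4 bytes: %s\n", code_point, needs_four_bytes))
# U+1D4D0, needs 4 bytes: TRUE
```

The code point lies above U+FFFF (in the block of mathematical alphanumeric symbols), which is exactly the range a 3-byte encoding cannot reach.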

» I'm blown away by your amazing answers (and reply speed)!

My pleasure.
» Since it will take me time to properly digest and reply all the details you just shared (especially on IUT, because IUT is new to me, and I think IUT might be the missing link I was searching for to deal with dependencies in multiple hypothesis testing), …

It’s not that complicated. Let’s explore the plot:
We have three tests. The areas give their type I errors. Since we perform all at the same level, the areas are identical. The Euclidean distance between the centers gives the correlation of the PK metrics (here they are identical as well). The FWER is given by the area of the intersection, which in any case will be ≤ the nominal α.

[image] In reality the correlation of AUC0–∞ (green) with AUC0–t (blue) is higher than the correlation of both with Cmax (red). If we tested only the AUCs, the FWER would again be given by the intersection, which is clearly lower than the individual type I errors. If we add Cmax, the FWER decreases further.

Unfortunately the correlations are unknown. See this post for an example with two metrics and a reference how to deal with comparisons of multiple PK metrics.
In PowerTOST two simultaneous tests are implemented. Say we have a CV of 0.25 for both, a sample size of 28, and the highest possible correlation of 1.

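library(PowerTOST) # provides power.TOST() and power.2TOST()
# TIE of one test (true ratio at the upper BE limit)
power.TOST(CV = 0.25, n = 28, theta0 = 1.25)
# [1] 0.04999963
# TIE of two simultaneous tests (both true ratios at the upper BE limit)
power.2TOST(CV = c(0.25, 0.25), n = 28, theta0 = c(1.25, 1.25),
            rho = 1, nsims = 1e6)
# [1] 0.049928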

Close match (based on simulations) and no inflation of the TIE. If ρ is lower, the test will get substantially more conservative.
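
To get a feeling for the effect of ρ without PowerTOST, here is a deliberately simplified sketch in base R. It is not TOST: I assume two one-sided z-tests whose statistics are bivariate normal with correlation ρ, with both true values sitting exactly at the border of the null hypothesis. The function name iut_tie and the chosen values of ρ are made up for illustration.

```r
set.seed(123456)            # arbitrary seed, for reproducibility only
alpha <- 0.05               # nominal level of each single test
nsims <- 1e5                # number of simulated studies
crit  <- qnorm(1 - alpha)   # one-sided critical value

# Empirical type I error of the intersection-union test:
# the IUT rejects only if BOTH single tests reject.
iut_tie <- function(rho) {
  z1 <- rnorm(nsims)
  # z2 is standard normal and has correlation rho with z1
  z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(nsims)
  mean(z1 > crit & z2 > crit)
}

rhos <- c(0, 0.5, 0.9, 1)
tie  <- sapply(rhos, iut_tie)
print(data.frame(rho = rhos, TIE = tie))
```

With ρ = 1 both tests always agree and the result should sit close to α; with ρ = 0 one expects roughly α². Everything in between is below α, so no adjustment of the level is needed.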

» » Tmax is poor terminology. “T” denotes the absolute temperature in Kelvin whereas “t” stands for time. Hence, tmax should be used. ;-)
» I initially chose to use Tmax because I'm used to writing random variables in uppercase, …

Yep, that’s fine. Unfortunately “Tmax” is used in some regulatory documents… When I review papers, I have a standard rant pointing that difference out followed by “regulations ≠ science”.

» » The distribution strongly depends on the study’s sampling schedule but is definitely discrete. […] A log-transformation for discrete distributions is simply not allowed.
» Does it mean that the Nyquist criterion is not satisfied in pharmacokinetics (in terms of sampling interval)?

Dunno. Sorry.

» Couldn't we just design our study’s sampling schedule to ensure that Nyquist criterion is satisfied so that we can perfectly reconstruct the original continuous-time function from the samples?

That’s wishful thinking. People tried a lot with D-optimal designs in PK. What I do is try to set up a model and explore different sampling schemes based on the variance inflation factor. Regrettably, it rarely works. Modeling absorption is an art rather than science – especially with lag-times (delayed onset of absorption due to gastric emptying, gastric-resistant coatings, and so on). What if the model suggests drawing fifty blood samples (to “catch” Cmax/tmax in the majority of subjects) and you are limited to twenty? It’s always a compromise.

» […] I didn't dive in too deep to check if the moment-generating function of a log-normal distribution "matches" the (concentration-time profile?) model used in pharmacokinetics (because I'm still learning about the models used in pharmacokinetics), so it was more of a "visually looks a lot like a log-normal distribution" on my part :p

[image] Don’t fall into the trap of visual similarity. The fact that a concentration profile after an oral dose looks similar to a log-normal distribution is a mere coincidence. In a two-compartment model (right example: distribution is three times faster than elimination) the center of gravity is outside the profile; my “whizz wheel proof” would not work any more. Or, even more extreme, an intravenous dose…

Of note, the mean residence time \(MRT=\frac{AUMC}{AUC}\) is nice because we can compare different models (say, a one- with a two-compartment model). Independent of the model, ~⅔ of the drug is eliminated after one MRT. For years I have tried to educate clinicians to abandon half lives (which are less informative), but old beliefs die hard (see there, slides 24–28).
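
A quick numerical check of the ~⅔ statement, as a sketch only: I assume the simplest case, a one-compartment model after an intravenous bolus, with made-up values for C0 and k. It demonstrates the calculation of MRT from AUMC and AUC but says nothing about other models.

```r
# Hypothetical one-compartment IV bolus: C(t) = C0 * exp(-k * t)
C0 <- 100   # initial concentration (arbitrary units), assumed
k  <- 0.1   # elimination rate constant (1/h), assumed

conc <- function(t) C0 * exp(-k * t)

# AUC = integral of C(t), AUMC = integral of t * C(t), both from 0 to infinity
AUC  <- integrate(conc, lower = 0, upper = Inf)$value
AUMC <- integrate(function(t) t * conc(t), lower = 0, upper = Inf)$value

MRT       <- AUMC / AUC      # equals 1 / k in this model
half_life <- log(2) / k

# Fraction of the dose eliminated after one MRT and after one half life
frac_elim_MRT  <- 1 - conc(MRT) / C0
frac_elim_half <- 1 - conc(half_life) / C0

cat(sprintf("MRT = %.2f h, eliminated after one MRT: %.1f%%\n",
            MRT, 100 * frac_elim_MRT))
cat(sprintf("t1/2 = %.2f h, eliminated after one t1/2: %.1f%%\n",
            half_life, 100 * frac_elim_half))
```

In this model the fraction eliminated after one MRT is 1 − e⁻¹ (about 63%), compared with 50% after one half life.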

If you want to dive into the Kullback–Leibler divergence, note that any two distributions can be compared.

The fact that we log-transform AUC and Cmax in BE has three reasons:
  1. The starting point: We need additive effects in the model.
  2. Empirical evidence that both PK metrics are skewed to the right. However, as said before, only the model’s residuals have to be approximately normal. Hence, in principle any transformation pulling the right tail towards the median would do the job.
  3. The theoretical justification: The basic equation of PK is \(AUC=\frac{f\cdot D}{CL}\). We are interested in comparing the relative bioavailability of Test (\(f_T\)) with that of Reference (\(f_R\)) as \(\frac{f_T}{f_R}\). This gives a set of two equations with two knowns (\(AUC_T,AUC_R\)) and four unknowns \((D_T,D_R,CL_T,CL_R)\). Since in most jurisdictions a correction for actual content is not allowed, we have to assume \(D_T=D_R\). Clearance is a property of the drug, not the formulation, and therefore we further assume \(CL_T=CL_R\). Then we cancel these four variables out of the set of equations and finally obtain \(\frac{f_T}{f_R}=\frac{AUC_T}{AUC_R}\). Mission accomplished, but it comes at a price. If we face between-occasion variability of clearance, it goes straight into the residual variance. That’s the reason why we need large sample sizes for highly variable drugs – even if the absorption (the property of the formulation) is similar and shows low variability.
You see that the original model is multiplicative (based on ratios) and with the log-transformation we get the additive one we need. Hence, the selection of the transformation is not arbitrary.
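
Written out, the cancellation in point 3 and the step to the log scale look as follows. Only the symbols already used above appear; nothing new is assumed.

```latex
\begin{align}
AUC_T &= \frac{f_T\cdot D_T}{CL_T}, \qquad AUC_R = \frac{f_R\cdot D_R}{CL_R}\\
\frac{f_T}{f_R} &= \frac{AUC_T\cdot CL_T\cdot D_R}{AUC_R\cdot CL_R\cdot D_T}
  = \frac{AUC_T}{AUC_R} \quad \text{if } D_T=D_R \text{ and } CL_T=CL_R\\
\log\frac{f_T}{f_R} &= \log AUC_T-\log AUC_R
\end{align}
```

The last line is the additive model: the difference of the log-transformed AUCs estimates the log of the ratio we are interested in.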

Not sure what the current state of affairs is, but in the past the Malaysian authority preferred the log10-transformation over loge. Not a particularly good idea since PK is based on exponential functions. Furthermore, it makes our life miserable. The coefficient of variation based on the residual error of log-transformed data is given as \(CV=\sqrt{\textrm{e}^{MSE}-1}\). That’s the only one you find in textbooks and guidelines. If one used a log10-transformation, the appropriate formula is \(CV=\sqrt{10^{\textrm{log}_{e}(10)\cdot MSE}-1}\). I have seen wrong sample size estimations where the former was used instead of the latter.
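
How much the wrong formula matters can be checked in a few lines of base R. This is a sketch under one assumption of mine: a true CV of 0.25, from which the MSE on both log scales is back-calculated by inverting the first formula. The variable names are illustrative.

```r
cv_true <- 0.25                    # assumed true CV, for illustration only

# MSE on the natural-log scale (inverse of CV = sqrt(exp(MSE) - 1))
mse_e  <- log(cv_true^2 + 1)

# The same data analysed after log10-transformation:
# log10(x) = log(x) / log(10), hence variances scale with 1 / log(10)^2
mse_10 <- mse_e / log(10)^2

# Appropriate formula for an MSE from log10-transformed data
cv_right <- sqrt(10^(log(10) * mse_10) - 1)

# Textbook formula applied by mistake to the log10-based MSE
cv_wrong <- sqrt(exp(mse_10) - 1)

print(c(right = cv_right, wrong = cv_wrong))
```

The appropriate formula recovers the assumed CV, whereas the textbook formula gives a much smaller value. A sample size based on the latter would be far too low.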

» » Les Benet once stated (too lazy to search where and when) that for a reliable estimate of AUC one has to sample in such a way that the extrapolated fraction is 5–20%. For AUMC one would need 1–5%.
» I'm familiar with Taylor Series Approximations, Fourier series approximation, etc. but never really thought about how the error terms of a moment-generating function is controlled if I use, for example, a log-normal distribution to describe the concentration-time profile

Forget about that.

» […] I kinda prefer spending my time learning about IUT instead of thinking about this

Good luck. Not that difficult.

» … so I was wondering if you have any good reference materials for learning about how to correctly use the moment-generating function to model a distribution in real-life? (e.g. concentration-time profile)

(1) No idea and (2) the approach of seeing a PK profile as a distribution is pretty exotic. :-D

Dif-tor heh smusma 🖖
Helmut Schütz

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz