Bioequivalence and Bioavailability Forum • EMA: New product-specific guidances

Helmut
★★★

Vienna, Austria,
2022-04-08 17:17
(1185 d 20:29 ago)

Posting: # 22918
Views: 14,212

EMA: New product-specific guidances [BE/BA News]

Dear all,

on April 04 the EMA published revised draft product-specific guidances for ibuprofen, paracetamol, and tadalafil. In the footnote on page 1 we find:

* This revision concerns defining what is meant by ‘comparable’ T_max as an additional main pharmacokinetic variable in the bioequivalence assessment section of the guideline.

Then:

Bioequivalence assessment: Comparable median (≤ 20% difference) and range for T_max.

See this article for why I consider this invention crap.
In short: t_max follows a discrete distribution on an ordinal scale. Calculating the ratio of values is a questionable procedure (strictly speaking not an allowed operation at all: Only addition, subtraction, ranking are).
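To make the discreteness concrete, here is a minimal sketch. The 15-minute sampling grid and the reference median of 1 h are assumptions for illustration only; neither comes from the guidance.

```r
# Hypothetical sampling grid (hours) and reference median; both are assumptions
grid  <- seq(0.25, 2.5, by = 0.25)
med.R <- 1.0
# Every observed t_max lies on the grid, so the percentage difference
# from the reference can only take a handful of discrete values
pct.diff <- 100 * (grid - med.R) / med.R
print(data.frame(tmax = grid, pct.diff = pct.diff))
# Percentage difference one sampling step below and above the reference
one.step <- pct.diff[grid %in% c(0.75, 1.25)]
print(one.step)
```

With an even number of subjects the median can also fall halfway between two grid points, but the set of attainable percentage differences stays small and depends entirely on the schedule.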

[image]

End of consultation 31 July 2022.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

ElMaestro
★★★

Denmark,
2022-04-09 13:45
(1185 d 00:01 ago)

@ Helmut
Posting: # 22919
Views: 12,516

How would you implement it?


Hi Helmut and all,

I was a little afraid of this. Thanks a lot for posting.

❝ Comparable median (≤ 20% difference) and range for T_max

I am confused. I can see the point in regulating the matter, but I feel there is a lot that's left to be answered.

But let me ask all of you.

Question 1
Do you read this as: You need to be comparable (20% diff) for the median AND for the range? (i.e. is there also a 20% difference requirement for the range???)

❝ Calculating the ratio of values is a questionable procedure.

Question 2:
Whether we like it or not, we have to find a way forward. And this has a lot of degrees of freedom. In a nonparametric universe where we try to resolve it, there could be all sorts of debate re. Hodges–Lehmann, Kruskal–Wallis, Wilcoxon test, confidence levels, bootstrapping, pairing and what not.

So, kindly allow me to throw this on the table:

Imagine we have these Tmax levels for T and R in a BE trial, units could be hours.

Tmax.T=c(2.0, 2.5, 2.75, 2.5, 2.0, 2.5, 1.5, 2.0)

Tmax.R=c(4.0, 3.0, 3.5, 3.75, 1.5, 3.5, 5.5, 4.5)

Let us implement something that has the look and feel of a test along the lines of what regulators want.

Test1=wilcox.test(Tmax.T,  alt ="two.sided", conf.int = T, correct=T, conf.level=.90)

Test1$conf.int;

I am getting a CI of 1.999929-2.500036 (2 to 2.5).
We can compare this with
median(Tmax.R)*c(0.8, 1.2)
The test fails. Would that be a way to go?

No? then how about:

Test2=wilcox.test(Tmax.T/Tmax.R,  alt ="two.sided", conf.int = T, correct=T, conf.level=.90)

Test2

Test2$conf.int;

I am getting 0.47-0.92. Not within 0.8 to 1.25, test fails.

Shoot me.
How would you prefer to implement the comparability exercise for Tmax? (I am not so much interested in your thoughts on alpha/confidence level, exact T or F, etc. I am mainly interested in a way to make the comparison itself, so please make me happy and focus on that :rotfl:).

Mind you, the data above might be paired :-D … or it might not, depending on whether it came from an XO or not. This adds complexity, all depending on the implementation.
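For illustration, the paired and unpaired readings of the same eight values can be put side by side in base R. This is only a sketch of the two nonparametric options on the difference scale, not a proposal for what regulators expect; the object names `xo` and `pg` are made up here.

```r
Tmax.T <- c(2.0, 2.5, 2.75, 2.5, 2.0, 2.5, 1.5, 2.0)
Tmax.R <- c(4.0, 3.0, 3.5, 3.75, 1.5, 3.5, 5.5, 4.5)
# Crossover reading: within-subject differences T - R (Wilcoxon signed-rank).
# Ties make the exact method unavailable in wilcox.test(); the warnings about
# that are suppressed and the normal approximation is used instead.
xo <- suppressWarnings(
  wilcox.test(Tmax.T, Tmax.R, paired = TRUE,
              conf.int = TRUE, conf.level = 0.90))
# Parallel-group reading: independent samples (Mann-Whitney U)
pg <- suppressWarnings(
  wilcox.test(Tmax.T, Tmax.R, paired = FALSE,
              conf.int = TRUE, conf.level = 0.90))
# Hodges-Lehmann type estimates of the shift (hours) with 90% CIs
res <- rbind(crossover = c(unname(xo$estimate), xo$conf.int),
             parallel  = c(unname(pg$estimate), pg$conf.int))
colnames(res) <- c("estimate", "lower", "upper")
print(round(res, 3))
```

Both intervals are in hours; turning them into a "20%" statement still requires a decision on what the percentage refers to.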

And, question 3, if the comparability thing also applies to range, how to implement that?

And question 4, sample size calculation is going to get messy for these products, if we have to factor in comparability of Tmax at the 20% level. I am not outright saying I have a bad taste in my mouse, but I am leaning towards thinking this could easily translate into a complete showstopper for sponsors developing the products. What's your gut feeling?

At the end of the day answers to Q1-Q4 above hinge not only on what you think is the right thing to do; of equal importance is what you think regulators will accept. :-)

—
Pass or fail!
ElMaestro

Helmut
★★★

Vienna, Austria,
2022-04-09 20:40
(1184 d 17:06 ago)

@ ElMaestro
Posting: # 22920
Views: 12,635

Confuse a Cat Inc.


Capt’n, my capt’n!

❝ ❝ Comparable median (≤ 20% difference) and range for T_max

❝ Do you read this as: You need to be comparable (20% diff) for the median AND for the range? (i.e. is there also a 20% difference requirement for the range???)

Oh dear, I missed that! The range has a breakdown point of 0 – even if all values are identical except one of them, this single value will change the range. On the other hand, if you have two ’contaminations’ on opposite sides, the range will be the same.

R script for simulations acc. to my experiences with ibuprofen at the end. Gives:

             Min. 1st Qu.  Median 3rd Qu.    Max. Range
        T 1.00000 1.33333 1.83333 2.16667 2.50000   1.5
        R 0.83333 1.29167 1.58333 2.16667 2.33333   1.5
    T - R 0.16667 0.04167 0.25000 0.00000 0.16667   0.0
    T / R 1.20000 1.03226 1.15789 1.00000 1.07143   1.0

Close shave for the median (+16%). Here no problems with the range because we have two contaminations.
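The breakdown point can be shown in a few lines. The values below are hypothetical (16 identical t_max values of 1.5 h with a single value replaced); they are not taken from any study.

```r
x <- rep(1.5, 16)   # hypothetical t_max values, all identical
y <- x
y[1] <- 3           # a single 'contamination'
range.x <- diff(range(x))
range.y <- diff(range(y))
print(c(range.x = range.x, range.y = range.y))
print(c(median.x = median(x), median.y = median(y)))
```

One value out of sixteen moves the range from 0 to 1.5 h while the median stays where it was.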
A goody: In the script, replace the lines

    x <- rtruncnorm(n = n, a = a, b = b, mean = mu, sd = mu * CV)
    T <- R <- roundClosest(x, spl)
    T[T == min(T)] <- max(T)
    R[R == max(R)] <- min(R)

with

    R <- roundClosest(rtruncnorm(n = n, a = a, b = b, mean = mu, sd = mu * CV), spl)
    T <- R + 20 / 60

to delay T by 20 minutes.

             Min. 1st Qu.  Median 3rd Qu.    Max.   Range
        T 1.16667 1.66667 2.00000 2.50000 2.83333 1.66667
        R 0.83333 1.33333 1.66667 2.16667 2.50000 1.66667
    T - R 0.33333 0.33333 0.33333 0.33333 0.33333 0.00000
    T / R 1.40000 1.25000 1.20000 1.15385 1.13333 1.00000

❝ ❝ Calculating the ratio of values is a questionable procedure.

❝

❝ Whether we like it or not, we have to find a way forward. And this has a lot of degrees of freedom.

I’ve read a lot in the meantime. Still not sure whether it is allowed at all (‼) to calculate a ratio of discrete values with potentially unequal intervals.

❝ In a nonparametric universe where we try to resolve it, there could be all sorts of debate re. Hodges–Lehmann, Kruskal–Wallis, Wilcoxon test, confidence levels, bootstrapping, pairing and what not.

Didn’t have the stamina to figure out why you get so many warnings in your code. However, you are aware that nonparametrics gives the EMA an anaphylactic shock?

❝ Shoot me.

Later.

❝ How would you prefer to implement the comparability exercise for Tmax? (I am not so much interested in your thoughts on alpha/confidence level, exact T or F, etc. I am mainly interested in a way to make the comparison itself, so please make me happy and focus on that :rotfl: ).

I’m working on it.

❝ […] if the comparability thing also applies to range, how to implement that?

Sorry, I think that’s just bizarre. Honestly, despite your excellent exegesis, I guess (or rather hope?) that only the median is meant. If otherwise, wouldn’t the almighty oracle have written this:

Comparable (≤ 20% difference) median and range for T_max.

❝ […] sample size calculation is going to get messy for these products, if we have to factor in comparability of Tmax at the 20% level. I am not outright saying I have a bad taste in my mouse, but I am leaning towards thinking this could easily translate into a complete showstopper for sponsors developing the products. What's your gut feeling?

From some preliminary simulations I guess that we would need somewhat tighter sampling intervals than usual in order to ‘catch’ t_max in any and every case.

❝ At the end of the day answers to Q1-Q4 above hinge not only on what you think is the right thing to do; of equal importance is what you think regulators will accept. :-)

Of course. If we would go back to the 2001 Note for Guidance (and the current one of the WHO), with a nonparametric test and pre-specified acceptance range everything would be much easier.

P.S.: I updated the article. It’s a work in progress. Perhaps you will come up with more questions.

    library(truncnorm)
    roundClosest <- function(x, y) {
      # Round x to the closest multiple of y
      return(y * round(x / y))
    }
    sum.simple <- function(x) {
      # Nonparametric summary: Remove Mean but keep eventual NAs
      y <- summary(x)
      if (length(y) == 6) {
        y <- y[c(1:3, 5:6)]
      } else {
        y <- y[c(1:3, 5:7)]
      }
      names <- c(names(y), "Range")
      y[length(y) + 1] <- y[length(y)] - y[1]
      y <- setNames(as.vector(y), names)
      return(y)
    }
    set.seed(123456)
    spl <- 10 / 60 # Sampling every 10 (!) minutes
    a   <- 40/60   # Lower limit of truncated normal (40 minutes)
    b   <- 150/60  # Upper limit of truncated normal (2.5 hours)
    mu  <- 100/60  # Common for ibuprofen
    CV  <- 0.50    # Realistic acc. to my studies
    n   <- 16      # Sufficient given the low variability
    x   <- rtruncnorm(n = n, a = a, b = b, mean = mu, sd = mu * CV)
    T   <- R <- roundClosest(x, spl) # both are identical!
    # ‘Contaminations’
    T[T == min(T)] <- max(T)
    R[R == max(R)] <- min(R)
    res <- as.data.frame(matrix(ncol = 6, nrow = 4,
                                dimnames = list(c(" T", " R", "T - R", "T / R"),
                                                names(sum.simple(T)))))
    res[1, ] <- sum.simple(T)
    res[2, ] <- sum.simple(R)
    res[3, ] <- res[1, ] - res[2, ]
    res[4, ] <- res[1, ] / res[2, ]
    print(round(res, 5))

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


ElMaestro
★★★

Denmark,
2022-04-09 23:47
(1184 d 13:59 ago)

@ Helmut
Posting: # 22921
Views: 12,359

Confuse a Cat Inc.


Hi Hötzi,

should we submit a dataset to EMA and suggest them to publish it along with a description (numerical example) of how exactly they wish to derive the decision?

When FDA indicated that they were going in the direction of in vitro popBE for inhalanda and nasal sprays they published a dataset and showed exactly how to process the data to figure out the pass / fail criterion that satisfies the regulator. If EMA would do the same here we'd have all doubt eliminated.
I think we need to know exactly:
1. Do we use nonparametrics or not?
2. Do we use logs or not?
3. Is the decision of 20% comparability based on a confidence interval or on something else?
3a. If there is a CI involved, is it a 90% or 95% CI or something else?
4. Are we primarily working on ratio or on a difference?
5. Is the bootstrap involved?
6. How should we treat datasets from parallel trials, and how should we treat data from XO (i.e. how to handle considerations of paired and non-paired options)?

My gut feeling is that they want nonparametrics for the Tmax comparability part (yes I am aware of the sentence).

Actually, perhaps they just want the decision taken on basis of the estimates of medians and ranges from min to max?

If we submit a dataset, let us make sure we submit one with ties (the one I pasted above had none).

—
Pass or fail!
ElMaestro

Ohlbe
★★★

France,
2022-04-11 13:29
(1183 d 00:17 ago)

@ ElMaestro
Posting: # 22923
Views: 12,507

Confuse a Cat Inc.


Hi ElMaestro,

❝ My gut feeling is that they want nonparametrics for the Tmax comparability part (yes I am aware of the sentence).

My gut feeling is that all they expect to get are descriptive statistics: report median Tmax for Test, median Tmax for Reference, calculate a % difference (however inappropriate this may be), pass if it is not more than 20%, otherwise fail.

Consequence: if you have more than 20% difference between sampling times around the expected Tmax, you're screwed if median Tmax values are different even by just one sampling time, even if this has strictly no clinical relevance (this could be brought up in the comments to the draft guideline: come on guys, are you sure a Tmax of 10' for one formulation and 15' for the other is really something totally unacceptable ? I mean, even for tadalafil you should be able to keep yourself busy until it works).
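The purely descriptive reading is easy to write down. The sketch below assumes the rule is "percentage difference of the medians, relative to the reference, not more than 20%"; the function name `tmax_rule` is made up for this illustration. It is applied to ElMaestro's eight values and to the 10' versus 15' example.

```r
tmax_rule <- function(test, ref, limit = 20) {
  # Percentage difference of medians relative to the reference
  pct <- 100 * (median(test) - median(ref)) / median(ref)
  list(pct.diff = pct, pass = abs(pct) <= limit)
}
Tmax.T <- c(2.0, 2.5, 2.75, 2.5, 2.0, 2.5, 1.5, 2.0)
Tmax.R <- c(4.0, 3.0, 3.5, 3.75, 1.5, 3.5, 5.5, 4.5)
res1 <- tmax_rule(Tmax.T, Tmax.R)   # medians 2.25 h vs 3.625 h
res2 <- tmax_rule(10, 15)           # medians of 10 and 15 minutes
print(unlist(res1))
print(unlist(res2))
```

Both comparisons fail: about −38% for the eight-subject example and about −33% for 10' versus 15', the latter being a difference of a single five-minute sampling step.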

Range: no expectation described. No idea.

Of course I may be totally wrong.

—
Regards
Ohlbe

Helmut
★★★

Vienna, Austria,
2022-04-11 15:59
(1182 d 21:47 ago)

@ Ohlbe
Posting: # 22925
Views: 12,410

Confuse a Cat Inc.


Hi Ohlbe,

❝ My gut feeling is that all they expect to get are descriptive statistics: report median Tmax for Test, median Tmax for Reference, …

So far so good. Standard for ages.

❝ … calculate a % difference (however inappropriate this may be), …

It is indeed.

❝ … pass if it is not more than 20%, otherwise fail.

That’s my understanding as well.

❝ Consequence: if you have more than 20% difference between sampling times around the expected Tmax, you're screwed if median Tmax values are different even by just one sampling time, …

Correct. IMHO, you need

- equally spaced intervals until absorption is essentially complete (in a one compartment model at least two times the expected t_max in all subjects) and
- likely narrower intervals than usual.
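A rough way to see how narrow: if the two medians differ by exactly one sampling step, the relative difference is step / median(R), so the step must not exceed 20% of the reference's median. The sketch below only does this arithmetic; the listed t_max values are arbitrary examples and `max.step` is a made-up helper name.

```r
# Largest sampling step (minutes) for which a one-step difference
# between medians still stays within the limit
max.step <- function(tmax.R, limit = 0.20) 60 * limit * tmax.R
tmax.R <- c(0.25, 0.5, 1, 1.5, 2)   # hypothetical reference medians (hours)
steps  <- data.frame(tmax.R.h = tmax.R, max.step.min = max.step(tmax.R))
print(steps)
```

For a reference median of 1.5 h this gives 18 minutes; for 15 minutes it gives 3 minutes.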

As shown in my example in the other post t_max will drive the sample size. How much larger will it have to be? Not the slightest idea. Likely much larger.

❝ … even if this has strictly no clinical relevance (this could be brought up in the comments to the draft guideline: come on guys, are you sure a Tmax of 10' for one formulation and 15' for the other is really something totally unacceptable ?

Exactly. Recall what the almighty oracle stated in the BE-GL:

[…] if rapid release is claimed to be clinically relevant and of importance for onset of action or is related to adverse events, there should be no apparent difference in median t_max and its variability between test and reference product.

It boils down to: Is it clinically relevant? If not, a comparison is not required. Furthermore: PK ≠ PD.

❝ I mean, even for tadalafil you should be able to keep yourself busy until it works).

Tadalafil shows an effect before t_max. So what?
Not by chance. It’s common that the time point of E_max is < t_max.

❝ Range: no expectation described. No idea.

The range is completely useless. Like the mean it has a breakdown point of zero. Imagine with $\small{n\rightarrow \infty}$: $$\small{\left\{R_1=1,\ldots, R_n=1\phantom{.25}\right\} \rightarrow \textrm{Range}(R)=0\phantom{.25}}\\\small{\left\{T_1=1,\ldots, T_n=1.25\right\} \rightarrow \textrm{Range}(T)=0.25}$$ Good luck in calculating a ratio.

❝ Of course I may be totally wrong.

So am I.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Ohlbe
★★★

France,
2022-04-11 17:48
(1182 d 19:58 ago)

@ Helmut
Posting: # 22926
Views: 12,168

Range


Hi Helmut,

❝ ❝ Range: no expectation described. No idea.

❝

❝ The range is completely useless. Like the mean it has a breakdown point of zero. Imagine with $\small{n\rightarrow \infty}$: $$\small{\left\{R_1=1,\ldots, R_n=1\phantom{.25}\right\} \rightarrow \textrm{Range}(R)=0\phantom{.25}}\\\small{\left\{T_1=1,\ldots, T_n=1.25\right\} \rightarrow \textrm{Range}(T)=0.25}$$ Good luck in calculating a ratio.

I guess it depends what they mean by "range". Are they using the word in its statistical meaning (difference between the largest and smallest values) or in its lay language meaning (limits between which something varies) ? I suspect the latter: lowest and highest observed T_max values.

—
Regards
Ohlbe

Helmut
★★★

Vienna, Austria,
2022-04-11 18:07
(1182 d 19:39 ago)

@ Ohlbe
Posting: # 22927
Views: 12,191

Range


Hi Ohlbe,

❝ I guess it depends what they mean by "range". Are they using the word in its statistical meaning (difference between the largest and smallest values) or in its lay language meaning (limits between which something varies) ? I suspect the latter: lowest and highest observed T_max values.

You mean that we only have to report the minimum and maximum t_max of T and R? But how should we understand this:

Comparable ~~median (≤ 20% difference) and~~ range for T_max.

How to decide what is ‘comparable’? Ask ten people and get twelve answers?
The range is given in the section ‘Bioequivalence assessment’. We report other stuff as well (λ_z/t_½, AUC_0–t/AUC_0–∞, :blahblah:). So why does it sit there?

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Ohlbe
★★★

France,
2022-04-11 18:20
(1182 d 19:26 ago)

@ Helmut
Posting: # 22928
Views: 12,160

Range

Hi Helmut,

❝ But how should we understand this:

❝ Comparable ~~median (≤ 20% difference) and~~ range for T_max.

Oh, the exact same way as before: nobody knows.

—
Regards
Ohlbe

Helmut
★★★

Vienna, Austria,
2022-07-28 16:43
(1074 d 21:03 ago)

@ Ohlbe
Posting: # 23186
Views: 10,990

Range


Hi Ohlbe & all,

sorry to excavate this corpse (I submitted my comments last Sunday; 34 pages…)

❝ Are they using the word in its statistical meaning (difference between the largest and smallest values) or in its lay language meaning (limits between which something varies) ? I suspect the latter: lowest and highest observed T_max values.

Discovered another one. In the first draft (EMA/CHMP/356876/2017 of July 2017) the bizarre “Comparable median and range for T_max” was given.
Stakeholders asked for a clarification and in the comments (EMA/CHMP/730723/2017 of May 2018) we find:

T_max is not an end point to be included in the statistical analysis but a comparison of the values should be made and any differences discussed in the context of the application (see later).

This response could be understood to mean that a comparison of t_max should only be discussed. Regrettably there was nothing to be ‘seen later’.
Funnily enough, the PKWP’s own statement didn’t make it into the guidance of May 2018.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Helmut
★★★

Vienna, Austria,
2022-04-11 15:03
(1182 d 22:44 ago)

@ ElMaestro
Posting: # 22924
Views: 12,603

So many questions, so few answers


Hi ElMaestro,

❝ should we submit a dataset to EMA and suggest them to publish it along with a description (numerical example) of how exactly they wish to derive the decision?

Maybe better not only a data set but also a proposed method for evaluation.

❝ When FDA indicated that they were going in the direction of in vitro popBE for inhalanda and nasal sprays they published a dataset and showed exactly how to process the data to figure out the pass / fail criterion that satisfies the regulator.

I have a counter-example. This goody was recommended in Apr 2014 and revised in Jan 2020. Nobody knew how the FDA arrived at the BE-limits and why. Last month I reviewed a manuscript explaining the background. Made sense (based on lots of data of the originator) but for years it was a complete mystery.

❝ If EMA would do the same here we'd have all doubt eliminated.

Utopia. Notoriously the EMA comes up with unsubstantiated ‘inventions’ made up out of thin air and leaves it to us to figure out if and how they work. Examples?

- Rounded regulatory constant k = 0.760 and upper cap at CV_wR = 50% in reference-scaling,
- sequence(stage) term in TSDs,
- ‘substantial’ accumulation if AUC_0–τ > 90% of AUC_0–∞,
- for partial AUCs default cut-off time τ/2,
- comparison of C_ss,min for originators but of C_ss,τ for generics,
- …

❝ I think we need to know exactly:

❝ 1. Do we use nonparametrics or not?

Guess.

❝ 2. Do we use logs or not?

Logs? Possibly t_max follows a Poisson distribution. ~~I will try to compile data from my studies.~~ See Fig. 2 and Fig. 3 in the article.

❝ 3. Is the decision of 20% comparability based on a confidence interval or on something else?

Likely the former; made up out of thin air.

❝ 3a. If there is a CI involved, is it a 90% or 95% CI or something else?

90%.

❝ 4. Are we primarily working on ratio or on a difference?

IMHO, calculating ratios of discrete values with potentially unequal intervals is plain nonsense. Data are on an ordinal scale. Only (‼) allowed operations: addition, subtraction, ranking. Nothing else.

❝ 5. Is the bootstrap involved?

A possible approach but why?

❝ 6. How should we treat datasets from parallel trials, and how should we treat data from XO (i.e. how to handle considerations of paired and non-paired options)?

I would suggest the Mann–Whitney U test (parallel) and the Wilcoxon signed-rank test (paired / crossover). Requires some tricks in case of tied observations (practically always) for the exact tests, e.g., function wilcox_test() of package coin instead of R’s wilcox.test().
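To show what "exact with ties" means without any add-on package, here is a hand-rolled sketch in base R: the signed-rank statistic is computed with midranks and its permutation distribution is obtained by enumerating all sign patterns. This is not how coin implements it and it is only feasible for small n; the data are ElMaestro's eight pairs from above, read as a crossover.

```r
Tmax.T <- c(2.0, 2.5, 2.75, 2.5, 2.0, 2.5, 1.5, 2.0)
Tmax.R <- c(4.0, 3.0, 3.5, 3.75, 1.5, 3.5, 5.5, 4.5)
d <- Tmax.T - Tmax.R
d <- d[d != 0]                  # drop zero differences (Wilcoxon's method)
r <- rank(abs(d))               # midranks take care of tied |d|
V <- sum(r[d > 0])              # observed sum of positive ranks
# All 2^n assignments of signs to the ranks (n = 8 gives 256 patterns)
signs <- as.matrix(expand.grid(rep(list(c(0, 1)), length(d))))
V.all <- as.vector(signs %*% r)
mid   <- sum(r) / 2             # centre of the permutation distribution
# Two-sided exact p-value: patterns at least as extreme as the observed one
p.exact <- mean(abs(V.all - mid) >= abs(V - mid))
print(c(V = V, p.exact = p.exact))
```

Here 6 of the 256 sign patterns are at least as extreme as the observed one, so the exact two-sided p-value is 6/256 ≈ 0.023. The same enumeration can in principle be inverted for a confidence interval, which is what dedicated packages do far more efficiently.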

❝ My gut feeling is that they want nonparametrics for the Tmax comparability part (yes I am aware of the sentence).

I doubt it. Really.

❝ Actually, perhaps they just want the decision taken on basis of the estimates of medians and ranges from min to max?

I think so (see also Ohlbe’s post). However, that’s statistically questionable (politely speaking). See the updated article and hit F5 to clear the browser’s cached version.

❝ If we submit a dataset, let us make sure we submit one with ties (the one I pasted above had none).

It’s extremely unlikely that you will find one without…

I explored one of my studies. Ibuprofen 600 mg tablets, single dose, fasting, 2×2×2 crossover, 16 subjects (90% target power for C_max), sampling every 15 minutes till 2.5 hours. Re-sampled the reference’s t_max in 10⁵ simulations and applied the ‘≤±20% criterion’:

         median        re-sampled      med. diff (%)     pass.pct
     Min.   :1.000   Min.   :1.000   Min.   :-50.000   Mode :logical
     1st Qu.:1.375   1st Qu.:1.375   1st Qu.:-15.385   FALSE:34887
     Median :1.500   Median :1.500   Median :  0.000   TRUE :65113
     3rd Qu.:1.750   3rd Qu.:1.750   3rd Qu.: 18.182
     Max.   :2.500   Max.   :2.500   Max.   :100.000

[image]
Dashed lines 2.5 and 97.5 percentiles

Power compromised. In ≈35% of simulated studies the reference could not be declared equivalent to itself.
Bonus question: Which distribution? Skewness +0.627.

The generic tested in this study was approved 25 years ago and is still on the market. Any problems?

If you want to give it a try:

R <- c(1.25, 2.00, 1.00, 1.25, 2.50, 1.25, 1.50, 2.25, 1.00, 1.25, 1.50, 1.25, 2.25, 1.75, 2.50, 2.25)
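The script behind the summary is not shown in the post, so the following is only one possible reading of "re-sampled the reference's t_max": draw two bootstrap samples from the reference data, compare their medians, and apply the 20% rule. The number of simulations and the seed are arbitrary, and the pass rate need not match the figures in the post.

```r
set.seed(123456)
R <- c(1.25, 2.00, 1.00, 1.25, 2.50, 1.25, 1.50, 2.25,
       1.00, 1.25, 1.50, 1.25, 2.25, 1.75, 2.50, 2.25)
nsims <- 1e4
# Two independent bootstrap samples of the reference per simulated 'study'
med1 <- replicate(nsims, median(sample(R, replace = TRUE)))
med2 <- replicate(nsims, median(sample(R, replace = TRUE)))
pct.diff  <- 100 * (med2 - med1) / med1
pass.rate <- mean(abs(pct.diff) <= 20)
print(summary(pct.diff))
print(pass.rate)
```

Whatever the exact resampling scheme, the point of the exercise is the same: the reference is compared with itself, so every failure is a false negative.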

P.S.: Amazing that this zombie rises from the grave. See this post and this thread of June 2013…

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Helmut
★★★

Vienna, Austria,
2022-04-30 16:59
(1163 d 20:47 ago)

@ ElMaestro
Posting: # 22945
Views: 11,940

Preliminary simulations


Hi ElMaestro,

❝ 2. Do we use logs or not?

As the ‘Two Lászlós’ – who else? – wrote*

The concentrations rise more steeply before the peak than they decline following the true maximum response. Consequently, it is more likely that large observed concentrations occur after than before the true peak time ($\small{T_\textrm{max}^\circ}$).

Apart from this theoretical consideration they demonstrated in simulations that the distribution of the observed t_max is not only biased but skewed to the right.

I could confirm that in my studies. Given that, your idea of using logs in simulations (but not in the evaluation) is excellent! It turned out that a CV of 50% is not uncommon – even with a tight sampling schedule.

I set up simulations. Say, we have a drug with an expected median t_max of R of 1.5 hours. If we aim to power the study at 90% for a moderate CV of C_max of 20% we need 26 subjects. Let’s sample every 15 minutes and assume a shift in t_max for T of 15 minutes (earlier). Result of 10,000 simulations:

    Sample size based on conventional PK metric
      BE limits    : 80.00% to 125.00%
      theta0       : 95% (assumed T/R ratio)
      CV           : 20% (assumed)
      Power        : 90% (target)
      n            : 26 (estimated)
      Power        : 92% (achieved)
    Sampling       : every 15 minutes
      mu.tmax (R)  : 1.50 h (assumed for R)
      CV of tmax   : 50% (assumed)
      shift        : -15.0 min (assumed for T)
      spread       : ±30.0 min
    Not equivalent if medians differ by more than 20%
      {-18 min, +18 min}
      Empiric power: 76.00%

Oops!

How many subjects would we need to preserve our target power?

      n            : 78 (specified)
      Power        : 100%
    Not equivalent if medians differ by more than 20%
      {-18 min, +18 min}
      Empiric power: 90.71%

Three times as many subjects. Not funny.
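A stripped-down version of such a simulation could look like the sketch below. It is not the script used for the results above: it assumes lognormal t_max with the stated median and CV, independent draws for T and R, and rounding to the 15-minute grid, and it ignores the 'spread' setting. The helper names `snap` and `sim.power` are made up, so the numbers it produces will differ from those in the post.

```r
set.seed(123456)
# Observed t_max can only fall on the sampling grid (never below one step)
snap <- function(x, step) pmax(step, step * round(x / step))
sim.power <- function(n, mu.R = 1.5, shift = -0.25, CV = 0.5,
                      step = 0.25, limit = 0.20, nsims = 2000) {
  sdlog <- sqrt(log(CV^2 + 1))   # lognormal sd from the CV
  pass  <- logical(nsims)
  for (i in seq_len(nsims)) {
    tmax.R  <- snap(rlnorm(n, log(mu.R), sdlog), step)
    tmax.T  <- snap(rlnorm(n, log(mu.R + shift), sdlog), step)
    # 'Equivalent' if the medians differ by not more than 20% of R's median
    pass[i] <- abs(median(tmax.T) - median(tmax.R)) / median(tmax.R) <= limit
  }
  mean(pass)
}
emp.power <- sapply(c(26, 78), sim.power)
print(setNames(emp.power, c("n = 26", "n = 78")))
```

Changing `shift`, `step`, or `CV` in the call is enough to explore how sensitive the pass rate is to each assumption.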

[image]

Another issue is the assumed shift in location, which is nasty. I tried –20 minutes (instead of –15).

[image]

I stopped the iterations with 500 (‼) subjects (power ≈82%). I’m not sure whether 90% is attainable even with a more crazy sample size. Note that the x-axis is in log-scale.

On the other hand, the spread is less important, since we are assessing medians. In other words, extreme values are essentially ignored. Then with the original shift of –15 minutes but a spread of ±60 minutes (instead of 30), we have already ≈80% power with 36 subjects and ≈90% with 66.

[image]

Is this really the intention? The wider the range in t_max, the more easily products will pass. Counterintuitive.

Furthermore, as a consequence of the skewed distribution power curves are not symmetrical around zero – as the PKWP possibly (or naïvely?) assumed. It reminds me of the old days when the BE-limits were 80–120% and maximum power was achieved at a T/R-ratio of ≈0.98 (see there).
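The ≈0.98 follows from simple arithmetic: for an analysis on the log scale, power is symmetric around the midpoint of the log-limits, i.e., around their geometric mean. A quick check (the helper name `centre` is made up here):

```r
# Geometric mean of the acceptance limits = ratio with maximum power
centre <- function(limits) exp(mean(log(limits)))
old <- centre(c(0.80, 1.20))   # limits of the old days
new <- centre(c(0.80, 1.25))   # current limits
print(round(c(old = old, new = new), 4))
```

With 80–120% the maximum sits at about 0.98, with 80–125% at exactly 1.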

[image]

Consequently, a ‘faster’ test will more likely pass than a ‘slower’ one. That’s not my understanding of equivalence. Guess the Type I Error.

* Tóthfálusi L, Endrényi L. Estimation of C_max and T_max in Populations After Single and Multiple Drug Administration. J Pharmacokin Pharmacodyn. 2003; 30(5): 363–85. doi:10.1023/b:jopa.0000008159.97748.09.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


ElMaestro
★★★

Denmark,
2022-04-30 21:10
(1163 d 16:36 ago)

@ Helmut
Posting: # 22946
Views: 11,699

Preliminary simulations


Hi Hötzi,

your post is potentially highly significant.

❝ Is this really the intention? The wider the range in t_max, the more easily products will pass. Counterintuitive.

The intention, as I understand it, was exactly the opposite. It goes completely against all intention, doesn't it?
I think this knowledge, if it holds in confirmatory simulations, should be published quickly and made available to regulators.

I very much hope that regulators will abstain completely from letting pride prevail over the regard for the EU patient. So I hope they will not dismiss the argumentation. For example, I could fear they would dismiss your findings because they don't have a palate for simulations (but note they like simulations well enough when it comes to f2; bootstrapping is a simulation, too).

—
Pass or fail!
ElMaestro

Helmut
★★★

Vienna, Austria,
2022-05-01 17:56
(1162 d 19:50 ago)

@ ElMaestro
Posting: # 22947
Views: 11,631

Preliminary simulations


Hi ElMaestro,

❝ your post is potentially highly significant.

Potentially!

❝ ❝ Is this really the intention? The wider the range in t_max, the more easily products will pass.

❝

❝ The intention, as I understand it, was exactly the opposite. It goes completely against all intention, doesn't it?

Right, I don’t get it. I’m afraid this one joins the ‘methods’ the PKWP made up out of thin air.

EMEA. The European Medicines Evaluation Agency.
The drug regulatory agency of the European Union.
A statistician-free zone. Stephen Senn. Statistics in Drug Development. Wiley; 2004. p. 386.

‘Common sense’ is not always a good idea. The obvious is most often wrong.

It’s complicated. Like yesterday but this time no shift.

[image]

That’s what I expect.

❝ I think this knowledge, if it holds in confirmatory simulations, should be published quickly and made available to regulators.

That’s wishful thinking taking the review process into account. Deadline for comments July 31st.

❝ […] I could fear they would dismiss your findings because they don't have a palate for simulations (but note they like simulations well enough when it comes to f2; bootstrapping is a simulation, too).

Yep. ƒ₂ is not a statistic (depends on the number of samples and intervals). Given that, no closed form to estimate the location, its CI, power (and hence, the Type I Error) exists because the distribution is unknown. That’s similar to the situation we are facing here.

PS: You’ve got mail in your mailbox.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Helmut
★★★

Vienna, Austria,
2023-06-23 15:29
(744 d 22:17 ago)

@ ElMaestro
Posting: # 23612
Views: 9,424

Revisions of the PSGLs final


Hi ElMaestro & all,

❝ ❝ Is this really the intention? The wider the range in t_max, the more easily products will pass. Counterintuitive.

❝ The intention, as I understand it, was exactly the opposite. It goes completely against all intention, doesn't it?

❝ I think this knowledge, if it holds in confirmatory simulations, should be published quickly and made available to regulators.

❝ I very much hope that regulators will abstain completely from letting pride prevail over the regard for the EU patient. So I hope they will not dismiss the argumentation. For example, I could fear they would dismiss your findings because they don't have a palate for simulations (but note they like simulations well enough when it comes to f2; bootstrapping is a simulation, too).

Final revisions of the PSGLs were published yesterday.

Practically all comments/suggestions were ~~ignored~~ not accepted. The only relevant change is 80.00–125.00% of the reference’s median (from 80.00–120.00%). Good luck in sampling every five minutes.
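In absolute terms the final criterion translates into a window around the reference's median. The sketch below assumes a reference median of 1.25 h, the ibuprofen value used in the simulations later in this thread; the function name `tmax_window` is made up.

```r
# Acceptance window for the test's median t_max (hours)
tmax_window <- function(median.R, limits = c(0.80, 1.25)) median.R * limits
w <- tmax_window(1.25)
print(w)                        # window in hours
print(60 * (w - 1.25))          # distance from the reference in minutes
```

For this reference the window is 1.00 to 1.5625 h, i.e., 15 minutes below and 18.75 minutes above the median, so with 15-minute sampling a single step on the fast side already lands exactly on the limit.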

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


dshah
★★

India,
2023-06-28 16:43
(739 d 21:03 ago)

@ Helmut
Posting: # 23632
Views: 9,249

Revisions of the PSGLs final


Thank you Helmut for the update.

❝ Practically all comments/suggestions were ~~ignored~~ not accepted. The only relevant change is 80.00–125.00% of the reference’s median (from 80.00–120.00%). Good luck in sampling every five minutes.

But will increase in sampling lead to higher blood loss and then proposal to reduce sampling by IRB/IEC? Or will there be any hope of great BA method so that sampling at every 5/10 min is feasible with minimal blood loss?

Regards,
Divyen

Helmut
★★★

Vienna, Austria,
2023-06-28 17:59
(739 d 19:47 ago)

@ dshah
Posting: # 23633
Views: 9,292

EMA: No problems with many sampling time points…


Hi Divyen,

❝ ❝ […] Good luck in sampling every five minutes.

❝

❝ But will increase in sampling lead to higher blood loss and then proposal to reduce sampling by IRB/IEC? Or will there be any hope of great BA method so that sampling at every 5/10 min is feasible with minimal blood loss?

Just to quote the overview of comments to the ibuprofen guidance:

Comment: This will lead to credible feasibility and ethical concerns.
Outcome: Every 5 minutes around T_max is enough. Every 15 minutes is the standard frequency, therefore, there is no critical change. No ethical concern is anticipated if the volume of blood is not excessive. The present bioanalytical methods allow to sample less than 300 ml of blood in total in studies with up to 25 samples. Therefore a few more samples for a proper characterisation of T_max and C_max are not considered an ethical problem.

Well roared, lion!

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


Helmut
★★★

Vienna, Austria,
2023-06-29 13:34
(739 d 00:12 ago)

@ dshah
Posting: # 23634
Views: 9,376

New simulations & some desultory thoughts


Hi Divyen & all,

I simulated IR ibuprofen.

One-compartment model, D = 400 mg, V = 7 L (lognormal distribution, CV 40%), ƒ = 0.9 (uniform distribution 0.8 – 1.0), t_½ = 2 h. Associated k₁₀-values (lognormal distribution, CV 25%). Seven formulations with t_max 1.25 h (Reference), at the lower (1.000 h) and upper (1.562 h) ‘limits’, fast (1.125 h, 1.188 h), and slow (1.316 h, 1.389 h). Associated k₀₁-values (lognormal distribution, CV 35%), analytical error (normal distribution, CV 7.5%), LLOQ set to 5% of the reference’s error-free model C_max. Concentrations <LLOQ before t_max set to zero, and after to NA. Lots of samples…
16 subjects in order to achieve ≥80% power for C_max (CV 18%, T/R 0.95).
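The absorption rate constants follow from the targeted t_max-values. A minimal sketch (not the 302-LOC script), assuming a one-compartment model with first-order absorption and no flip-flop (k₀₁ > k₁₀), where t_max = log(k₀₁/k₁₀) / (k₀₁ – k₁₀) is inverted numerically. The helper names and the search interval are illustrative choices, not taken from the script.

```r
# t_max of a one-compartment model with first-order absorption
tmax.fun <- function(ka, kel) log(ka / kel) / (ka - kel)

# Numerically invert tmax.fun() for ka (hypothetical helper).
# Searches only ka > kel, i.e., assumes no flip-flop kinetics.
ka.from.tmax <- function(tmax, kel, upper = 100) {
  uniroot(function(ka) tmax.fun(ka, kel) - tmax,
          interval = c(kel * 1.0001, upper), tol = 1e-10)$root
}

kel  <- log(2) / 2                     # elimination half-life of 2 h
# Targeted t_max-values (h) as given in the post
tmax <- c(L = 1.000, T1 = 1.125, T2 = 1.188, R = 1.250,
          T3 = 1.316, T4 = 1.389, U = 1.562)
ka   <- sapply(tmax, ka.from.tmax, kel = kel)
res  <- data.frame(tmax     = tmax,
                   ka       = ka,                # 1 / h
                   t.half.a = 60 * log(2) / ka)  # absorption half-life (min)
print(round(res, 3))
```

For the reference this should reproduce k₀₁ ≈ 1.539 / h and an absorption half-life of about 27 minutes.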

Lengthy R-script (302 LOC) upon request. I got:

Simulation settings:
  2,500 studies with 16 subjects
  Sampling every five minutes up to 2 × tmax of R (2.50 h),
  exponentially increasing intervals to tlast (16 h) = 38 samples.
    0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75,
    80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140,
    145, 150 min, 3.5, 4, 5.5, 7, 9.5, 12.5, 16 h
  Seven formulations
    L  = lower limit: tmax = 1.000 h, ka = 2.190 / h, t½,a = 19.0 min
    T1 = fastest    : tmax = 1.125 h, ka = 1.822 / h, t½,a = 22.8 min
    T2 = fast       : tmax = 1.188 h, ka = 1.672 / h, t½,a = 24.9 min
    R  = Reference  : tmax = 1.250 h, ka = 1.539 / h, t½,a = 27.0 min
    T3 = slow       : tmax = 1.316 h, ka = 1.417 / h, t½,a = 29.4 min
    T4 = slowest    : tmax = 1.389 h, ka = 1.296 / h, t½,a = 32.1 min
    U  = upper limit: tmax = 1.562 h, ka = 1.065 / h, t½,a = 39.0 min
Simulation results:
  L  = lower limit
    Median  : 1.0833 h (Range: 0.7500 - 1.4583 h)
    Skewness: +0.4452 (Bias: +0.0833)
  T1 = fastest
    Median  : 1.2083 h (Range: 0.8750 - 1.6250 h)
    Skewness: +0.4093 (Bias: +0.0833)
  T2 = fast
    Median  : 1.2917 h (Range: 0.9167 - 1.7917 h)
    Skewness: +0.3858 (Bias: +0.1042)
  R  = Reference
    Median  : 1.3333 h (Range: 1.0000 - 1.7917 h)
    Skewness: +0.3505 (Bias: +0.0833)
  T3 = slow
    Median  : 1.4167 h (Range: 1.0000 - 1.8750 h)
    Skewness: +0.2960 (Bias: +0.1009)
  T4 = slowest
    Median  : 1.5000 h (Range: 1.1250 - 2.0000 h)
    Skewness: +0.2594 (Bias: +0.1111)
  U  = upper limit
    Median  : 1.6667 h (Range: 1.2500 - 2.2500 h)
    Skewness: +0.0878 (Bias: +0.1042)
Comparisons:
  passed ‘±20% median criterion’ (80.00-125.00%)
    L  = lower limit: 52.1%
    T1 = fastest    : 80.9%
    T2 = fast       : 90.4%
    T3 = slow       : 91.2%
    T4 = slowest    : 84.8%
    U  = upper limit: 57.0%

The positive skewness of t_max-values confirmed the theoretical considerations of the two Lászlós.¹ Interesting that the skewness decreased with increasing t_max. All medians were positively biased when compared to the models’ true values.
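Why an observed t_max tends to overshoot the model's true value can be seen even in a stripped-down setting. The following is only a sketch under strong simplifications, not the simulation reported above: a single error-free profile of the reference (no between-subject variability, no LLOQ), multiplicative normal analytical error with CV 7.5%, and sampling every five minutes up to 2.5 h. The seed, the number of replicates, and the moment-based skewness estimator are assumptions.

```r
set.seed(123456)                        # arbitrary seed (assumption)
D   <- 400; f <- 0.9; V <- 7            # dose (mg), fraction absorbed, volume (L)
kel <- log(2) / 2                       # elimination rate constant (1 / h)
ka  <- 1.539                            # absorption rate constant of R (1 / h)

# Error-free concentration of the one-compartment model
C.fun <- function(t) {
  f * D * ka / (V * (ka - kel)) * (exp(-kel * t) - exp(-ka * t))
}
t.true <- log(ka / kel) / (ka - kel)    # model's true t_max (about 1.25 h)
t.smpl <- seq(0, 150, 5) / 60           # five-minute grid up to 2.5 h
nsims  <- 1000                          # number of replicates (assumption)

obs.tmax <- replicate(nsims, {
  # Multiplicative analytical error, CV 7.5% (assumed error model)
  C.obs <- C.fun(t.smpl) * (1 + rnorm(length(t.smpl), mean = 0, sd = 0.075))
  t.smpl[which.max(C.obs)]              # observed t_max is always a grid point
})

# Moment-based sample skewness (one of several possible estimators)
skew <- function(x) mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5

cat("True t_max      :", round(t.true, 4), "h\n")
cat("Median observed :", round(median(obs.tmax), 4), "h\n")
cat("Mean observed   :", round(mean(obs.tmax), 4), "h\n")
cat("Skewness        :", round(skew(obs.tmax), 4), "\n")
```

The profile is flatter after the peak than before it, so noisy maxima are picked up more often on the late side of the true t_max.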

What changed compared to the simulations I presented in the comments to the guidance?

The chance to pass is higher due to the upper limit of 125% (instead of 120%).
As mentioned by other stakeholders, the approach favors slower formulations – even more than with the draft’s 120% limit. Surprisingly we read on page 16:
- It is not agreed that slower formulations will be developed to use a wider acceptance range since quicker formulations are desired from a clinical and marketing point of view.
Whatever the development’s target might be, slower formulations will more likely pass. Full stop. How does marketing enter the game in bioequivalence? Is it OK nowadays to claim in the SmPC: “The product releases quicker than the originator”?
If a study is powered for C_max and the median t_max of a formulation does not deviate more than ~8 (‼) minutes from the reference, it will likely pass.
However, is a Δ of ±8 minutes really clinically relevant for a pain-killer? I doubt it.
If a formulation’s t_max is expected to lie close to the limits, one has to increase the sample size substantially. On page 17 we read:
- The sample size required for the demonstration of bioequivalence for C_max and AUC should be sufficient to obtain a reliable and representative median T_max.
Really? Probably it’s cheaper to toss a coin…
On page 13 about the sampling schedule:
- […] if T_max is expected 30 minutes after administration, sampling should be at e.g. 10, 20, 25, 30, 35, 40, etc., which would allow to conclude that the difference is less than 5 minutes if the T_max is observed in the same or an adjacent sampling time.
I doubt it. My simulations at the end. Power compromised.
A picky comment on page 25:
- ‘T’ is the SI symbol for the absolute temperature. Use the correct SI symbol ‘t’ for time, at least for consistency with the overarching guideline.
Was accepted but ignored at the end of the day.
On page 26 about the comparison of ranges:
- The assessment of the range is more subjective. If all the values except one are the same, the ranges would be considered acceptable. Therefore, only if differences are evident and worse for the test product, the range could be used for a regulatory decision.
I see. What might be an ‘evident difference’ for an assessor? Is that better than the ‘apparent difference’ we have in the IR and MR GLs?
On page 28:
- The comparison of the medians does not intend to preserve the type 1 error [sic] but to exclude formulations with different onset of action.
Aha! Patient’s risk not important.
On page 29 about the proposed CI inclusion approach based on nonparametric methods:
- […] the power of a statistical test (usually be performed using a confidence interval), and consequently the sample size needed, will depend on the requested equivalence range and significance level (the allowed type-1 error rate). Equivalence range could be wider than the range that is applied for point estimate. Also, the allowed type-1 error rate (or equivalently, the coverage probability of the confidence interval) may be less strict than for AUC and C_max. This would allow for assessing the consumers risk for T_max but on a different level than for AUC and C_max. Still an agreement on both, equivalence range and significance level to be used, may be difficult to achieve.
IMHO, that’s funky. Instead of taking the effort of defining an equivalence range based on a clinically relevant Δ, the EMA has chosen the easiest way. “The allowed type-1 error rate […] may be less strict than for AUC and C_max.” I beg your pardon? If t_max is considered less important than AUC and C_max, specify wider limits! Don’t even think about fiddling with α.
- It is considered that while the Hodges-Lehmann estimator is an adequate estimator to compare T_max of Test (generic) and Reference (innovator) products […] the present revision of the product specific guideline concerns […] not introducing a new method particularly one for which EMA experience in regulatory submissions is limited.
C’mon, “limited experience”! What about my 600+ studies? A nonparametric method was even recommended for 19 years²,³ (till the 2010 GL). How many studies dwell in the gloomy dungeons of the agency? Exhume them and perform retrospective assessments.
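The asymmetry of the 80.00–125.00% window is easily seen in absolute time. A small sketch, assuming the criterion is simply the ratio of medians falling within the limits; the function names are hypothetical.

```r
# Acceptance window of the median criterion in absolute time
window <- function(tmax.R, lower = 0.80, upper = 1.25) {
  c(lower = lower * tmax.R, upper = upper * tmax.R)
}

# Pass / fail of the criterion for t_max-values of T and R (hypothetical helper)
pass.median <- function(tmax.T, tmax.R, lower = 0.80, upper = 1.25) {
  ratio <- median(tmax.T, na.rm = TRUE) / median(tmax.R, na.rm = TRUE)
  ratio >= lower & ratio <= upper
}

tmax.R  <- 1.25                                  # reference of the simulations (h)
w.final <- window(tmax.R)                        # revised guidance: 80 - 125%
w.draft <- window(tmax.R, upper = 1.20)          # draft: 80 - 120%
shift   <- rbind(final = 60 * (w.final - tmax.R),
                 draft = 60 * (w.draft - tmax.R))
print(round(shift, 2))                           # allowed shift of the median (min)
```

With a reference t_max of 1.25 h a faster formulation may deviate by at most 15 minutes, a slower one by 18.75 minutes (15 minutes with the draft's 120%).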

Simulation settings:
  2,500 studies with 16 subjects
  Sampling every five minutes up to 2 × tmax of R (1.00 h),
  exponentially increasing intervals to tlast (16 h) = 18 samples.
    0, 10, 20, 25, 30, 35, 40, 45, 50, 55, 60 min,
    1.5, 2, 3.5, 5, 7, 11, 16 h
  Seven formulations
    L  = R –5 min   : tmax = 25 min, ka = 7.828 / h, t½,a = 5.31 min
    T1 = pretty fast: tmax = 27 min, ka = 7.037 / h, t½,a = 5.91 min
    T2 = fast       : tmax = 28 min, ka = 6.527 / h, t½,a = 6.37 min
    R  = Reference  : tmax = 30 min, ka = 6.074 / h, t½,a = 6.85 min
    T3 = slow       : tmax = 32 min, ka = 5.650 / h, t½,a = 7.36 min
    T4 = pretty slow: tmax = 33 min, ka = 5.233 / h, t½,a = 7.95 min
    U  = R +5 min   : tmax = 35 min, ka = 4.881 / h, t½,a = 8.52 min
Simulation results:
  L  = R –5 min
    Median  : 0.5000 h (Range: 0.3333 - 0.6667 h)
    Skewness: +0.4994 (Bias: +0.0833)
  T1 = pretty fast
    Median  : 0.5417 h (Range: 0.3750 - 0.7500 h)
    Skewness: +0.4606 (Bias: +0.0917)
  T2 = fast
    Median  : 0.5417 h (Range: 0.3750 - 0.7500 h)
    Skewness: +0.4086 (Bias: +0.0667)
  R  = Reference
    Median  : 0.5833 h (Range: 0.4167 - 0.8333 h)
    Skewness: +0.3538 (Bias: +0.0833)
  T3 = slow
    Median  : 0.5833 h (Range: 0.4167 - 0.8333 h)
    Skewness: +0.2857 (Bias: +0.0570)
  T4 = pretty slow
    Median  : 0.6250 h (Range: 0.4167 - 0.8333 h)
    Skewness: +0.2113 (Bias: +0.0694)
  U  = R +5 min
    Median  : 0.6667 h (Range: 0.4583 - 0.8750 h)
    Skewness: +0.1550 (Bias: +0.0833)
Comparisons:
  passed ‘±20% median criterion’ (80.00-125.00%)
    L  = R –5 min   : 73.7%
    T1 = pretty fast: 83.2%
    T2 = fast       : 87.1%
    T3 = slow       : 88.3%
    T4 = pretty slow: 82.6%
    U  = R +5 min   : 76.8%

1. Tóthfálusi L, Endrényi L. Estimation of C_max and T_max in Populations After Single and Multiple Drug Administration. J Pharmacokin Pharmacodyn. 2003; 30(5): 363–85. doi:10.1023/b:jopa.0000008159.97748.09.
2. Commission of the EC. Note for Guidance. Investigation of Bioavailability and Bioequivalence. Appendix III: Technical Aspects of Bioequivalence Statistics. Brussels. December 1991.
3. EMEA, CPMP. Note for Guidance on the Investigation of Bioavailability and Bioequivalence. London. 26 July 2001.


Helmut
★★★

Vienna, Austria,
2023-06-30 13:50
(737 d 23:56 ago)

@ Helmut
Posting: # 23638
Views: 9,205

SCNR. A heretic alternative.

Dear all,

couldn’t resist trying an alternative based on the Hodges-Lehmann estimates instead of medians. Same setup as yesterday.

Simulation settings:
  2,500 studies with 16 subjects
  Sampling every 5 minutes up to 2 × tmax of R (2.50 h),
  exponentially increasing intervals to tlast (16 h) = 38 samples:
    0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75,
    80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140,
    145, 150 min, 3.5, 4, 5.5, 7, 9.5, 12.5, 16 h.
  Seven formulations
    L  = lower limit: tmax = 1.000 h, ka = 2.190 / h, t½,a = 19.0 min
    T1 = fastest    : tmax = 1.125 h, ka = 1.822 / h, t½,a = 22.8 min
    T2 = fast       : tmax = 1.188 h, ka = 1.672 / h, t½,a = 24.9 min
    R  = Reference  : tmax = 1.250 h, ka = 1.539 / h, t½,a = 27.0 min
    T3 = slow       : tmax = 1.316 h, ka = 1.417 / h, t½,a = 29.4 min
    T4 = slowest    : tmax = 1.389 h, ka = 1.296 / h, t½,a = 32.1 min
    U  = upper limit: tmax = 1.562 h, ka = 1.065 / h, t½,a = 39.0 min
Simulation results:
  L  = lower limit
    Median  : 1.0833 h (Bias: +0.0833 h, Range: 0.7500 - 1.4583 h)
    HL      : 1.1146 h (Bias: +0.1146 h, Range: 0.8333 - 1.4583 h)
    Skewness: +0.4452
  T1 = fastest
    Median  : 1.2083 h (Bias: +0.0833 h, Range: 0.8750 - 1.6250 h)
    HL      : 1.2500 h (Bias: +0.1250 h, Range: 0.9167 - 1.5833 h)
    Skewness: +0.4093
  T2 = fast
    Median  : 1.2917 h (Bias: +0.1042 h, Range: 0.9167 - 1.7917 h)
    HL      : 1.3125 h (Bias: +0.1250 h, Range: 0.9583 - 1.6667 h)
    Skewness: +0.3858
  R  = Reference
    Median  : 1.3333 h (Bias: +0.0833 h, Range: 1.0000 - 1.7917 h)
    HL      : 1.3750 h (Bias: +0.1250 h, Range: 1.0417 - 1.7500 h)
    Skewness: +0.3505
  T3 = slow
    Median  : 1.4167 h (Bias: +0.1009 h, Range: 1.0000 - 1.8750 h)
    HL      : 1.4375 h (Bias: +0.1217 h, Range: 1.1250 - 1.7500 h)
    Skewness: +0.2960
  T4 = slowest
    Median  : 1.5000 h (Bias: +0.1111 h, Range: 1.1250 - 2.0000 h)
    HL      : 1.5208 h (Bias: +0.1319 h, Range: 1.1250 - 1.9583 h)
    Skewness: +0.2594
  U  = upper limit
    Median  : 1.6667 h (Bias: +0.1042 h, Range: 1.2500 - 2.2500 h)
    HL      : 1.6771 h (Bias: +0.1146 h, Range: 1.3333 - 2.1250 h)
    Skewness: +0.0878
  HL: Hodges-Lehmann estimate
Comparisons:
  According to the guidance (based on medians)
  passed ‘±20% criterion’ (80.00-125.00%)
    L  = lower limit: 52.1%
    T1 = fastest    : 80.9%
    T2 = fast       : 90.4%
    T3 = slow       : 91.2%
    T4 = slowest    : 84.8%
    U  = upper limit: 57.0%
  Heretic alternative (based on Hodges-Lehmann estimates)
  passed ‘±20% criterion’ (80.00-125.00%)
    L  = lower limit: 53.2%
    T1 = fastest    : 87.2%
    T2 = fast       : 94.9%
    T3 = slow       : 95.2%
    T4 = slowest    : 89.5%
    U  = upper limit: 61.1%

Seemingly more powerful.
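The two variants differ only in the location estimate plugged into the ratio. A minimal sketch, assuming the one-sample Hodges-Lehmann estimate (median of the Walsh averages) is computed separately for T and R; the t_max-values are made up for illustration and the function names are hypothetical.

```r
# One-sample Hodges-Lehmann estimate: median of the Walsh averages
HL <- function(x) {
  x <- x[!is.na(x)]
  y <- outer(x, x, "+") / 2            # all pairwise averages
  median(y[row(y) >= col(y)])          # lower triangle incl. diagonal
}

# Apply the ±20% criterion to any pair of location estimates
pass <- function(est.T, est.R, lower = 0.80, upper = 1.25) {
  ratio <- est.T / est.R
  ratio >= lower & ratio <= upper
}

# Made-up t_max-values (h) on a five-minute grid, illustration only
tmax.R <- c(60, 70, 75, 75, 80, 85, 90, 105) / 60
tmax.T <- c(55, 60, 60, 65, 70, 75, 80, 110) / 60

res <- data.frame(estimate = c("median", "HL"),
                  T        = c(median(tmax.T), HL(tmax.T)),
                  R        = c(median(tmax.R), HL(tmax.R)))
res$ratio  <- res$T / res$R
res$passed <- pass(res$T, res$R)
print(res)
```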


Helmut
★★★

Vienna, Austria,
2022-05-02 15:43
(1161 d 22:03 ago)

@ ElMaestro
Posting: # 22950
Views: 11,619

Simulated distributions

Hi ElMaestro,

❝ 2. Do we use logs or not?

To get an idea about the simulated distributions, I used a large sample size and sampling every ten minutes. T is shifted by –15 minutes with a spread of ±30 minutes.

[image]

Why do we see for T a high density in the first interval? Try this one:

roundClosest <- function(x, y) {
  # Round x to the closest multiple of the sampling interval y
  res           <- y * round(x / y)
  res[res <= 0] <- y # Force negatives | 0 to the 1st interval
  return(res)
}
HL <- function(x, na.action = na.omit) {
  # Hodges-Lehmann estimate: median of the Walsh averages
  x   <- na.action(x)
  y   <- outer(x, x, "+")
  res <- median(y[row(y) >= col(y)]) / 2
  return(res)
}
n       <- 500
mu.tmax <- 1.5 # hours
CV.tmax <- 0.5
shift   <- -15 # minutes
spread  <- 30  # --"--
smpl    <- 10  # --"--
# Sample tmax of R from lognormal distribution
# parameterized for the median
R       <- rlnorm(n = n, meanlog = log(mu.tmax),
                  sdlog = sqrt(log(CV.tmax^2 + 1)))
# Introduce shift in location of T
shifts  <- runif(n = n, min = shift / 60 - spread / 60, # can be negative!
                        max = shift / 60 + spread / 60)
T       <- R + shifts
# Discretization by the sampling interval 'smpl'
R       <- roundClosest(R, smpl / 60)
T       <- roundClosest(T, smpl / 60)
res     <- data.frame(min = rep(NA_real_, 3), median = NA_real_,
                      HL = NA_real_, max = NA_real_)
# Third row is R minus T (positive if T peaks earlier)
row.names(res)     <- c("R", "T", "R-T")
res[1, c(1:2, 4)]  <- quantile(R, probs = c(0, 0.5, 1))
res[1, 3]          <- HL(R)
res[2, c(1:2, 4)]  <- quantile(T, probs = c(0, 0.5, 1))
res[2, 3]          <- HL(T)
res[3, 2:3]        <- res[1, 2:3] - res[2, 2:3]
print(round(res, 5))

        min  median      HL     max
R   0.33333 1.50000 1.58333 7.00000
T   0.16667 1.33333 1.33333 6.66667
R-T      NA 0.16667 0.25000      NA

Since some shifts might be negative, I forced them to the first sampling interval – adding to the ones which are already there. Does that make sense?
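To see how much the forcing contributes to the first interval, one can compare the plain rounding with the forced one. A sketch with a few made-up values first, then with the settings of the script above; the seed is arbitrary and the variable names are hypothetical.

```r
roundClosest <- function(x, y) {
  res           <- y * round(x / y)
  res[res <= 0] <- y                   # force negatives and zero to the 1st interval
  res
}

smpl <- 10 / 60                        # sampling interval (h)
x    <- c(-0.20, 0.05, 0.10, 0.30)     # made-up continuous t_max-values (h)
demo <- data.frame(x      = x,
                   raw    = smpl * round(x / smpl), # plain rounding
                   forced = roundClosest(x, smpl))  # as in the script
print(round(demo, 4))

# Share of T-values which end up in the first interval only by forcing
set.seed(42)                           # arbitrary seed (assumption)
n      <- 500
R.cont <- rlnorm(n, meanlog = log(1.5), sdlog = sqrt(log(0.5^2 + 1)))
T.cont <- R.cont + runif(n, min = -45 / 60, max = 15 / 60)
raw    <- smpl * round(T.cont / smpl)
share  <- mean(raw <= 0)
cat("Forced into the first interval:", round(100 * share, 1), "%\n")
```

In the small example only the third value belongs to the first interval by rounding alone; the first two are moved there by the forcing step.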
