Helmut ★★★ Vienna, Austria, 2022-04-08 15:17 (48 d 03:45 ago) Posting: # 22918 Views: 1,247 

Dear all,

on April 4 the EMA published revised draft product-specific guidances for ibuprofen, paracetamol, and tadalafil. In the footnote on page 1 we find:

* This revision concerns defining what is meant by ‘comparable’ T_{max} as an additional main pharmacokinetic variable in the bioequivalence assessment section of the guideline.

Then: Bioequivalence assessment: Comparable median (≤ 20% difference) and range for T_{max}.

See there why I consider this invention crap. In short: t_{max} follows a discrete distribution on an interval scale. Calculating the ratio of values is a questionable procedure. End of consultation: 31 July 2022.

— Diftor heh smusma 🖖 Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes 
ElMaestro ★★★ Denmark, 2022-04-09 11:45 (47 d 07:17 ago) (edited by ElMaestro on 2022-04-09 12:06) @ Helmut Posting: # 22919 Views: 1,074 

Hi Helmut and all,

I was a little afraid of this. Thanks a lot for posting.

» Comparable median (≤ 20% difference) and range for T_{max}

I am confused. I can see the point in regulating the matter, but I feel there is a lot that's left to be answered. But let me ask all of you.

Question 1: Do you read this as: you need to be comparable (20% difference) for the median AND for the range? (i.e. is there also a 20% difference requirement for the range???)

» Calculating the ratio of values is a questionable procedure.

Question 2: Whether we like it or not, we have to find a way forward. And this has a lot of degrees of freedom. In a nonparametric universe where we try to resolve it, there could be all sorts of debate re. Hodges-Lehmann, Kruskal-Wallis, the Wilcoxon test, confidence levels, bootstrapping, pairing and what not. So, kindly allow me to throw this on the table. Imagine we have these Tmax levels for T and R in a BE trial, units could be hours: Tmax.T = c(2.0, 2.5, 2.75, 2.5, 2.0, 2.5, 1.5, 2.0). Let us implement something that has the look and feel of a test along the lines of what regulators want: Test1 = wilcox.test(Tmax.T, alt = "two.sided", conf.int = T, correct = T, conf.level = .90). I am getting a CI of 1.999929–2.500036 (2 to 2.5). We can compare this with median(Tmax.R) * c(0.8, 1.2). The test fails. Would that be a way to go? No? Then how about: Test2 = wilcox.test(Tmax.T/Tmax.R, alt = "two.sided", conf.int = T, correct = T, conf.level = .90). I am getting 0.47–0.92. Not within 0.8 to 1.25, the test fails. Shoot me.

How would you prefer to implement the comparability exercise for Tmax? (I am not so much interested in your thoughts on alpha/confidence level, exact T or F, etc. I am mainly interested in a way to make the comparison itself, so please make me happy and focus on that.) Mind you, the data above might be paired … or it might not, depends on whether it was from an XO or not. This adds complexity, all depending on the implementation. 
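To make the two candidates concrete, here is a self-contained R sketch. Note that the Tmax.R vector is not given above, so the reference data below are an assumed, purely illustrative set; the pass/fail outcomes therefore need not reproduce the CIs quoted in the post.

```r
# Illustrative sketch only. Tmax.T is from the post; Tmax.R is ASSUMED
# (the original reference vector is not reproduced here).
Tmax.T <- c(2.0, 2.5, 2.75, 2.5, 2.0, 2.5, 1.5, 2.0)
Tmax.R <- c(2.5, 2.5, 3.00, 2.75, 2.5, 3.0, 2.0, 2.5)

# Candidate 1: 90% CI for the location of Tmax.T (one-sample Wilcoxon),
# compared with median(Tmax.R) +/- 20%. Ties trigger a warning and the
# CI falls back on the normal approximation.
Test1  <- wilcox.test(Tmax.T, alternative = "two.sided",
                      conf.int = TRUE, correct = TRUE, conf.level = 0.90)
limits <- median(Tmax.R) * c(0.8, 1.2)
pass1  <- Test1$conf.int[1] >= limits[1] && Test1$conf.int[2] <= limits[2]

# Candidate 2: 90% CI for the location of the paired ratios T/R,
# compared with the conventional 0.80-1.25 acceptance range.
Test2 <- wilcox.test(Tmax.T / Tmax.R, alternative = "two.sided",
                     conf.int = TRUE, correct = TRUE, conf.level = 0.90)
pass2 <- Test2$conf.int[1] >= 0.80 && Test2$conf.int[2] <= 1.25
```

Candidate 2 only makes sense if the data are paired (crossover); for a parallel design the ratios Tmax.T/Tmax.R would pair unrelated subjects.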
And, question 3: if the comparability thing also applies to the range, how to implement that? And question 4: sample size calculation is going to get messy for these products if we have to factor in comparability of Tmax at the 20% level. I am not outright saying I have a bad taste in my mouth, but I am leaning towards thinking this could easily translate into a complete showstopper for sponsors developing the products. What's your gut feeling? At the end of the day the answers to Q1–Q4 above hinge not only on what you think is the right thing to do; of equal importance is what you think regulators will accept.

— Pass or fail! ElMaestro 
Helmut ★★★ Vienna, Austria, 2022-04-09 18:40 (47 d 00:22 ago) @ ElMaestro Posting: # 22920 Views: 1,050 

Capt’n, my capt’n!

» » Comparable median (≤ 20% difference) and range for T_{max}
» Do you read this as: You need to be comparable (20% diff) for the median AND for the range? (i.e. is there also a 20% difference requirement for the range???)

Oh dear, I missed that! The range has a breakdown point of 0 – even if all values are identical except one of them, this single value will change the range. On the other hand, if you have two ‘contaminations’ on opposite sides, the range will be the same. A script for simulations according to my experiences with ibuprofen is at the end. It gives:
A goody: Replace in the script the lines
» » Calculating the ratio of values is a questionable procedure.
» Whether we like it or not, we have to find a way forward. And this has a lot of degrees of freedom.

I’ve read a lot in the meantime. Still not sure whether it is allowed at all (‼) to calculate a ratio of discrete values with potentially unequal intervals.

» In a nonparametric universe where we try to resolve it there could be all sorts of debate re. Hodges-Lehmann, Kruskal-Wallis, Wilcoxon test, confidence levels, bootstrapping, pairing and what not.

Didn’t have the stamina to figure out why you get so many warnings in your code. However, you are aware that nonparametrics gives the EMA an anaphylactic shock?

» Shoot me.

Later.

» How would you prefer to implement the comparability exercise for Tmax? (I am not so much interested in your thoughts on alpha/confidence level, exact T or F, etc. I am mainly interested in a way to make the comparison itself, so please make me happy and focus on that.)

I’m working on it.

» […] if the comparability thing also applies to range, how to implement that?

Sorry, I think that’s just bizarre. Honestly, despite your excellent exegesis, I guess (or rather hope?) that only the median is meant. If otherwise, wouldn’t the almighty oracle have written this: Comparable (≤ 20% difference) median and range for T_{max}.

» […] sample size calculation is going to get messy for these products, if we have to factor in comparability of Tmax at the 20% level. I am not outright saying I have a bad taste in my mouth, but I am leaning towards thinking this could easily translate into a complete showstopper for sponsors developing the products. What's your gut feeling?

From some preliminary simulations I guess that we would need somewhat tighter sampling intervals than usual in order to ‘catch’ t_{max} in any and every case.

» At the end of the day answers to Q1–Q4 above hinge not only on what you think is the right thing to do; of equal importance is what you think regulators will accept. 
Of course. If we went back to the 2001 Note for Guidance (and the current one of the WHO), with a nonparametric test and a prespecified acceptance range, everything would be much easier.

P.S.: I updated the article. It’s a work in progress. Perhaps you will come up with more questions.
— Diftor heh smusma 🖖 Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes 
ElMaestro ★★★ Denmark, 2022-04-09 21:47 (46 d 21:14 ago) @ Helmut Posting: # 22921 Views: 1,007 

Hi Hötzi,

should we submit a dataset to the EMA and suggest that they publish it along with a description (numerical example) of how exactly they wish to derive the decision? When the FDA indicated that they were going in the direction of in vitro popBE for inhalanda and nasal sprays, they published a dataset and showed exactly how to process the data to figure out the pass/fail criterion that satisfies the regulator. If the EMA would do the same here, we'd have all doubt eliminated. I think we need to know exactly:

1. Do we use nonparametrics or not?
2. Do we use logs or not?
3. Is the decision of 20% comparability based on a confidence interval or on something else?
3a. If there is a CI involved, is it a 90% or 95% CI or something else?
4. Are we primarily working on a ratio or on a difference?
5. Is the bootstrap involved?
6. How should we treat datasets from parallel trials, and how should we treat data from XO (i.e. how to handle considerations of paired and non-paired options)?

My gut feeling is that they want nonparametrics for the Tmax comparability part (yes, I am aware of the sentence). Actually, perhaps they just want the decision taken on the basis of the estimates of medians and ranges from min to max? If we submit a dataset, let us make sure we submit one with ties (the one I pasted above had none).

— Pass or fail! ElMaestro 
Ohlbe ★★★ France, 2022-04-11 11:29 (45 d 07:33 ago) @ ElMaestro Posting: # 22923 Views: 934 

Hi ElMaestro,

» My gut feeling is that they want nonparametrics for the Tmax comparability part (yes I am aware of the sentence).

My gut feeling is that all they expect to get are descriptive statistics: report the median Tmax for Test and the median Tmax for Reference, calculate a % difference (however inappropriate this may be), pass if it is not more than 20%, otherwise fail. Consequence: if you have more than a 20% difference between sampling times around the expected Tmax, you’re screwed if the median Tmax values differ even by just one sampling time, even if this has strictly no clinical relevance (this could be brought up in the comments to the draft guideline: come on guys, are you sure a Tmax of 10′ for one formulation and 15′ for the other is really something totally unacceptable? I mean, even for tadalafil you should be able to keep yourself busy until it works). Range: no expectation described. No idea. Of course I may be totally wrong.

— Regards Ohlbe 
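If that reading is right, the whole ‘assessment’ collapses to a few lines of arithmetic. A sketch of the naive descriptive rule (both medians are made-up values, purely for illustration):

```r
# Naive descriptive reading of the guidance: pass if the percent
# difference of the median Tmax values is at most 20%.
# Both medians are ASSUMED values, purely for illustration.
med.T    <- 1.25                              # h, hypothetical Test median
med.R    <- 1.50                              # h, hypothetical Reference median
pct.diff <- 100 * abs(med.T - med.R) / med.R  # reference-based % difference
pass     <- pct.diff <= 20
round(pct.diff, 2)                            # 16.67: these products would 'pass'
```

Even this trivial rule is ambiguous: dividing by med.R, by med.T, or by their mean gives different percentages, one more detail the draft leaves open.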
Helmut ★★★ Vienna, Austria, 2022-04-11 13:59 (45 d 05:03 ago) @ Ohlbe Posting: # 22925 Views: 914 

Hi Ohlbe, » My gut feeling is that all they expect to get are descriptive statistics: report median Tmax for Test, median Tmax for Reference, … So far so good. Standard for ages. » … calculate a % difference (however inappropriate this may be), … It is indeed. » … pass if it is not more than 20%, otherwise fail. That’s my understanding as well. » Consequence: if you have more than 20% difference between sampling times around the expected Tmax, you're screwed if median Tmax values are different even by just one sampling time, … Correct. IMHO, you need
» … even if this has strictly no clinical relevance (this could be brought up in the comments to the draft guideline: come on guys, are you sure a Tmax of 10′ for one formulation and 15′ for the other is really something totally unacceptable?

Exactly. Recall what the almighty oracle stated in the BEGL: […] if rapid release is claimed to be clinically relevant and of importance for onset of action or is related to adverse events, there should be no apparent difference in median t_{max} and its variability between test and reference product. It boils down to: is it clinically relevant? If not, a comparison is not required. Furthermore: PK ≠ PD.

» I mean, even for tadalafil you should be able to keep yourself busy until it works).

Tadalafil shows an effect before t_{max}. Not by chance: it’s common that the time point of E_{max} is < t_{max}. So what?

» Range: no expectation described. No idea.

The range is completely useless. Like the mean, it has a breakdown point of zero. Imagine with \(\small{n\rightarrow \infty}\): $$\small{\left\{R_1=1,\ldots, R_n=1\phantom{.25}\right\} \rightarrow \textrm{Range}(R)=0\phantom{.25}}\\\small{\left\{T_1=1,\ldots, T_n=1.25\right\} \rightarrow \textrm{Range}(T)=0.25}$$ Good luck in calculating a ratio.

» Of course I may be totally wrong.

So am I.

— Diftor heh smusma 🖖 Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes 
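The zero breakdown point of the range is easy to demonstrate numerically. A short R illustration of the statements in this thread: one outlier moves the range, two opposite-side ‘contaminations’ can give the very same range, and a ratio of ranges can be undefined:

```r
# Breakdown point of the range is zero: a single aberrant value moves it.
x <- rep(1, 24)                    # 24 identical observations
diff(range(x))                     # 0
x[24] <- 1.25                      # contaminate one single value
diff(range(x))                     # 0.25: the range jumps

# Two 'contaminations' on opposite sides can give the very same range:
a <- c(rep(1, 23), 1.25)           # one high outlier
b <- c(0.875, rep(1, 22), 1.125)   # two opposite-side outliers
diff(range(a)) == diff(range(b))   # TRUE: both ranges are 0.25

# And the ratio of ranges breaks down when R is degenerate:
rng.R <- diff(range(rep(1, 24)))           # 0
rng.T <- diff(range(c(rep(1, 23), 1.25)))  # 0.25
rng.T / rng.R                              # Inf
```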
Ohlbe ★★★ France, 2022-04-11 15:48 (45 d 03:14 ago) @ Helmut Posting: # 22926 Views: 871 

Hi Helmut, » » Range: no expectation described. No idea. » » The range is completely useless. Like the mean it has a breakdown point of zero. Imagine with \(\small{n\rightarrow \infty}\): $$\small{\left\{R_1=1,\ldots, R_n=1\phantom{.25}\right\} \rightarrow \textrm{Range}(R)=0\phantom{.25}}\\\small{\left\{T_1=1,\ldots, T_n=1.25\right\} \rightarrow \textrm{Range}(T)=0.25}$$ Good luck in calculating a ratio. I guess it depends what they mean by "range". Are they using the word in its statistical meaning (difference between the largest and smallest values) or in its lay language meaning (limits between which something varies) ? I suspect the latter: lowest and highest observed T_{max} values. — Regards Ohlbe 
Helmut ★★★ Vienna, Austria, 2022-04-11 16:07 (45 d 02:55 ago) @ Ohlbe Posting: # 22927 Views: 871 

Hi Ohlbe,

» I guess it depends what they mean by "range". Are they using the word in its statistical meaning (difference between the largest and smallest values) or in its lay language meaning (limits between which something varies)? I suspect the latter: lowest and highest observed T_{max} values.

You mean that we only have to report the minimum and maximum t_{max} of T and R? But how should we understand this: Comparable … The range is given in the section ‘Bioequivalence assessment’. We report other stuff as well (λ_{z}/t_{½}, AUC_{0–t}/AUC_{0–∞}, …). So why does it sit there?

— Diftor heh smusma 🖖 Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes 
Ohlbe ★★★ France, 2022-04-11 16:20 (45 d 02:42 ago) @ Helmut Posting: # 22928 Views: 871 

Hi Helmut, » But how should we understand this: Comparable Oh, the exact same way as before: nobody knows. — Regards Ohlbe 
Helmut ★★★ Vienna, Austria, 2022-04-11 13:03 (45 d 05:59 ago) @ ElMaestro Posting: # 22924 Views: 922 

Hi ElMaestro,

» should we submit a dataset to EMA and suggest them to publish it along with a description (numerical example) of how exactly they wish to derive the decision?

Maybe better not only a data set but also a proposed method for evaluation.

» When FDA indicated that they were going in the direction of in vitro popBE for inhalanda and nasal sprays they published a dataset and showed exactly how to process the data to figure out the pass / fail criterion that satisfies the regulator.

I have a counterexample. This goody was recommended in Apr 2014 and revised in Jan 2020. Nobody knew how the FDA arrived at the BE limits and why. Last month I reviewed a manuscript explaining the background. It made sense, but for years it was a mystery.

» If EMA would do the same here we'd have all doubt eliminated.

Utopia. Notoriously the EMA comes up with unsubstantiated ‘inventions’ and leaves it to us to figure out if and how they work. Examples? The rounded regulatory constant k = 0.760 and the upper cap at CV_{wR} = 50% in reference-scaling, the sequence (stage) term in TSDs, ‘substantial’ accumulation if AUC_{0–τ} > 90% of AUC_{0–∞}, the default cut-off time τ/2 for partial AUCs, C_{ss,min} for originators but C_{ss,τ} for generics, you name it.

» I think we need to know exactly:
» 1. Do we use nonparametrics or not?

Guess.

» 2. Do we use logs or not?

Logs? Possibly t_{max} follows a Poisson distribution.

» 3. Is the decision of 20% comparability based on a confidence interval or on something else?

Likely the former; made up out of thin air.

» 3a. If there is a CI involved, is it a 90% or 95% CI or something else?

90%.

» 4. Are we primarily working on ratio or on a difference?

IMHO, calculating ratios of discrete values with potentially unequal intervals is nonsense.

» 5. Is the bootstrap involved?

A possible approach, but why?

» 6. How should we treat datasets from parallel trials, and how should we treat data from XO (i.e. how to handle considerations of paired and nonpaired options)? 
I would suggest the Mann–Whitney U test (parallel) and the Wilcoxon signed-rank test (paired / crossover). Requires some tricks in case of tied observations (practically always) for the exact tests, e.g., the function wilcox_test() of the package coin instead of wilcox.test().

» My gut feeling is that they want nonparametrics for the Tmax comparability part (yes I am aware of the sentence).

I doubt it. Really.

» Actually, perhaps they just want the decision taken on basis of the estimates of medians and ranges from min to max?

I think so (see also Ohlbe’s post). However, that’s statistically questionable (politely speaking). See the updated article and clear the browser’s cached version.

» If we submit a dataset, let us make sure we submit one with ties (the one I pasted above had none).

It’s extremely unlikely that you will find one without… I explored one of my studies: ibuprofen 600 mg tablets, single dose, fasting, 2×2×2 crossover, 16 subjects (90% target power for C_{max}), sampling every 15 minutes till 2.5 hours. I resampled the reference’s t_{max} in 10^{4} simulations and applied the ±20% criterion:
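Going back to the two tests suggested at the top of this post, a minimal base-R sketch of both options (the data are made up, on a 15-minute grid so that ties abound; exact, tie-aware versions would use wilcox_test() / wilcoxsign_test() from the coin package instead):

```r
# Hypothetical tmax data (hours) on a 15-min grid: ties everywhere.
Tmax.T <- c(1.50, 1.75, 1.50, 2.00, 1.75, 1.50, 1.25, 1.75, 1.50, 2.00, 1.75, 1.50)
Tmax.R <- c(1.75, 1.75, 2.00, 2.00, 1.50, 1.75, 1.50, 2.00, 1.75, 2.25, 2.00, 1.75)

# Parallel design: Mann-Whitney / Wilcoxon rank-sum test, 90% CI for the
# T - R location shift (Hodges-Lehmann estimator). exact = FALSE because
# of the ties (normal approximation).
par.test <- wilcox.test(Tmax.T, Tmax.R, conf.int = TRUE,
                        conf.level = 0.90, exact = FALSE)

# Crossover: Wilcoxon signed-rank test on the within-subject differences
# (zero differences are dropped with a warning).
xo.test  <- wilcox.test(Tmax.T, Tmax.R, paired = TRUE, conf.int = TRUE,
                        conf.level = 0.90, exact = FALSE)

par.test$conf.int   # CI of the location shift, parallel reading
xo.test$conf.int    # CI of the location shift, paired reading
```

Either CI could then be held against a prespecified acceptance range on the difference scale, which sidesteps the questionable ratio of discrete values altogether.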
Bonus question: Which distribution? The generic tested in this study was approved 25 years ago and is still on the market. Any problems? If you want to give it a try:
P.S.: Amazing that this zombie rises from the grave. See this post and this thread of June 2013…

— Diftor heh smusma 🖖 Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes 
Helmut ★★★ Vienna, Austria, 2022-04-30 14:59 (26 d 04:03 ago) @ ElMaestro Posting: # 22945 Views: 451 

Hi ElMaestro,

» 2. Do we use logs or not?

As the ‘Two Lászlós’ – who else? – wrote*: The concentrations rise more steeply before the peak than they decline following the true maximum response. Consequently, it is more likely that large observed concentrations occur after than before the true peak time (\(\small{T_\textrm{max}^\circ}\)). Apart from this theoretical consideration, they demonstrated in simulations that the distribution of the observed t_{max} is not only biased but skewed to the right. I could confirm that in my studies. Given that, your idea of using logs in simulations (but not in the evaluation) is excellent! It turned out that a CV of 50% is not uncommon – even with a tight sampling schedule. I set up simulations. Say, we have a drug with an expected median t_{max} of R of 1.5 hours. If we aim to power the study at 90% for a moderate CV of C_{max} of 20%, we need 26 subjects. Let’s sample every 15 minutes and assume a shift in t_{max} for T of 15 minutes (earlier). Result of 10,000 simulations:
How many subjects would we need to preserve our target power?
Another issue is the assumed shift in location, which is nasty. I tried –20 minutes (instead of –15). I stopped the iterations at 500 (‼) subjects (power ≈ 82%). I’m not sure whether 90% is attainable even with a crazier sample size. Note that the x-axis is in log scale. On the other hand, the spread is less important, since we are assessing medians. In other words, extreme values are essentially ignored. Then, with the original shift of –15 minutes but a spread of ±60 minutes (instead of ±30), we already have ≈ 80% power with 36 subjects and ≈ 90% with 66. Is this really the intention? The wider the range in t_{max}, the more easily products will pass. Counterintuitive. Furthermore, as a consequence of the skewed distribution, the power curves are not symmetrical around zero – what the PKWP possibly (or naïvely?) assumed. It reminds me of the old days when the BE limits were 80–120% and maximum power was achieved at a T/R ratio of ≈ 0.98 (see there). Consequently, a ‘faster’ test will more likely pass than a ‘slower’ one. That’s not my understanding of equivalence. Guess the Type I Error.
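A stripped-down skeleton of this kind of simulation; every setting is an assumption taken from the text (log-normal t_{max}, median 1.5 h for R, T shifted 15 minutes earlier, CV 50%, 15-minute sampling grid, n = 26), and the criterion is the naive descriptive ±20% comparison of medians:

```r
set.seed(123456)
# snap simulated tmax to a 15-min sampling grid; nothing before the first sample
snap  <- function(x, step = 0.25) pmax(step, round(x / step) * step)
nsims <- 1e4
n     <- 26                        # subjects (assumed, powered for Cmax)
mu.R  <- 1.5                       # assumed median tmax of R (h)
shift <- -0.25                     # assumed shift of T: 15 min earlier
CV    <- 0.5                       # assumed CV of the log-normal tmax
sdlog <- sqrt(log(CV^2 + 1))
pass  <- logical(nsims)
for (i in seq_len(nsims)) {
  tmax.R  <- snap(rlnorm(n, meanlog = log(mu.R),         sdlog = sdlog))
  tmax.T  <- snap(rlnorm(n, meanlog = log(mu.R + shift), sdlog = sdlog))
  # naive criterion: medians 'comparable' within 20% of the reference
  pass[i] <- abs(median(tmax.T) - median(tmax.R)) / median(tmax.R) <= 0.20
}
mean(pass)                         # empirical chance of passing the criterion
```

Varying shift, sdlog and n in this skeleton lets one explore how the three interact; the exact power figures of course depend entirely on the assumptions.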
— Diftor heh smusma 🖖 _{} Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes 
ElMaestro ★★★ Denmark, 2022-04-30 19:10 (25 d 23:52 ago) @ Helmut Posting: # 22946 Views: 422 

Hi Hötzi, your post is potentially highly significant. » Is this really the intention? The wider the range in t_{max}, the more easily products will pass. Counterintuitive. The intention, as I understand it, was exactly the opposite. It goes completely against all intention, doesn't it? I think this knowledge, if it holds in confirmatory simulations, should be published quickly and made available to regulators. I very much hope that regulators will abstain completely from letting pride prevail over the regard for the EU patient. So I hope they will not dismiss the argumentation. For example, I could fear they would dismiss your findings because they don't have a palate for simulations (but note they like simulations well enough when it comes to f2; bootstrapping is a simulation, too). — Pass or fail! ElMaestro 
Helmut ★★★ Vienna, Austria, 2022-05-01 15:56 (25 d 03:06 ago) @ ElMaestro Posting: # 22947 Views: 381 

Hi ElMaestro,

» your post is potentially highly significant.

Potentially!

» » Is this really the intention? The wider the range in t_{max}, the more easily products will pass.
» The intention, as I understand it, was exactly the opposite. It goes completely against all intention, doesn't it?

Right, I don’t get it. I’m afraid this one joins the ‘methods’ of the PKWP made up out of thin air. EMEA. The European Medicines Evaluation Agency. It’s complicated. Like yesterday but this time no shift. That’s what I expect.

» I think this knowledge, if it holds in confirmatory simulations, should be published quickly and made available to regulators.

That’s wishful thinking, taking the review process into account. The deadline for comments is July 31^{st}.

» […] I could fear they would dismiss your findings because they don't have a palate for simulations (but note they like simulations well enough when it comes to f2; bootstrapping is a simulation, too).

Yep. ƒ_{2} is not a statistic (it depends on the number of samples and intervals). Given that, no closed form to estimate the location, its CI, power (and hence, the Type I Error) exists because the distribution is unknown. That’s similar to the situation we are facing here.

PS: You’ve got mail in your mailbox.

— Diftor heh smusma 🖖 Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes 
Helmut ★★★ Vienna, Austria, 2022-05-02 13:43 (24 d 05:19 ago) @ ElMaestro Posting: # 22950 Views: 337 

Hi ElMaestro,

» 2. Do we use logs or not?

To get an idea about the simulated distributions: a large sample size and sampling every ten minutes, shift of T –15 minutes, spread ±30 minutes. Why do we see a high density for T in the first interval? Try this one:
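The posted script is not reproduced here, but a minimal sketch of the discretization step being described – all settings assumed (log-normal t_{max}, 10-minute grid, T shifted 15 minutes earlier) – could look like:

```r
set.seed(42)
step   <- 1/6                      # 10-minute sampling grid (hours)
CV     <- 0.5                      # assumed CV of tmax
sdlog  <- sqrt(log(CV^2 + 1))
tmax.R <- rlnorm(1e5, meanlog = log(1.5), sdlog = sdlog)
tmax.T <- tmax.R - 0.25            # shift T 15 min earlier: some values become small or negative
# Snap to the grid; anything that would land at or below zero is forced
# into the first sampling interval, adding to the mass already there.
snap <- function(x, step) pmax(step, round(x / step) * step)
d.R  <- snap(tmax.R, step)
d.T  <- snap(tmax.T, step)
c(first.R = mean(d.R == step), first.T = mean(d.T == step))
```

The share of observations in the first interval is far larger for T than for R, which is exactly the pile-up the clamping produces.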
Since some shifts might be negative, I forced them into the first sampling interval – adding to the ones which are already there. Does that make sense?

— Diftor heh smusma 🖖 Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes 