Helmut (Vienna, Austria) – 2024-07-31 15:16 – Posting: # 24112

 ICH M13A: Changes to Step 2  [BE/BA News]

Dear all,

M13A Bioequivalence for Immediate-Release Solid Oral Dosage Forms

was adopted on 23 July 2024 and is thus in Step 4 (final). It was published today. The draft (Step 2 of 20 December 2022) is no longer linked on the ICH’s website but is – as of today – still available.

I suggest that we discuss what has changed (the supporting Q&A document might give hints why) and the impact on future studies.

I start with my favorite, the dreadful Group-by-Treatment interaction.
  • Draft (2.2.3.5 Multi-Group Design Studies, page 16)
    BE should be determined based on the overall treatment effect in the whole study population. In general, the assessment of BE in the whole study population should be done without including the Group by Treatment interaction term in the model, but applicants may also use other pre-specified models, as appropriate. However, the appropriateness of the statistical model should be evaluated to account for the multi-group nature of the BE study. Applicants should evaluate potential for heterogeneity of treatment effect across groups, i.e., Group by Treatment interaction. If the Group by Treatment interaction is significant, this should be reported and the root cause of the Group by Treatment interaction should be investigated to the extent possible. Substantial differences in the treatment effect for PK parameters across groups should be evaluated. Further analysis and interpretation may be warranted in case heterogeneity across groups is observed.
  • Final (page 13)
    BE should be determined based on the overall treatment effect in the whole study population. The statistical model should take into account the multi-group nature of the BE study, e.g., by using a model including terms for group, sequence, sequence × group, subject within sequence × group, period within group and formulation. The group × treatment interaction term should not be included in the model. However, applicants should evaluate potential for heterogeneity of treatment effect across groups and discuss its potential impact on the study data, e.g., by investigation of group × treatment interaction in a supportive analysis and calculation of descriptive statistics by group.
In the Questions and Answers we find:

In a single-site study, dosing subjects in groups may be unavoidable for logistic reasons. The following measures should be considered to minimize group effects:

  1. Start dosing all groups at the same clinic over a specific time span, e.g., within a few weeks.
  2. Follow the same protocol requirements and procedures for all groups, and recruit subjects from the same enrollment pool thereby achieving similar demographics among groups.
  3. Randomly assign subjects to group and treatment arm (or treatment sequence) at the study outset.
  4. Assign an equal sample size to each group when feasible, e.g., when healthy subjects are enrolled.


A clear improvement. However, in a meta-analysis of more than 320 studies we found an average loss of ≈6% power by using this model compared to the conventional one (without group terms).* In the meantime I have collected data from more studies (see this post).
Although I’m not happy with the last sentence of this section, we will have to live with it. I’m not sure what is meant by »calculation of descriptive statistics by group«. Geometric means of PK metrics (irrespective of treatment), separate for treatments, or point estimates by the conventional model?
Post hoc analyses regularly lead to endless – and quite often fruitless – discussions. Be prepared for them in ≈5% of your studies (i.e., at the level \(\small{\alpha}\) of the \(\small{G\times T}\)-test; see this article).
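For illustration, a minimal sketch in R of the models discussed above, with all effects treated as fixed; the data frame data with the factors group, sequence, subject, period, treatment and the log-transformed response logPK is hypothetical:

# model of the final guideline: group terms, but no group-by-treatment interaction
m.ich  <- lm(logPK ~ group + sequence + group:sequence +
                     subject %in% group:sequence +
                     period %in% group + treatment, data = data)
# conventional 2x2x2 model without group terms
m.conv <- lm(logPK ~ sequence + subject %in% sequence +
                     period + treatment, data = data)
# supportive analysis only: group-by-treatment interaction added
m.gxt  <- update(m.ich, . ~ . + group:treatment)
# 90% CI of the T/R-ratio from the primary model
round(100 * exp(confint(m.ich, "treatmentT", level = 0.90)), 2)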



Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮

Helmut (Vienna, Austria) – 2024-07-31 16:19 – @ Helmut – Posting: # 24115

 ICH M13A: Changes to Step 2

Dear all,

some more changes I think are relevant.
  • <nitpick>
    • PK parameters are estimated with a PK model. By a noncompartmental analysis (NCA) – as required in the guideline – we obtain PK metrics or characteristics.
    • »BE study« is a misnomer. Bioequivalence is the desired result of a comparative bioavailability study. Only Health Canada consistently uses ‘comparative BA’.
    • »Apparent terminal elimination rate constant kel« is sloppy terminology because it smells of PK modeling. In software for NCA (e.g., Phoenix WinNonlin, PKanalix, PKNCA (R)) λz is used.
  • </nitpick>
  • Draft (2.1.1 Study Population, page 3)
    If a drug product is intended for use in both sexes, it is recommended the study include male and female subjects.
  • Final (page 3)
    If a drug product is intended for use in both sexes, the inclusion of male and female subjects in the study should be considered.
That’s a more permissive wording and would help to deal with recruitment issues in some areas (e.g., the Middle East, India).
  • Draft (2.1.2 Study Design, page 4)
    A randomised, single-dose, two-period, two-sequence crossover study design is recommended…
  • Final (page 3)
    A randomised, single-dose, two-period, two-sequence crossover study design is recommended…
The correction is much appreciated. Another well-established design (i.e., a higher-order crossover with comparators from multiple regions) was already mentioned in the draft Section 2.2.5.1. Replicate designs for ABE in order to reduce the sample size are not mentioned in the GL. Reference-scaling for HVD(P)s and approaches for NTIDs are not covered in the guideline; details will be given in M13C.
  • Final (2.1.2 Study Design, page 3)
    In general, whether steady-state has been achieved is assessed by comparing at least three pre-dose concentrations for each formulation.
This sentence is new. See Section 2.2.2.2 why.
  • Draft (2.1.3 Sample Size for Bioequivalence Studies, page 4)
    The number of subjects […] should be based on an appropriate sample size calculation
  • Final (page 3)
    The number of subjects […] should be based on an appropriate sample size determination
I like the subtle difference. However, I hoped for more details for non-statisticians. The old Note for Guidance of the EMEA, as well as the current guidelines of the WHO and Health Canada, are more specific.
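Just as an aside, a minimal sketch of such a determination in R with the package PowerTOST; the CV and the T/R-ratio are assumptions one has to justify in the protocol:

library(PowerTOST)
# assumed within-subject CV 25%, assumed T/R-ratio 0.95, target power 80%,
# conventional 2x2x2 crossover, BE limits 80.00-125.00%
sampleN.TOST(CV = 0.25, theta0 = 0.95, theta1 = 0.80, theta2 = 1.25,
             targetpower = 0.80, design = "2x2x2")
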
  • Draft (Standardisation with regard to meals and water, page 7)
    For studies conducted under fasting conditions, subjects should be fasted for at least 10 hours before drug administration.
    […] the BE study conducted under fed conditions should employ a meal that has the potential to cause the greatest effect on GI physiology. The meal should be a […] high-calorie (approximately 800 to 1000 kcal) meal, which should derive approximately 150, 250, and 500–600 kcal from protein, carbohydrate, and fat, respectively.
  • Final (page 6–7)
    For studies conducted under fasting conditions, subjects should be fasted for at least 8 hours before drug administration.
    […] the BE study conducted under fed conditions should employ a meal that has the potential to cause the greatest effect on GI physiology. The meal should be a […] high-calorie (approximately 900 to 1000 kcal) meal, which should derive approximately 150, 250, and 500–600 kcal from protein, carbohydrate, and fat, respectively.
    […] Further, since drug absorption can be impacted by GI transit times and regional blood flows, posture and physical activity need to be standardised.
Less starving of volunteers.
I guess the 800 kcal in the draft were introduced by someone numerically handicapped.
Posture control has been applied for ages anyhow.
  • Draft (2.1.6 Dose or Strength to be Studied, page 9)
    For non-proportional increases in AUC and/or Cmax with increased dose there may be a difference between different strengths in the sensitivity to detect potential differences between formulations. To assess dose proportionality, the applicant should consider all available data regarding dose proportionality. Assessment of dose proportionality should consider single-dose studies only.
  • Final (page 7)
    To determine dose proportionality in PK, the applicant should refer to the approved drug product labelling for the comparator. If such information is lacking, the applicant should consider all available sources of data. Assessment of dose proportionality should generally consider single-dose studies and should consider Cmax and AUC as appropriate PK parameters for this purpose. In general, PK can be considered dose proportional if the difference in dose-adjusted mean Cmax and AUC is no more than 25% when comparing the range of strengths proposed. For the purpose of an additional strength waiver, AUC and Cmax are evaluated to demonstrate proportionality, however, should the available data establish dose proportional PK for AUC but the available data for Cmax are insufficient, e.g., due to variability, to make a conclusion, the PK can be treated as dose proportional. If data are not available to establish dose proportionality, then BE studies should be conducted with the lowest and highest strengths of the proposed series of strengths.
    For non-proportional increases in AUC and/or Cmax with increasing dose there may be a difference between strengths in the sensitivity to detect potential differences between formulations. To assess dose proportionality, the applicant should consider all available data regarding dose proportionality. Assessment of dose proportionality should consider single-dose studies only.
The wording of the new paragraph follows essentially the EMA’s GL.
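If I read the 25% criterion correctly, a naïve check could look like the sketch below (hypothetical means; the guideline does not state the reference of the percentage, so dose-adjusting relative to the lowest strength is my assumption):

dose     <- c(5, 20)                     # lowest and highest strength (mg)
Cmax     <- c(48, 210)                   # hypothetical mean Cmax (ng/mL)
Cmax.adj <- Cmax / dose                  # dose-adjusted means
100 * abs(diff(Cmax.adj)) / Cmax.adj[1]  # 9.375% -> not more than 25%
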
  • Draft (2.1.8 Sampling, page 10)
    The sampling schedule in a BE study should cover the concentration-time curve, including a pre-dose sample, samples in the absorption phase, frequent samples around the expected Tmax, and sufficient samples after Tmax to ensure a reliable estimate of the extent of exposure, which is achieved when AUC(0–t) covers at least 80% of AUC(0–inf). This period is usually at least three times the terminal half‐life of the drug, unless a suitable truncated AUC, e.g., AUC(0–72h), is used.
    The exact times at which the samples are taken should be recorded to obtain the elapsed time relative to drug administration and sampling should be spaced such that Cmax, AUC(0–t), and kel can be estimated accurately.
  • Final (2.1.8 Considerations for Sampling Schedule, page 8)
    The sampling schedule in a BE study should cover the concentration-time curve, including a pre-dose sample, samples in the absorption phase, frequent samples around the expected time to maximum observed concentration (tmax) and sufficient samples to ensure a reliable estimate of the extent of exposure, which is achieved when AUC(0–t) covers at least 80% of AUC(0–inf). The sampling period should generally be at least three times the terminal elimination half‐life of the drug, unless a suitable truncated AUC, i.e., AUC(0–72h), is used.
    The exact times at which the samples are taken should be recorded to obtain the elapsed time relative to drug administration and sampling should be spaced such that Cmax, AUC(0–t), and the apparent terminal elimination rate constant (kel) can be estimated accurately.
tmax instead of Tmax is appreciated (t is the SI symbol for time and T the one for absolute temperature).
The requirement of \(\small{AUC_{0-\text{t}}\ge 80\%\,AUC_{0-\infty}}\) appeared out of the blue in the APV guideline 37 (‼) years ago without any justification.1 Copy & paste in guidelines (EMA, WHO, Health Canada, ANVISA, Japan, and this one)? It was never required by the FDA.
This requirement is questionable because at \(\small{2-4\times t_\text{max}}\) absorption is practically complete3,4 (depending on the half-lives; e.g., at \(\small{2\times t_\text{max}}\): ≈97.5% absorbed, at \(\small{3\times t_\text{max}}\): ≈99.6%, and at \(\small{4\times t_\text{max}}\): ≈99.9%). After that we see only elimination (and distribution in a two-compartment model), which are drug-specific and thus simply not relevant for the comparison of formulations. It can be shown that the ≥80% requirement translates to \(\small{>4\times t_\text{max}\to\,>99.99\%}\), which is extremely conservative and, IMHO, not justified for IR products.
Example: absorption t½ 1 h, elimination t½ 4 h, sampling according to the guideline for four times the elimination half-life.

[image]

In a nutshell: a »reliable estimate of the extent of exposure« could readily be »ensured« even if sampling ended (much) earlier. For a given number of sampling time points it would be better to have more around tmax.
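A minimal sketch in R of the example (one-compartment model with first-order absorption; dose, bioavailability, and volume are arbitrary because they cancel in the ratios):

ka     <- log(2) / 1                   # absorption rate constant (t1/2 1 h)
ke     <- log(2) / 4                   # elimination rate constant (t1/2 4 h)
FDV    <- 100                          # f x D / V, arbitrary (cancels out)
C      <- function(t) FDV * ka / (ka - ke) * (exp(-ke * t) - exp(-ka * t))
tmax   <- log(ka / ke) / (ka - ke)     # 2.67 h
AUCinf <- FDV / ke                     # closed form
# fraction absorbed at 2, 3, and 4 times tmax (first-order input)
round(100 * (1 - exp(-ka * c(2, 3, 4) * tmax)), 1)  # 97.5, 99.6, 99.9 %
# percentage of AUCinf covered when sampling stops at time t
covered <- function(t) 100 * integrate(C, 0, t)$value / AUCinf
round(covered(4 * 4), 1)     # four elimination half-lives (16 h): ~91.7 %
round(covered(3 * tmax), 1)  # 8 h: only ~66.8 %, though absorption is ~99.6 % complete
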
  • Draft (2.1.8.1 First Point Cmax, page 11)
    The sampling schedule should include frequent sampling around the anticipated Tmax to provide a reliable estimate of Cmax. In particular, the occurrence of Cmax at the first post-dose sampling time point should be avoided by careful consideration of the known pharmacokinetic properties of the drug and selection of a suitable early sampling schedule. Datasets where Cmax occurs at the first post-dose sampling time may result in exclusion of the data from affected subjects from the analysis.
  • Final (page 8)
    The sampling schedule should include frequent sampling around the anticipated tmax to provide a reliable estimate of Cmax. In particular, the occurrence of Cmax at the first post-dose sampling time point should be avoided by careful consideration of the known PK properties of the drug and selection of a suitable early sampling schedule. For example, for drug products with rapid absorption, collection of blood samples at an early time point, between 5 and 15 minutes after dosing, followed by additional sample collections, e.g., two to five samples in the first hour after dosing, is usually sufficient to assess peak drug concentrations. When absorption is rapid, time points earlier than 5 minutes are generally not expected.
    For subjects where Cmax occurs at the first post-dose sampling time, the actual Cmax may have been missed as it could have occurred at an earlier time point. When this occurs, the robustness of the study results in relation to the potential missed Cmax should be discussed. This could include
    [an] additional analysis where data from the affected subjects are removed from the analysis.
Does that mean that the complete data set is primary and the one with the subject(s) excluded \(\small{\tfrac{\textsf{should}}{\textsf{could}}}\) be presented as a sensitivity analysis? What if the results contradict each other? I like discussions…
  • Draft (page 11), Final (page 9): 2.1.8.2 Long Half life Drugs and Truncated AUC Considerations
Unchanged, i.e., AUC(0–72h) instead of AUC(0–t) only for drugs with a half-life ≥ 24 hours. This is a step backwards from other guidelines (EMA, WHO), where AUC(0–72h) can be used independently of the half-life. The FDA, Health Canada, ANVISA, and Japan prevailed.
  • Draft (2.1.8.3 Early Exposure, page 11)
    For orally administered IR drug products, BE can generally be demonstrated by measurement of rate and extent of absorption, i.e., Cmax and AUC(0–t). However, in some situations, Cmax and AUC(0–t) may be insufficient to adequately assess the BE between two products, e.g., when the early onset of action is clinically relevant. In these cases, an additional PK parameter, such as area under the concentration vs. time curve between two specific time points (pAUC), may be applied. This pAUC is typically evaluated from the time of drug administration until a predetermined time-point that is related to a clinically relevant pharmacodynamic measure.
  • Final (page 9)
    For orally administered IR drug products, BE can generally be demonstrated by measurement of rate and extent of absorption, i.e., Cmax and AUC(0–t). However, in some situations, Cmax and AUC(0–t) may be insufficient to adequately assess the BE between two products, e.g., when the early onset of action is clinically relevant. In these cases, an additional PK parameter, such as area under the concentration vs. time curve between two specific time points (pAUC) or tmax, may be applied. In the case of pAUC, it is typically evaluated from the time of drug administration until a predetermined time point that is related to a clinically relevant pharmacodynamic measure.
Once implemented, the FDA will have to remove AUC(0–∞), which has been a primary PK metric since 1992. A step forward because
  • it is generally more variable than AUC(0–t) and
  • sometimes λz could not be reliably estimated in all subjects, negatively affecting power for this PK metric.
Partial AUC was a metric of early exposure for the FDA and Health Canada, although the methods of selecting the cut-off time differed (the median tmax of the reference vs. each subject’s tmax of the reference, respectively). Basing the cut-off on PD was introduced in the FDA’s NDA/IND guidance of 2022. However, if a clear PK/PD relationship is lacking, the selection of the cut-off time is challenging, to say the least.4
tmax was required, e.g., by the EMA and the WHO, and in Australia.
  • Draft (2.2.1 Considerations for the Bioequivalence Analysis Population, page 12)
    Any exclusions from the BE analysis population should be documented prior to bioanalytical analysis, e.g., subjects that are withdrawn from the study, have protocol violations, or experience GI disturbances potentially affecting absorption.
  • Final (page 9)
    Any exclusions from the BE analysis population, e.g., subjects that are withdrawn from the study, have protocol violations, or experience GI disturbances potentially affecting absorption, should be documented prior to bioanalytical analysis.
Same content, changed order. Fine with me.
  • Draft (page 12), Final (page 10): 2.2.1.1 Removal of Data Due to Low Exposure
    The exclusion of data for this reason will only be accepted in exceptional cases, in general with no more than 1 subject in each study, and may bring the reliability of dose administration into question.
Regrettably unchanged. Some comparators (e.g., of dasatinib) are lousy and regularly more than one subject shows low exposure. This is well known from the literature and should not call »the reliability of dose administration« into question. Limiting the exclusion to only one subject (irrespective of the sample size) is not helpful.
  • Draft (page 12), Final (page 10): 2.2.2.1 Concentration Time Data
    Two concentration-time graphs (linear and log-linear) should be provided for both the test and comparator products for each individual subject. In addition, two concentration-time graphs (linear and log-linear) should be provided for both the test and comparator products for the mean drug concentrations of all subjects.
Good luck with the log-linear graphs if (according to Section 2.2.2.2) »concentration reported as below the lower limit of quantification (LLOQ) should be treated as zero«. BTW, which mean? The arithmetic mean is nonsense for concentrations. It implies a normal distribution with a certain probability of negative values because the support of \(\small{\mathcal{N}(\mu;\sigma^2)}\) is \(\small{-\infty<x<+\infty}\), i.e., \(\small{x\in \mathbb{R}}\). I hope that the geometric mean is meant because concentrations follow a lognormal distribution with \(\small{x\in \mathbb{R}^{+}}\). PK software (e.g., Phoenix WinNonlin, PKanalix, PKNCA (R)) automatically excludes values flagged with a non-numeric code like ‘LLOQ’ but obviously cannot deal with \(\small{\log_e(0)}\).
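A small sketch of what I mean (hypothetical concentrations of one sampling time point; values below the LLOQ coded as NA and not as zero):

conc  <- c(1.84, 2.55, NA, 3.19, 1.07, NA, 2.71)  # NA = below the LLOQ
mean(conc, na.rm = TRUE)                          # arithmetic mean (questionable)
exp(mean(log(conc), na.rm = TRUE))                # geometric mean (lognormal data)
conc0 <- ifelse(is.na(conc), 0, conc)             # BLQs set to zero instead
exp(mean(log(conc0)))                             # log(0) = -Inf -> geometric mean is 0
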
  • Draft (2.2.2.2 Pharmacokinetic Analysis, page 13–14)
    For single-dose studies, the following PK parameters should be tabulated for each subject-formulation combination: 1) primary parameters for analysis: AUC(0–t), Cmax, and, where applicable, pAUC, and 2) additional parameters for analysis to assess the acceptability of the bioequivalence study: AUC(0–inf), AUC(0–t)/AUC(0–inf), Tmax, kel, and t1/2.
    Summary statistics to be reported include geometric mean, median, arithmetic mean, standard deviation, coefficient of variation, number of observations, minimum, and maximum. […] The non-compartmental methods used […] should be reported, e.g., linear trapezoidal method for AUC and the number of data points of the terminal log-linear phase used to estimate the terminal elimination rate constant (kel).
    For multiple-dose studies, applicants should document appropriate dosage administration and sampling to demonstrate the attainment of steady-state. For steady-state studies, the following PK parameters should be tabulated: 1) primary parameters for analysis: CmaxSS and AUC(0–tauSS), and 2) additional parameters for analysis: CtauSS, CminSS, CavSS, degree of fluctuation, swing, and Tmax.
    Any concentration reported as below the lower limit of quantification (LLOQ) should be treated as zero in PK parameter calculations. Values below the LLOQ are to be omitted from the calculation of kel and t1/2.
  • Final (page 11)
    For single-dose studies, the following PK parameters should be tabulated for each subject-formulation combination: 1) primary parameters for BE analysis: AUC(0–t), Cmax, and, where applicable, early exposure parameters (see Section 2.1.8.3), and 2) additional parameters for analysis to assess the acceptability of the bioequivalence study: AUC(0–inf), AUC(0–t)/AUC(0–inf), tmax, kel, and t1/2. For single-dose studies, AUC(0–t) should cover at least 80% of AUC(0–inf). If the AUC(0–t)/AUC(0–inf) percentage is less than 80% in more than 20% of the observations, then the validity of the study may need to be discussed in the submission. If the AUC is truncated at 72 hours for long half-life drugs, the primary AUC parameter for analysis is AUC(0–72h) and the following additional parameters are not required: AUC(0–inf), AUC(0–t)/AUC(0–inf), kel, and t1/2.
    Summary statistics to be reported include number of observations, geometric mean, coefficient of variation, median, arithmetic mean, standard deviation, minimum, and maximum. […] The non-compartmental methods used […] should be reported, e.g., linear trapezoidal method for AUC and the number of data points of the terminal log-linear phase used to estimate kel. For multiple-dose studies, applicants should document appropriate dosage administration and sampling to demonstrate the attainment of steady-state. For steady-state studies, the following PK parameters should be tabulated: 1) primary parameters for analysis: CmaxSS and AUC(0–tauSS), and 2) additional parameters for analysis: CtauSS, CminSS, CavSS, degree of fluctuation, swing, and tmax.
    [Last sentence unchanged]
For my thinking about \(\small{AUC_{0-\text{t}}\ge 80\%\,AUC_{0-\infty}}\) see above. Possibly we will see ‘regulatory creep’, i.e., »may need to be discussed« → »has to be discussed«.
The method of calculating \(\small{AUC_{0-\infty}}\) is nowhere given. Should it be the simple \(\small{AUC_{0-\text{t}}+C_\text{t}/\lambda_\text{z}}\) or can it be based on the estimated last concentration, i.e., \(\small{AUC_{0-\text{t}}+\widehat{C_\text{t}}/\lambda_\text{z}}\) – as recommended in the Canadian guidance, publications, and textbooks (see this article)? IMHO, it should be unambiguously stated in the protocol.
What’s the purpose of reporting the arithmetic mean for PK metrics (Cmax, AUC, pAUC) following a lognormal distribution?
At least the linear trapezoidal method is only given as an example. The linear-up logarithmic-down trapezoidal method is less biased, especially if there are deviations from the sampling schedule and/or concentrations are missing (see this article). IMHO, the method should not only be reported but already stated in the protocol. If in a subject tlast is not the same after all treatments, the T/R-ratio of AUC(0–t) will unavoidably be biased. Alas, an unbiased approach5 did not make it to the GL.
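A minimal sketch of both rules in R (hypothetical one-compartment profile as in the sampling example above; no checks for missing values):

auc.lin    <- function(t, C) {                 # linear trapezoidal throughout
  sum(diff(t) * (head(C, -1) + tail(C, -1)) / 2)
}
auc.linlog <- function(t, C) {                 # linear-up / logarithmic-down
  dt <- diff(t); C1 <- head(C, -1); C2 <- tail(C, -1)
  sum(ifelse(C2 >= C1 | C1 == 0 | C2 == 0,
             dt * (C1 + C2) / 2,               # ascending (or zero): linear
             dt * (C1 - C2) / log(C1 / C2)))   # descending: logarithmic
}
t <- c(0, 0.25, 0.5, 1, 1.5, 2, 3, 4, 6, 9, 12, 16)
C <- 100 * 4/3 * (exp(-log(2)/4 * t) - exp(-log(2) * t))
c(linear = auc.lin(t, C), lin.log = auc.linlog(t, C))
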
The swing \(\small{100\frac{C_\text{max}-C_\text{min}}{C_\text{min}}}\) is a terrible PK metric with extreme variability (especially in case of low accumulation).6 Granted, it only has to be reported. But for what purpose?
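A quick numerical illustration (hypothetical steady-state values): the lower the accumulation, the smaller Cmin and the more the swing explodes, whereas the fluctuation stays within reason:

Cmax        <- 100; Cav <- 40
Cmin        <- c(2, 10, 30)                  # low to high accumulation (hypothetical)
swing       <- 100 * (Cmax - Cmin) / Cmin    # 4900, 900, ~233 %
fluctuation <- 100 * (Cmax - Cmin) / Cav     #  245, 225,  175 %
round(rbind(Cmin, swing, fluctuation), 1)
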
»[…] applicants should […] demonstrate the attainment of steady-state.« Regrettably it is not stated how that should be done. For the problems see this article.
Concentrations < LLOQ → 0. I beg your pardon?

After a dose we know only one thing for sure: The concentration is not zero.7

»Values below the LLOQ are to be omitted from the calculation of kel and t1/2.« What else? Try a log-linear regression with a ‘zero concentration’. Good luck.
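To see the problem (hypothetical terminal-phase data, the last value ‘treated as zero’):

t <- c(8, 12, 16, 24)
C <- c(12.3, 6.1, 3.2, 0)     # last concentration below the LLOQ, set to zero
log(C)                        # last element is -Inf
try(lm(log(C) ~ t))           # fails (non-finite values in the response)
lm(log(C[C > 0]) ~ t[C > 0])  # what any sane software does: omit the value
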
  • Draft (2.2.3.1 General Considerations, page 15)
    The model to be used for the statistical analysis should be pre-specified in the study protocol. The statistical analysis should take into account sources of variation that can be reasonably assumed to have an effect on the response variable.
  • Final (page 12)
    [First two sentences unchanged]
    Post hoc and data-driven adjustments are not acceptable for the primary statistical analysis.
Is it reasonable to assume that multiple groups can affect the response variable? SCNR. See also the previous post.
  • Draft (2.2.3.2 Cross over Design Studies, page 15)
    Conventional two-treatment, two-period, two-sequence randomised crossover design studies should be analysed using an appropriate parametric method, e.g., ANOVA. The tables resulting from such analyses including the appropriate statistical tests of all effects in the model should be submitted, e.g., a summary of the testing of Sequence, Subject within Sequence, Period, and For­mu­la­tion effects should be presented.
  • Final (page 12)
    Randomised, non-replicate, crossover design studies should be analysed using an appropriate parametric method, e.g., general linear model (GLM) or mixed model.
    [Second sentence unchanged]
Why is »non-replicate« stated here again? Already given in the title of Section 2.2. The mixed model likely was a concession made to the FDA and Health Canada, the only agencies currently requiring it.
Testing for the effects is ridiculous. AFAIK, currently required only by Health Canada – including an ‘explanation’ of significant ones. The outcome of a comparative BA study is dichotomous. Either it passed (BE) or not… The sequence and formulation effects are not relevant and the period effects cancel out.
  • Draft (2.2.3.3 Carry over, page 15)
    If there are subjects for whom the pre-dose concentration is greater than 5% of the Cmax value for the subject in that period, then the pivotal statistical analysis should be performed excluding the data from that subject.
  • Final (page 12)
    In single-dose studies, if there are subjects for whom the pre-dose concentration is greater than 5% of the Cmax value for the subject in that period, then the primary statistical analysis should be performed excluding the data from that period, which may result in the exclusion of the subject as discussed in Section 2.2.3.2.
What is discussed in this respect in Section 2.2.3.2? Am I blind?
  • Draft (2.2.3.4 Parallel Design Studies, page 16)
    The statistical analysis for parallel design studies should reflect independent samples. Demographic characteristics or other relevant covariates known to affect the PK should be balanced across groups, to the extent possible. The use of stratification in the randomisation procedure based on a limited number of known relevant factors is therefore recommended. Those factors are also recommended to be accounted for in the pre-defined primary statistical analysis. Post hoc and data-driven adjustments are not acceptable for the primary statistical analysis.
  • Final (page 12–13)
    The statistical analysis for randomised, parallel design studies should reflect independent samples. Demographic characteristics or other relevant covariates known to affect the PK should be balanced across groups, to the extent possible. The use of stratification in the randomisation procedure based on a limited number of known relevant factors is therefore recommended. Those factors are also recommended to be accounted for in the primary statistical analysis. Post hoc and data-driven adjustments are not acceptable for the primary statistical analysis.
Since there is a »primary statistical analysis«, may I ask: What is the secondary one?
I miss a statement that equal variances must not be assumed (i.e., that the confidence interval has to be calculated by the Welch-test instead of by the t-test). In case of unequal variances and/or group sizes the latter is liberal (anticonservative).
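A minimal sketch in R of the difference (simulated log-transformed data; the smaller group has the larger variability, which is the constellation where the pooled-variance CI tends to be too narrow):

set.seed(123)
logT   <- rnorm(24, mean = log(0.95), sd = 0.40)  # test: smaller group, more variable
logR   <- rnorm(36, mean = 0,         sd = 0.25)  # reference
welch  <- t.test(logT, logR, var.equal = FALSE, conf.level = 0.90)$conf.int
pooled <- t.test(logT, logR, var.equal = TRUE,  conf.level = 0.90)$conf.int
round(100 * exp(rbind(Welch = welch, pooled = pooled)), 2)  # 90% CIs of the T/R-ratio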

To be continued… Feel free to chime in.


  1. Junginger H. Studies on Bioavailability and Bioequivalence – APV Guideline. Drugs Made in Germany. 1987; 30: 161–6.
  2. Midha KK, Hubbard JW, Rawson MJ. Retrospective evaluation of relative extent of absorption by the use of partial areas under plasma concentration versus time curves in bioequivalence studies on conventional release products. Eur J Pharm Sci. 1996; 4(6): 381–4. doi:10.1016/0928-0987(95)00166-2.
  3. Scheerans C, Derendorf H, Kloft C. Proposal for a Standardised Identification of the Mono-Exponential Terminal Phase for Orally Administered Drugs. Biopharm Drug Dispos. 2008; 29(3): 145–57. doi:10.1002/bdd.596.
  4. Yu LX, Li BV, editors. FDA Bioequivalence Standards. New York: Springer; 2014. ISBN 978-1-4939-1251-0. p. 16.
  5. Fisher D, Kramer W, Burmeister Getz E. Evaluation of a Scenario in Which Estimates of Bioequivalence Are Biased and a Proposed Solution: tlast (Common). J Clin Pharm. 2016; 56(7): 794–800. doi:10.1002/jcph.663.
  6. Endrényi L, Tóthfalusi L. Metrics for the Evaluation of Bioequivalence of Modified-Release Formulations. AAPS J. 2012; 14(2): 813–9. doi:10.1208/12248-012-9396-8.
  7. Boxenbaum H. at: AAPS, FDA, FIP, HPB, AOAC. Analytical Methods Validation: Bioavailability, Bioequivalence and Pharmacokinetic Studies. (Crystal City I). Arlington, VA. December 3–5, 1990.

Helmut Schütz

mittyri (Russia) – 2024-08-05 23:08 – @ Helmut – Posting: # 24136

 AUCres

Dear Helmut,

❝ – Final (page 11)

❝ For single-dose studies, the following PK parameters should be tabulated for each subject-formulation combination: 1) primary parameters for BE analysis: AUC(0–t), Cmax, and, where applicable, early exposure parameters (see Section 2.1.8.3), and 2) additional parameters for analysis to assess the acceptability of the bioequivalence study: AUC(0–inf), AUC(0–t)/AUC(0–inf), tmax, kel, and t1/2. For single-dose studies, AUC(0–t) should cover at least 80% of AUC(0–inf). If the AUC(0–t)/AUC(0–inf) percentage is less than 80% in more than 20% of the observations, then the validity of the study may need to be discussed in the submission.


The ICH group copied many things from the EMA BE guideline (and this is good). But what prompted them to replace the residual-area term with a ratio? Was there some kind of dissatisfaction with the 80% coverage criterion and the ambiguity surrounding the residual area? :-D

I have always wondered about this kind of discussion. One more topic for a sensitivity analysis? :cool:

Kind regards,
Mittyri
Helmut (Vienna, Austria) – 2024-08-06 00:13 – @ mittyri – Posting: # 24137

 ‘Percentage covered’

Hi Mittyri,

❝ The ICH group copied many things from the EMA BE guideline (and this is good).



❝ But what prompted them to replace the residual-area term with a ratio? Was there some kind of dissatisfaction with the 80% coverage criterion and the ambiguity surrounding the residual area? :-D

Not the slightest idea. I know a few members of the group and can ask. Answer not guaranteed. ;-)


Edit 1: Perhaps the members were not familiar with what is given in the output of PK software (Phoenix WinNonlin, PKanalix, PKNCA (R), ncappc (R), NonCompart (R), ncar (R), Pumas (Julia), …):

AUClast, AUCINF_obs, AUCINF_pred, AUC_%Extrap_obs, AUC_%Extrap_pred or similar


Since the ‘percentage covered’ (\(\small{100\times AUC_{{0-}\text{t}}/AUC_{0-\infty}}\)) is not part of the output, we have to set up any of these transformations (terminology used by the first three goodies above):
  • 100 * AUClast / AUCINF_obs
  • 100 * AUClast / AUCINF_pred
  • 100 - AUC_%Extrap_obs
  • 100 - AUC_%Extrap_pred
Until software is updated to automatically provide something like AUC_%Covered_obs and AUC_%Covered_pred we have to live with it.
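Of course, the ‘percentage covered’ can also be computed directly from the raw data; a minimal base-R sketch (hypothetical profile, plain linear trapezoidal rule, λz from the last three quantifiable concentrations purely as an example):

t <- c(0, 0.25, 0.5, 1, 1.5, 2, 3, 4, 6, 9, 12, 16, 24)
C <- c(0, 41, 66, 88, 93, 90, 75, 61, 39, 21, 12, 5.8, 1.4)    # hypothetical
AUClast     <- sum(diff(t) * (head(C, -1) + tail(C, -1)) / 2)  # linear trapezoidal
last3       <- tail(which(C > 0), 3)                           # points for lambda.z (example only)
fit         <- lm(log(C[last3]) ~ t[last3])
lambda.z    <- -coef(fit)[[2]]
Clast.obs   <- C[max(which(C > 0))]                            # observed Clast
Clast.pred  <- exp(unname(tail(fitted(fit), 1)))               # predicted Clast
AUCinf.obs  <- AUClast + Clast.obs  / lambda.z
AUCinf.pred <- AUClast + Clast.pred / lambda.z
round(c(covered.obs  = 100 * AUClast / AUCinf.obs,
        covered.pred = 100 * AUClast / AUCinf.pred), 2)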

An example of the setup in Phoenix WinNonlin (at the bottom the User Defined Parameters and on top the last lines of the Core output):
[image]
Output Data → Final Parameters Pivoted (last columns)
[image]
Table after some transformations and cosmetics:
[image]



Edit 2: I realized that the guideline does not require a breakdown by treatment but »… 20% of the observations …« Therefore:

[image]


Helmut Schütz

BEQool – 2024-09-09 07:48 – @ Helmut – Posting: # 24191

 ICH M13A: Changes to Step 2

Hello all,

❝   – Final (page 12)

❝     Randomised, non-replicate, crossover design studies should be analysed using an appropriate parametric method, e.g., general linear model (GLM) or mixed model.

❝     […] The mixed model likely was a concession made to the FDA and Health Canada, the only agencies currently requiring it.


So we can either put subject as a fixed or random effect?

But then the following text comes up:
"The primary statistical analyses should include all data for all subjects who provide evaluable data for both the test and comparator products."

And also in Q&A document:
In a 2-way crossover design, if data from one period are excluded, the subject should not be included in the statistical analysis […]

I am confused; if I understand correctly, we should all treat subject as a fixed effect? :confused:

Regards
BEQool
Helmut (Vienna, Austria) – 2024-09-09 09:04 – @ BEQool – Posting: # 24192

 fixed or mixed effects model

Hi BEQool,

❝ I am confused; if I understand correctly, we should all treat subject as a fixed effect? :confused:


In a 2×2×2 crossover with subject(s) missing in one period you get (apart from a slight difference in the nth figure, which is not relevant) the same result for a mixed effects model with all subject(s) and a fixed effects model excluding the subject(s) with missings.
If you exclude the subject(s) with missings, a mixed effects model will give you essentially the same result as a fixed effects model. Quite often the differences are at the numerical resolution of the software. Since the EMA has already implemented ICH M13A (effective 25 January 2025; see this post), you could use either.

However, I would still use a mixed effects model for the FDA and Health Canada and a fixed effects model for all other agencies. We don’t want to confuse assessors who are used to what they required for ages.
There is one advantage of a mixed effects model: apart from the within-subject CV you additionally get the between-subject CV. At least nice to know.

A simulation (mixed effects models in R are a beast). I give the point estimate and confidence interval in percent with five decimal places for comparison. If rounded to two decimal places, the differences for the balanced and imbalanced cases disappear (note that ICH M13A is silent about rounding).

CV           : 0.225
T/R-ratio    : 0.95
lower limit  : 0.8000
upper limit  : 1.2500
power        : at least 0.8
alpha        : 0.0500
n            : 24 (complete data, balanced sequences)
dropout(s)   :  1 (both periods)
missing(s)   :  1 (second period)

 data       package  function subject df   method        PE       lower    upper     width   
 balanced   stats    lm       fixed   22.0 Residual      93.71542 85.34248 102.90983 17.56735
 balanced   nlme     lme      random  22.0 Residual      93.71542 85.34247 102.90983 17.56736
 balanced   lmerTest lmer     random  22.0 Satterthwaite 93.71542 85.34248 102.90983 17.56735
 balanced   lmerTest lmer     random  22.0 Kenward-Roger 93.71542 85.34248 102.90983 17.56735
 imbalanced stats    lm       fixed   21.0 Residual      92.96790 84.36634 102.44643 18.08008
 imbalanced nlme     lme      random  21.0 Residual      92.96790 84.36634 102.44644 18.08010
 imbalanced lmerTest lmer     random  21.0 Satterthwaite 92.96790 84.36634 102.44643 18.08008
 imbalanced lmerTest lmer     random  21.0 Kenward-Roger 92.96790 84.36634 102.44643 18.08008
 incomplete stats    lm       fixed   20.0 Residual      92.19586 83.31751 102.02028 18.70277
 incomplete nlme     lme      random  20.0 Residual      92.03122 83.19633 101.80431 18.60798
 incomplete lmerTest lmer     random  20.2 Satterthwaite 92.03122 83.20136 101.79815 18.59679
 incomplete lmerTest lmer     random  20.1 Kenward-Roger 92.03122 83.19399 101.80718 18.61319

balanced sequences (nRT = nTR)
imbalanced sequences (one subject missing)
incomplete data (as above and the second period of the last subject missing)
df    : degrees of freedom =
        n – 2 for Residual, approximated by Satterthwaite and Kenward-Roger
method: df method applied
PE    : Point Estimate
lower : lower limit of the 90% CI
upper : upper limit of the 90% CI

  • For both balanced and imbalanced data the point estimate is identical regardless of the model applied.
  • For balanced data we get practically the same results for the fixed and mixed effects model (slight difference in the fifth decimal place of the lower confidence limit).
  • For imbalanced data, there are slight differences in the fifth decimal place of the upper confidence limit.
  • There are differences for incomplete data (period missing) – not only for the CI but also the PE. However, in such a case the subject should be excluded according to M13A and its Q&A anyway.
For SASians and users of Statistica: Residual (in R and Phoenix/WinNonlin) is equivalent to DDFM = CONTAIN and containment, respectively.


Lengthy R script for simulating a 2×2×2 crossover. Defaults: alpha 0.05 (90% CI), BE limits 0.80–1.25, T/R-ratio 0.95, power ≥ 0.8. Optional specification of period and/or carryover effects.

library(PowerTOST)                  # for sample size estimation
library(nlme)                       # for mixed effects model by lme()
suppressMessages(library(lmerTest)) # requires also the packages lme4 and Matrix
                                    # for mixed effects model by lmer() of lme4,
                                    # Satterthwaite and Kenward-Roger df

sim.study <- function(alpha = 0.05, theta0, theta1, theta2, targetpower, CV, n, CVb,
                      per.effect = c(0, 0), carryover = c(0, 0), setseed = TRUE) {
  # simulate subject data
  if (setseed) set.seed(1234569)
  if (missing(theta0))                   theta0      <- 0.95
  if (missing(theta1) & missing(theta2)) theta1      <- 0.80
  if (missing(theta1))                   theta1      <- 1 / theta2
  if (missing(theta2))                   theta2      <- 1 / theta1
  if (missing(targetpower))              targetpower <- 0.80
  if (missing(CV)) stop("CV must be given.")
  if (missing(n)) {
    n <- sampleN.TOST(alpha = alpha, CV = CV, theta0 = theta0, theta1 = theta1,
                      theta2 = theta2, targetpower = targetpower,
                      print = FALSE)[["Sample size"]]
  }
  if (missing(CVb)) CVb <- CV * 1.5 # arbitrary default (between-subject CV, cosmetics)
  sd      <- CV2se(CV)
  sd.b    <- CV2se(CVb)
  # within subjects
  T       <- rnorm(n = n, mean = log(theta0), sd = sd)
  R       <- rnorm(n = n, mean = 0, sd = sd)
  # between subjects
  between <- rnorm(n = n, mean = 0, sd = sd.b)
  T       <- T + between
  R       <- R + between
  data    <- data.frame(subject   = rep(1:n, each = 2), period = 1:2L,
                        sequence  = c(rep("RT", n), rep("TR", n)),
                        treatment = c(rep(c("R", "T"), n / 2),
                                      rep(c("T", "R"), n / 2)), logPK = NA_real_)
  subj.T  <- subj.R <- 0L   # subject counters
  for (j in 1:nrow(data)) { # clumsy but transparent
    if (data$treatment[j] == "T") {
      subj.T <- subj.T + 1L
      if (data$period[j] == 1L) {
        data$logPK[j] <- T[subj.T] + per.effect[1]  
      } else {
        data$logPK[j] <- T[subj.T] + per.effect[2] + carryover[1]
      }
    } else {
      subj.R <- subj.R + 1L
      if (data$period[j] == 1L) {
        data$logPK[j] <- R[subj.R] + per.effect[1]
      } else {
        data$logPK[j] <- R[subj.R] + per.effect[2] + carryover[2]
      }
    }
  }
  facs       <- c("subject", "period", "sequence", "treatment")
  data[facs] <- lapply(data[facs], factor)
  return(data)
} # eof sim.study()

fixed.mixed <- function(data) {
  # analyse data by fixed and three mixed effects models
  od  <- options(digits = 12)                                    # save previous and set
  oc  <- options(contrasts = c("contr.treatment", "contr.poly"))
  on.exit(options(od), add = TRUE)                               # restore on exit
  on.exit(options(oc), add = TRUE)
  res <- data.frame(package = c("stats", "nlme", rep("lmerTest", 2)),
                    fun = c("lm", "lme", rep("lmer", 2)),
                    subjects = c("fixed", rep("random", 3)), df = NA_real_,
                    method = c(rep("Residual", 2), "Satterthwaite", "Kenward-Roger"),
                    PE = NA_real_, lower = NA_real_, upper = NA_real_)
  model <- c("lm", "lme", "lmer.Satt", "lmer.kr") # for switching
  for (j in 1:4L) {
    switch (j,
      lm = {
        # all effects fixed: evaluation by lm() of package stats / ANOVA
        # bogus nested model acc. to the GL

        m  <- lm(logPK ~ sequence + subject %in% sequence +
                         period + treatment, na.action = na.omit, data = data)
        PE <- 100 * exp(coef(m)[["treatmentT"]])
        ci <- 100 * exp(confint(m, "treatmentT", level = 1 - 2 * alpha))
        df <- anova(m)["Residuals", "Df"]
        res[j, c(4, 6:8)] <- c(df, PE, ci)
      },
      lme = {
        # subjects random: evaluation by lme() of package nlme
        m  <- lme(logPK ~ sequence + period + treatment,
                          random = ~ 1 | subject, na.action = na.omit, data = data)
        s  <- summary(m)
        PE <- 100 * exp(s$tTable["treatmentT", "Value"])
        ci <- 100 * exp(intervals(m, which = "fixed",
                                  level= 1 - 2 * alpha)[[1]]["treatmentT", c(1, 3)])
        df <- s$tTable["treatmentT", "DF"]
        res[j, c(4, 6:8)] <- c(df, PE, ci)
      },
      lmer.Satt = {
        # subjects random: evaluation by lmer() of package lme4
        # Satterthwaite df by package lmerTest

        m  <- lmer(logPK ~ sequence + period + treatment +
                           (1 | subject), na.action = na.omit, data = data)
        s  <- summary(m, ddf = "Satterthwaite")
        PE <- s$coefficients["treatmentT", "Estimate"]
        ci <- 100 * exp(PE + c(-1, +1) *
                        qt(1 - alpha, s$coef["treatmentT", "df"]) *
                        s$coef["treatmentT", "Std. Error"])
        PE <- 100 * exp(PE)
        df <- s$coefficients["treatmentT", "df"]
        res[j, c(4, 6:8)] <- c(df, PE, ci)
      },
      lmer.kr = {
        # subjects random: evaluation by lmer() of package lme4
        # Kenward-Roger df by package lmerTest

        m  <- lmer(logPK ~ sequence + period + treatment +
                           (1 | subject),
                           na.action = na.omit, data = data)
        s  <- summary(m, ddf = "Kenward-Roger")
        PE <- s$coefficients["treatmentT", "Estimate"]
        ci <- 100 * exp(PE + c(-1, +1) *
                        qt(1 - alpha, s$coef["treatmentT", "df"]) *
                        s$coef["treatmentT", "Std. Error"])
        PE <- 100 * exp(PE)
        df <- s$coefficients["treatmentT", "df"]
        res[j, c(4, 6:8)] <- c(df, PE, ci)
      }
    )
  }
  return(res)
} # eof fixed.mixed()

####################
# simulation setup #
####################

CV         <- 0.225
theta0     <- 0.95
theta1     <- 0.80
theta2     <- 1.25
target     <- 0.80
alpha      <- 0.05
per.effect <- 0
carryover  <- c(0, 0)
dropouts   <- 1
missings   <- 1
n          <- sampleN.TOST(alpha = alpha, CV = CV, theta0 = theta0, theta1 = theta1,
                           theta2 = theta2,  targetpower = target,
                           print = FALSE)[["Sample size"]]
per.effect <- c(0, per.effect)
# simulate balanced (nRT = nTR) and complete (nR = nT) data
data0  <- sim.study(alpha = alpha, CV = CV, theta0 = theta0, theta1 = theta1,
                    targetpower = target, n = n, per.effect = per.effect,
                    carryover = carryover)
tmp0   <- fixed.mixed(data0)
# remove dropout(s): gives imbalanced sequences
remove <- tail(unique(data0$subject), dropouts)
if (length(remove) >= 1) {
  data1 <- data0[!data0$subject %in% remove, ]
} else {
  data1 <- data0
}
tmp1   <- fixed.mixed(data1)
# set logPK in 2nd period of last subject(s) to NA: gives incomplete data
data2  <- data1
remove <- tail(unique(data2$subject), missings)
if (length(remove) >= 1) {
  data2$logPK[data2$subject %in% remove & data2$period == 2] <- NA
}
tmp2   <- fixed.mixed(data2)
# aggregate results
res    <- data.frame(data = c(rep("balanced", 4),
                              rep("imbalanced", 4),
                              rep("incomplete", 4)),
                     package = "", FUN = "", subject = "", df = NA_real_,
                     method = "", PE = NA_real_, lower = NA_real_,
                     upper = NA_real_, width = NA_real_)
res[1:4,  2:9] <- tmp0 # balanced,   complete
res[5:8,  2:9] <- tmp1 # imbalanced, complete
res[9:12, 2:9] <- tmp2 # imbalanced, incomplete
res[, 5]       <- round(res[, 5], 1)
res[, 10]      <- res[, 9] - res[, 8]
res[, 7:10]    <- round(res[, 7:10], 5)
names(res)[3]  <- "function"
t1             <- paste("CV           :", sprintf("%.4g", CV),
                        "\nT/R-ratio    :", sprintf("%.4g", theta0),
                        "\nlower limit  :", sprintf("%.4f", theta1),
                        "\nupper limit  :", sprintf("%.4f", theta2),
                        "\npower        : at least", sprintf("%.4g", target),
                        "\nalpha        :", sprintf("%.4f", alpha),
                        "\nn            :", n, "(complete data, balanced sequences)")
if (dropouts > 0)
  t1 <- paste(t1, "\ndropout(s)   :", sprintf("%2i", dropouts), "(both periods)")
if (missings > 0)
  t1 <- paste(t1, "\nmissing(s)   :", sprintf("%2i", missings), "(second period)")
if (!per.effect[2] == 0)
  t1 <- paste(t1, "\nperiod effect:", per.effect[2])
if (!carryover[1] == 0 & !carryover[2] == 0)
  t1 <- paste(t1, "\ncarryover    :", paste(carryover, collapse = ", "))
t1             <- paste(t1, "\n\n")
t2             <- paste("\nbalanced sequences (nRT = nTR)",
                        "\nimbalanced sequences (one subject missing)",
                        "\nincomplete data (as above and the second period",
                        "of the last subject missing)",
                        "\ndf    : degrees of freedom =",
                        "\n        n – 2 for Residual,",
                        "approximated by Satterthwaite and Kenward-Roger",
                        "\nmethod: df method applied",
                        "\nPE    : Point Estimate",
                        "\nlower :",
                        sprintf("lower limit of the %.4g%% CI", 100*(1-2*alpha)),
                        "\nupper :",
                        sprintf("upper limit of the %.4g%% CI\n", 100*(1-2*alpha)))
cat(t1); print(res, row.names = FALSE, right = FALSE); cat(t2)


Helmut Schütz

BEQool – 2024-09-09 12:18 – @ Helmut – Posting: # 24193

 fixed or mixed effects model

Hello Helmut,

thank you for a well-explained answer with specific examples.

❝ In a 2×2×2 crossover with subject(s) missing one period you get (apart from a slight difference in the nth figure, which is not relevant) the same result for a mixed effects model with all subject(s) and a fixed effects model excluding the subject(s) with missings.

Doesn't a model with subject as a fixed effect automatically exclude subjects with missing periods (in a 2×2×2 design) from the analysis? Or am I wrong? :-D
If I am not wrong and understand correctly, based on your examples with incomplete data (periods missing), the difference between subject as a fixed effect and subject as a random effect is more than slight (it shows up already in the first decimal place)?

❝ However, I would still use a mixed effects model for the FDA and Health Canada and a fixed effects model for all other agencies. We don’t want to confuse assessors who are used to what they required for ages.

I agree

❝ There are differences for incomplete data (period missing) – not only for the CI but also the PE. However, in such a case the subject should be excluded according to M13A and its Q&A anyway.


Yes, I was most interested in these incomplete data. So basically, if you have an incomplete data set (just periods missing), you should exclude the subject from the analysis and then the result (PE and 90% CI) would be almost the same regardless of whether subject is a fixed or a random effect? Furthermore, this result would again be almost the same as if we didn't exclude the subject from the analysis and just used subject as a fixed effect?
Helmut (Vienna, Austria) – 2024-09-09 13:13 – @ BEQool – Posting: # 24194

 fixed or mixed effects model

Hi BEQool,

❝ thank you for a well-explained answer with specific examples.


Welcome. No big deal because I had the simulation-script already.

❝ Doesn't a model with subject as a fixed effect automatically exclude subjects with missing periods (in a 2×2×2 design) from the analysis? Or am I wrong? :-D


Of course, you are right.

❝ If I am not wrong and understand correctly, based on your examples with incomplete data (periods missing), the difference between subject as a fixed effect and subject as a random effect is more than slight (it shows up already in the first decimal place)?


Yes. That’s why I wrote:

❝ ❝ There are differences for incomplete data (period missing) – not only for the CI but also the PE.

Note also that the width of the CI from the mixed effects models is generally (a little bit) narrower than the one from the fixed effects model because some information is recovered. Peanuts.

❝ Yes, I was most interested in these incomplete data. So basically, if you have an incomplete data set (just periods missing), you should exclude the subject from the analysis and then the result (PE and 90% CI) would be almost the same regardless of whether subject is a fixed or a random effect? Furthermore, this result would again be almost the same as if we didn't exclude the subject from the analysis and just used subject as a fixed effect?


Yep.


P.S.: I modified the script in such a way that you can specify the number of dropouts and subjects with missings. Try this:

CV         <- 0.15
theta0     <- 0.975
target     <- 0.90
alpha      <- 0.05
per.effect <- 0
carryover  <- c(0, 0)
dropouts   <- 2
missings   <- 1

Interesting, isn’t it?

Helmut Schütz

BEQool – 2024-09-10 09:35 – @ Helmut – Posting: # 24196

 fixed or mixed effects model

Hello Helmut,

thank you for the answers.
I just wanted to point out that, if I understand correctly, under ICH M13A basically nothing changes (PE and 90% CI) compared to the EMA guideline on how to perform the analysis when periods are missing (because subject is a fixed factor). Compared to the FDA guidance (subject as a random factor), however, the PE and 90% CI change a little, because now subjects with missing periods have to be excluded from the analysis (and according to the FDA guidance you shouldn't exclude them)?

❝ P.S.: I modified the script in such a way that you can specify the number of dropouts and subjects with missings. Try this:

   CV         <- 0.15
   theta0     <- 0.975
   target     <- 0.90
   alpha      <- 0.05
   per.effect <- 0
   carryover  <- c(0, 0)
   dropouts   <- 2
   missings   <- 1

❝ Interesting, isn’t it?


Thanks, interesting indeed :ok:

Regards
BEQool
Helmut (Vienna, Austria) – 2024-09-10 10:12 – @ Helmut – Posting: # 24197

 fixed or mixed effects model

Hi BEQool,

partly updated script (change below # aggregate results). Then:

dropouts <- 0
missings <- 0

CV           : 0.225
T/R-ratio    : 0.95
lower limit  : 0.8000
upper limit  : 1.2500
power        : at least 0.8
alpha        : 0.0500
n            : 24

 data     package  function subject df method        PE       lower    upper     width   
 balanced stats    lm       fixed   22 Residual      93.71542 85.34248 102.90983 17.56735
 balanced nlme     lme      random  22 Residual      93.71542 85.34247 102.90983 17.56736
 balanced lmerTest lmer     random  22 Satterthwaite 93.71542 85.34248 102.90983 17.56735
 balanced lmerTest lmer     random  22 Kenward-Roger 93.71542 85.34248 102.90983 17.56735

df    : degrees of freedom =
        n – 2 for Residual, approximated by Satterthwaite and Kenward-Roger
method: df method applied
PE    : Point Estimate
lower : lower limit of the 90% CI
upper : upper limit of the 90% CI


dropouts <- 1
missings <- 0

CV           : 0.225
T/R-ratio    : 0.95
lower limit  : 0.8000
upper limit  : 1.2500
power        : at least 0.8
alpha        : 0.0500
n            : 24
dropout(s)   :  1 (both periods)

 data       package  function subject df method        PE      lower    upper     width   
 imbalanced stats    lm       fixed   21 Residual      92.9679 84.36634 102.44643 18.08008
 imbalanced nlme     lme      random  21 Residual      92.9679 84.36634 102.44644 18.08010
 imbalanced lmerTest lmer     random  21 Satterthwaite 92.9679 84.36634 102.44643 18.08008
 imbalanced lmerTest lmer     random  21 Kenward-Roger 92.9679 84.36634 102.44643 18.08008

df    : degrees of freedom =
        n – 2 for Residual, approximated by Satterthwaite and Kenward-Roger
method: df method applied
PE    : Point Estimate
lower : lower limit of the 90% CI
upper : upper limit of the 90% CI


dropouts <- 1
missings <- 1

CV           : 0.225
T/R-ratio    : 0.95
lower limit  : 0.8000
upper limit  : 1.2500
power        : at least 0.8
alpha        : 0.0500
n            : 24
dropout(s)   :  1 (both periods)
missing(s)   :  1 (second period)

 data           package  function subject df   method        PE       lower    upper     width   
 imbal, incompl stats    lm       fixed   20.0 Residual      92.19586 83.31751 102.02028 18.70277
 imbal, incompl nlme     lme      random  20.0 Residual      92.03122 83.19633 101.80431 18.60798
 imbal, incompl lmerTest lmer     random  20.2 Satterthwaite 92.03122 83.20136 101.79815 18.59679
 imbal, incompl lmerTest lmer     random  20.1 Kenward-Roger 92.03122 83.19399 101.80718 18.61319

df    : degrees of freedom =
        n – 2 for Residual, approximated by Satterthwaite and Kenward-Roger
method: df method applied
PE    : Point Estimate
lower : lower limit of the 90% CI
upper : upper limit of the 90% CI




if (dropouts == 0 & missings == 0) {
  res <- data.frame(data = rep("balanced", 4),
                    package = "", FUN = "", subject = "", df = NA_real_,
                    method = "", PE = NA_real_, lower = NA_real_,
                    upper = NA_real_, width = NA_real_)
  res[1:4,  2:9] <- tmp0
}
if (dropouts > 0 & missings == 0) {
  res <- data.frame(data = rep("imbalanced", 4),
                    package = "", FUN = "", subject = "", df = NA_real_,
                    method = "", PE = NA_real_, lower = NA_real_,
                    upper = NA_real_, width = NA_real_)
  res[1:4,  2:9] <- tmp1
}
if (dropouts > 0 & missings > 0) {
  res <- data.frame(data = rep("imbal, incompl", 4),
                    package = "", FUN = "", subject = "", df = NA_real_,
                    method = "", PE = NA_real_, lower = NA_real_,
                    upper = NA_real_, width = NA_real_)
  res[1:4,  2:9] <- tmp2
}
# TODO: dropouts == 0 & missings > 0
res[, 5]       <- round(res[, 5], 1)
res[, 10]      <- res[, 9] - res[, 8]  # CI width = upper - lower
res[, 7:10]    <- round(res[, 7:10], 5)
names(res)[3]  <- "function"
t1             <- paste("CV           :", sprintf("%.4g", CV),
                        "\nT/R-ratio    :", sprintf("%.4g", theta0),
                        "\nlower limit  :", sprintf("%.4f", theta1),
                        "\nupper limit  :", sprintf("%.4f", theta2),
                        "\npower        : at least", sprintf("%.4g", target),
                        "\nalpha        :", sprintf("%.4f", alpha),
                        "\nn            :", n)
if (dropouts > 0)
  t1 <- paste(t1, "\ndropout(s)   :", sprintf("%2i", dropouts), "(both periods)")
if (missings > 0)
  t1 <- paste(t1, "\nmissing(s)   :", sprintf("%2i", missings), "(second period)")
if (!per.effect[2] == 0)
  t1 <- paste(t1, "\nperiod effect:", per.effect[2])
if (!carryover[1] == 0 & !carryover[2] == 0)
  t1 <- paste(t1, "\ncarryover    :", paste(carryover, collapse = ", "))
t1             <- paste(t1, "\n\n")
t2             <- paste("\ndf    : degrees of freedom =",
                        "\n        n – 2 for Residual,",
                        "approximated by Satterthwaite and Kenward-Roger",
                        "\nmethod: df method applied",
                        "\nPE    : Point Estimate",
                        "\nlower :",
                        sprintf("lower limit of the %.4g%% CI", 100*(1-2*alpha)),
                        "\nupper :",
                        sprintf("upper limit of the %.4g%% CI\n", 100*(1-2*alpha)))
cat(t1); print(res, row.names = FALSE, right = FALSE); cat(t2)


Helmut
★★★
avatar
Homepage
Vienna, Austria,
2024-08-05 14:53
(99 d 18:26 ago)

@ Helmut
Posting: # 24132
Views: 3,823
 

 ICH M13A: Changes to Step 2

Dear all,

the story continues.
  • Draft (2.2.4 Bioequivalence Criteria, page 17)
    For the majority of drug products, the PK parameters to demonstrate BE include Cmax and AUC(0–t).
    For drugs with a long elimination half-life, AUC(0–72h) may be used in place of AUC(0–t) (see Section 2.1.8.2). For drugs where it is clinically relevant to assess the early exposure or early onset of action, an additional PK parameter, pAUC, may be used to establish BE (see Section 2.1.8.3).
    The 90% confidence interval for the geometric mean ratio of these PK parameters used to establish BE should lie within a range of 80.00 – 125.00%.
  • Final (page 13)
    For the majority of drug products, the PK parameters to demonstrate BE include Cmax and AUC(0–t) in single-dose studies and CmaxSS and AUC(0–tauSS) in multiple-dose studies.
    For drugs with a long elimination half-life, AUC(0–72h) may be used as AUC(0–t) (see Section 2.1.8.2).
    The 90% confidence interval for the geometric mean ratio of these PK parameters used to establish BE should lie within a range of 80.00 – 125.00%.
    For drugs where it is clinically relevant to assess the early exposure or early onset of action, an additional PK parameter should be used to establish BE (see Section 2.1.8.3).
Reordered and expanded for multiple-dose studies.
Why the odd phrase »AUC(0–72h) may be used as AUC(0–t)«? Why not keep the wording of the draft, or simply state “AUC(0–72h) may be used instead of AUC(0–t)”?
Contrary to the guidances of the FDA and Health Canada, nothing is stated about rounding of the confidence interval. Based on a clinically relevant difference \(\small{\Delta=20\%}\) we get the BE limits in the multiplicative model as $$\small{\left\{\theta_1,\theta_2\right\}=\left\{100\left(1-\Delta\right),100\left(1-\Delta\right)^{-1}\right\}=\left\{80\%,125\%\right\}}$$ These limits are exact, i.e., a study with a CI of 79.995 – 125.005% would fail. Since the limits are stated as 80.00 – 125.00%, does that imply that we have to round the CI to two decimal places? Then the same study would pass* and – slightly – inflate the Type I Error.
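As a quick illustration, a minimal R sketch (the CI is the hypothetical one from the example above; the rounding question itself is explored in the footnote code below):

Delta  <- 0.20                     # clinically relevant difference
theta1 <- 100 * (1 - Delta)        # lower BE limit:  80
theta2 <- 100 / (1 - Delta)        # upper BE limit: 125
CI     <- c(79.995, 125.005)       # hypothetical 90% CI
CI[1] >= theta1 & CI[2] <= theta2  # FALSE: fails against the exact (unrounded) limits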
  • Draft (2.2.5.1 Multiple Comparator Products, page 17)
    It may be necessary to demonstrate BE between a test product and multiple comparator products to meet requirements from multiple jurisdictions. In such case, including comparator products from different regions in one trial is acceptable to streamline the BE demonstration by conducting one single higher-order crossover BE study with multiple comparator products.
    Although there are multiple comparator products tested, multiplicity correction, i.e., alpha adjustment, is not needed because comparator products are considered independent and region-specific. Decisions will be made independently about a test product relative to a single comparator product within a single jurisdiction. It is preferred for the statistical analysis to only test two at a time and not all at once, making pairwise comparison within the analysis.
  • Final (page 13–14)
    It may be necessary to demonstrate BE between a test product and multiple comparator products to meet requirements from multiple jurisdictions. Including comparator products from different regions in one trial is acceptable to streamline the BE demonstration by conducting one single higher-order crossover BE study with multiple comparator products.
    In studies with multiple comparator products, multiplicity correction, i.e., alpha adjustment, is not needed because comparator products are considered independent and region-specific. Decisions will be made independently about a test product relative to a single comparator product within a single jurisdiction. It is preferred for the statistical analysis to only test two at a time and not all at once, making pairwise comparison within the analysis.
Why on earth was the important last sentence removed‽ Without it, I expect that applicants will use the crappy ‘All at Once’ approach (i.e., analysis of pooled data).1,2 I got many studies with the wrong method on my desk, although the EMA stated already in its 2010 guideline:

In studies with more than two treatment arms (e.g. a three period study including two references, one from EU and another from USA […]), the analysis for each comparison should be conducted excluding the data from the treatments that are not relevant for the comparison in question.

Very similar recently the FDA:3,4

In BE studies with more than two reference treatment arms (e.g., a three-period study including two references, one from the European Union (EU) and another from the United States […]), the BE determination should be based on the comparison between the relevant test and reference products, using only the data from those products. The BE analysis for this comparison should be conducted excluding the data from the non-relevant treatment(s) — for example, in a BE study with a T product, an EU reference product, and a U.S. reference product, the comparison of the T product to the U.S. reference product should be based on an analysis excluding the data from the EU reference.

For details see this article, a presentation by David Brown (MHRA),5 and the Q&A.6
  • Draft (3.1 Endogenous Compounds, page 17–18)
    Alternatively, the need for baseline correction may be avoided by enrolling study subjects with low or no production of the endogenous compounds.
  • Final (page 14–15)
    When considered necessary to ensure adequate separation of treatment-induced concentrations over baseline, a high dose may be administered in BE studies of endogenous compounds if the dose is well tolerated and dose proportionality in PK is maintained. Alternatively, the need for baseline correction may be avoided by enrolling study subjects with low or no production of the endogenous compounds.
Only slight rewording in all but the last paragraph.
  • Draft (3.2.3 Oral Suspensions, page 20)
    For new intended label use/instructions (not included in the comparator product labelling), the test product should be administered according to its intended labelling and compared with the comparator product administered as per its labelling.
  • Final (page 16)
    For new intended label use/instructions, e.g., oral suspensions as an extension to another orally administered IR drug product, BE studies may be conducted to determine whether the oral suspension is BE to the comparator product. In this scenario, the oral suspension product should be administered according to its intended labelling and compared with the comparator product administered as per its labelling.
To be continued.


  • By correct (IEEE 754) rounding. Phoenix WinNonlin and bloody Excel use commercial rounding, which is wrong, i.e. 125.005 → 125.01.
    c.round <- function(x, y) { # commercial rounding (wrong)
      sign(x) * trunc(abs(x) * 10^y + 0.5) / 10^y
    }
    limits <- c(80, 125)
    exact  <- c(79.995, 125.005)
    t      <- sprintf("exact CI: %.3f – %.3f\n", exact[1], exact[2])
    HC     <- sprintf("%.1f", exact)
    HC.c   <- sprintf("%.1f", c.round(exact, 1))
    FDA    <- sprintf("%.2f", exact)
    FDA.c  <- sprintf("%.2f", c.round(exact, 2))
    res    <- data.frame(rounding = c("none", rep(c("IEEE 754", "commercial"), 2)),
                         result = c("", rep("HC", 2), rep("FDA", 2)),
                         L = c(sprintf("%.3f", exact[1]),
                               HC[1], HC.c[1], FDA[1], FDA.c[1]),
                         U = c(sprintf("%.3f", exact[2]),
                               HC[2], HC.c[2], FDA[2], FDA.c[2]),
                         BE = "fail")
    for (j in 1:nrow(res)) {
      if (as.numeric(res$L[j]) >= limits[1] &
          as.numeric(res$U[j]) <= limits[2]) res$BE[j] <- "pass"
    }
    cat(t); print(res, row.names = FALSE, right = FALSE)
    exact CI: 79.995 – 125.005
     rounding   result L      U       BE 
     none              79.995 125.005 fail
     IEEE 754   HC     80.0   125.0   pass
     commercial HC     80.0   125.0   pass
     IEEE 754   FDA    80.00  125.00  pass
     commercial FDA    80.00  125.01  fail

  1. Schuirmann DJ. Two at a Time? Or All at Once? International Biometric Society, Eastern North American Region, Spring Meeting. Pittsburgh, PA. March 28–31, 2004. Abstract.
  2. D’Angelo P. Testing for Bioequivalence in Higher‐Order Crossover Designs: Two‐at‐a‐Time Principle Versus Pooled ANOVA. 2nd Conference of the Global Bioequivalence Harmonisation Initiative. Rockville, MD. 15–16 September, 2016.
  3. FDA (CDER). Draft Guidance for Industry. Statistical Approaches to Establishing Bioequivalence. Silver Spring. December 2022. Revision 1. Download.
  4. FDA (CDER, OGD). Navigating the First ICH Generic Drug Draft Guideline “M13A Bioequivalence for Immediate-Release Solid Oral Dosage Forms”. SBIA Webinar. May 2, 2023. Online.
  5. Brown D. Presentation at the 3rd EGA Symposium on Bioequivalence. London. June 2010. Slides.
  6. European Generic Medicines Association. Revised EMA Bioequivalence Guideline. Questions & Answers. Brussels. 2010. Online.

mittyri
★★  

Russia,
2024-08-05 22:42
(99 d 10:37 ago)

@ Helmut
Posting: # 24135
Views: 3,732
 

 period within group and formulation

Dear Helmut,

❝ BE should be determined based on the overall treatment effect in the whole study population. The statistical model should take into account the multi-group nature of the BE study, e.g., by using a model including terms for group, sequence, sequence × group, subject within sequence × group, period within group and formulation. The group × treatment interaction term should not be included in the model. However, applicants should evaluate potential for heterogeneity of treatment effect across groups and discuss its potential impact on the study data, e.g., by investigation of group × treatment interaction in a supportive analysis and calculation of descriptive statistics by group.


"France lost the battle but she has not lost the war!"
I hope that we can see other improvements in the future.

By the way
  • do you know the reason for including period within group and formulation? As far as I remember, Model I and Model II included only a Period(Group) factor.
  • What kind of descriptive statistics is expected? Everything above the BE section, stratified by group? What are the consequences of this evaluation of heterogeneity? Is it possible that an assessor will be unhappy with the statistics of one group compared to the others? Since no tests are suggested (eyeball comparison of descriptive statistics by group?), any evidence provided post hoc is a weak argument.

Kind regards,
Mittyri
Helmut
★★★
avatar
Homepage
Vienna, Austria,
2024-08-06 00:29
(99 d 08:50 ago)

@ mittyri
Posting: # 24138
Views: 3,696
 

 period within group and formulation

Hi Mittyri,

❝ ❝ The statistical model should take into account the multi-group nature of the BE study, e.g., by using a model including terms for group, sequence, sequence × group, subject within sequence × group, period within group and formulation. The group × treatment interaction term should not be included in the model. However, applicants should evaluate potential for heterogeneity of treatment effect across groups and discuss its potential impact on the study data, e.g., by investigation of group × treatment interaction in a supportive analysis and calculation of descriptive statistics by group.


❝ I hope that we can see other improvements in the future.


Unlikely. ICH guidelines regularly remain without update for 20+ years (e.g., E3 of 1995, E8 of 1997, E9 of 1998).

❝ – do you know the reason for including period within group and formulation? As far as I remember, Model I and Model II included only a Period(Group) factor.


Models as stated by the FDA and used by the usual suspects (see this post):
  1. Group, Sequence, Treatment, Subject(Group × Sequence), Period(Group), Group × Sequence, Group × Treatment
  2. Group, Sequence, Treatment, Subject(Group × Sequence), Period(Group), Group × Sequence
Model II is the one in the guideline; only the order of effects differs, which doesn’t matter. Model I should not be used.

I don’t understand what is meant by »investigation of group × treatment interaction in a supportive analysis«. Assess BE by Model II and then test the \(\small{G\times T}\) interaction by Model I? We will find a significant interaction in ≈ 5% of studies (i.e., at the level \(\small{\alpha}\) of the test). Lengthy and fruitless discussions are to be expected.
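For what it’s worth, a minimal R sketch of such a two-step approach (BE by Model II, the \(\small{G\times T}\) interaction by Model I as a supportive analysis only), assuming a data frame d with factors group, sequence, period, treatment, subject (uniquely coded) and the log-transformed PK response lPK – all names hypothetical:

# Model II (all effects fixed). With uniquely coded subjects the between-
# subject terms (group, sequence, group x sequence) are confounded with
# 'subject'; lm() flags the redundant dummies as NA, which is harmless for
# the treatment estimate. 'period %in% group' expands to period within group.
m2 <- lm(lPK ~ group + sequence + group:sequence + subject +
               period %in% group + treatment, data = d)
# 90% CI of T vs. R, back-transformed to the original scale
# (assuming 'treatment' has levels "R" and "T" with "R" as reference)
round(100 * exp(confint(m2, "treatmentT", level = 0.90)), 2)
# Model I = Model II + group x treatment, fitted for the supportive analysis
m1 <- update(m2, . ~ . + group:treatment)
anova(m2, m1)   # F-test of the G x T interaction

With complete data an all-fixed-effects fit and a mixed model give identical results anyway; only with missing periods do they start to diverge (cf. the comparison of stats::lm, nlme::lme, and lmerTest::lmer earlier in this thread).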

❝ – What kind of descriptive statistics is expected? Everything above the BE section, stratified by group?


I think so.

❝ What are the consequences of this evaluation of heterogeneity? Is it possible that an assessor will be unhappy with the statistics of one group compared to the others? Since no tests are suggested (eyeball comparison of descriptive statistics by group?), …


Even if an assessor calculated the confidence intervals of the groups separately, they would likely overlap due to the limited per-group sample sizes. So what?
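If one wanted to look at group-wise intervals anyway, a minimal sketch under the same hypothetical assumptions as above, using a simple crossover model per group (for a replicate design the model would of course differ):

# 90% CIs by group - expect wide, overlapping intervals due to the small
# per-group sample sizes ('subject' absorbs the sequence effect)
by(d, d$group, function(x) {
  x <- droplevels(x)   # drop unused factor levels within the group
  m <- lm(lPK ~ subject + period + treatment, data = x)
  round(100 * exp(confint(m, "treatmentT", level = 0.90)), 2)
})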

A recent example (4-period full replicate design):

[image]
The study passed BE with Model II, though it was a close shave (PE: 116.50%, 90% CI: 109.32–124.16%)…



❝ … any evidence provided post hoc is a weak argument


That would not even be acceptable: Section 2.2.3.1 (page 11)
  • The statistical analysis should take into account sources of variation that can be reasonably assumed to have an effect on the response variable. Post hoc and data-driven adjustments are not acceptable for the primary statistical analysis.
As already in the draft (page 16).

Helmut
★★★
avatar
Homepage
Vienna, Austria,
2024-08-08 13:57
(96 d 19:22 ago)

@ Helmut
Posting: # 24142
Views: 3,414
 

 ICH M13A: Step 4 → 5

Dear all,

although the ICH’s website states that M13A is in Step 5 (Implementation), this is not exactly correct. The guideline is in Step 4 (Final) and its implementation will take some time and differ between regions. There is no deadline and no obligation for an agency to implement it at all.

As an example, the status of M10 Bioanalytical Method Validation and Study Sample Analysis (Step 4 on 24 May 2022) as of today:
$$\small{\begin{array}{llcll}
\phantom{000000}\textsf{Agency} & \phantom{00000}\textsf{Region} & \textsf{Step} & \phantom{0000000}\textsf{Status}& \phantom{000000i}\Delta_\text{t}\\\hline
\textbf{ANVISA} & \text{Brazil} & (\color{Red}4\color{Black}\to \color{Green}5\color{Black}) & \text{Implementation process} & \phantom{0000000}-\\
\textbf{COFEPRIS} & \text{Mexico} & (\color{Red}4\color{Black}\to \color{Green}5\color{Black}) &\text{Not yet implemented} & \phantom{0000000}-\\
\textbf{EC} & \text{Europe} & \color{Green}5 &\text{21 January 2023} & \text{8 months, 29 days}\\
\textbf{EDA} & \text{Egypt} & (\color{Red}4\color{Black}\to \color{Green}5\color{Black}) &\text{Implementation process} & \phantom{0000000}-\\
\textbf{FDA} & \text{United States} & \color{Green}5 &\text{7 November 2022} & \text{6 months, 15 days}\\
\textbf{HSA} & \text{Singapore} & (\color{Red}4\color{Black}\to \color{Green}5\color{Black}) & \text{Implementation process} & \phantom{0000000}-\\
\textbf{HC} & \text{Canada} & \color{Green}5 & \text{20 January 2023} & \text{8 months, 28 days}\\
\textbf{MFDS} & \text{Republic of Korea} & \color{Green}5 &\text{26 October 2023} & \text{1 year, 6 months, 3 days}\\
\textbf{MHLW/PMDA} & \text{Japan} & (\color{Red}4\color{Black}\to \color{Green}5\color{Black}) & \text{Not yet implemented} & \phantom{0000000}-\\
\textbf{MHRA} & \text{UK} & (\color{Red}4\color{Black}\to \color{Green}5\color{Black}) & \text{Implementation process} & \phantom{0000000}-\\
\textbf{NMPA} & \text{China} & \color{Green}5 & \text{29 July 2023} & \text{1 year, 3 months, 3 days}\\
\textbf{SFDA} & \text{Saudi Arabia} & (\color{Red}4\color{Black}\to \color{Green}5\color{Black}) & \text{Not yet implemented} & \phantom{0000000}-\\
\textbf{Swissmedic} & \text{Switzerland} & \color{Green}5 & \text{25 May 2022} & \text{1 month, 1 day}\\
\textbf{TFDA} & \text{Chinese Taipei} & \color{Green}5 & \text{30 May 2023} & \text{1 year, 1 month, 5 days}\\
\end{array}}$$ In seven of the 14 regions the guideline is implemented, in four the implementation process is ongoing, and in three it has not even started – after more than two years. It might even be that Mexico, Japan, and Saudi Arabia decided against implementation…

If you have nothing better to do, check the status of M9 Biopharmaceutics Classification System-based Biowaivers (Step 4 on 20 November 2019). :-D

Helmut
★★★
avatar
Homepage
Vienna, Austria,
2024-08-09 11:45
(95 d 21:35 ago)

@ Helmut
Posting: # 24145
Views: 3,289
 

 Formal ICH Procedure

Dear all,

what I wrote above is wrong. According to the ICH:

[image]

Step 4: Adoption of an ICH Harmonised Guideline
Step 4 is reached when the Assembly agrees that there is sufficient consensus on the draft Guideline.
The Step 4 Final Document is adopted by the ICH Regulatory Members of the ICH Assembly as an ICH Harmonised Guideline at Step 4 of the ICH process.

Step 5: Implementation
Having reached Step 4, the harmonised Guideline moves immediately to the final step of the process that is the regulatory implementation. This step is carried out according to the same national/regional procedures that apply to other regional regulatory guidelines and requirements, in the ICH regions.
(My emphasis)


Although a guideline is in implementation, it is not necessarily implemented in all regions.

Even in regions where a guideline has been implemented, it might lead to trouble. If one uses the link to the EMA’s BMV guideline of 2011, it states unambiguously that it has been superseded by the ICH’s M10. However, the FDA’s guidance of 2018 is still accessible, and the guidance of 2022 implementing M10 is located somewhere else. That’s a trap.


Edit: M13A was adopted by the EMA’s CHMP on 25 July 2024 and will be effective with 25 January 2025 (see this post for details).

Helmut
★★★
avatar
Homepage
Vienna, Austria,
2024-09-06 10:04
(67 d 23:15 ago)

@ Helmut
Posting: # 24188
Views: 2,235
 

 ICH M13A: Changes to Step 2

Dear all,

a tricky part.
  • Draft (4. DOCUMENTATION, page 27)
    Module 2.7.1 of the CTD should list all relevant BE studies conducted regardless of the study outcome. Full study reports should be provided for the BE study(ies) upon which the applicant relies for approval. For all other studies, synopses of the study reports (in accordance with ICH E3) are sufficient. However, complete study reports for these studies should be available upon request.
  • Final (page 22)
    Unchanged.
In the Questions and Answers (page 16) we find:
  • Question
    If the relevant BE studies conducted with the same formulation under the same study conditions result in different BE outcomes, what action should be taken?
  • Answer
    M13A recommends that all relevant BE studies conducted, regardless of the study outcome, should be provided. If, for a particular formulation at a particular strength, multiple pivotal studies result in inconsistent BE conclusions, the totality of the evidence should be considered. The applicant should discuss the results and justify the BE claim. When relevant, a combined analysis of all studies may be considered as a sensitivity analysis in addition to the individual study analyses. It is not acceptable, however, to pool studies which fail to demonstrate BE without a study that passes.
    If there are differences in the study conditions, e.g., sampling times, fasting or fed conditions, or method of administration, pooling is not justifiable. A different number of subjects is not considered a difference in study conditions.
The first paragraph is essentially similar (pun!) to the EMA’s guideline (4.1.8 Evaluation – Presentation of data, page 16):
    If for a particular formulation at a particular strength multiple studies have been performed some of which demonstrate bioequivalence and some of which do not, the body of evidence must be considered as a whole. Only relevant studies […] need be considered. The existence of a study which demonstrates bioequivalence does not mean that those which do not can be ignored. The applicant should thoroughly discuss the results and justify the claim that bioequivalence has been demonstrated. Alternatively, when relevant, a combined analysis of all studies can be provided in addition to the individual study analyses. It is not acceptable to pool together studies which fail to demonstrate bioequivalence in the absence of a study that does.
What might be considered »relevant« according to the Q&A to justify pooling of studies? See also this post for problems arising if one reports a confidence interval.
