## WHO frequent deficiencies in protocols and reports [Regulatives / Guidelines]

Dear Ohlbe,

» The WHO has published a 7-page document listing frequent deficiencies in BE protocols, reports and practices.

»

» Helmut will be happy with at least one recommendation: no need to calculate and report post-hoc power.

Oh yes, indeed. More in detail (my comments in blue):

**METHOD OF ADMINISTRATION**

- The test and comparator products should be administered under the usual conditions of use. Therefore, it is not acceptable to administer the products under yellow monochromatic light to avoid degradation; normal lighting conditions should be employed.

That’s extremely funny. Even if the drug is susceptible to *hν*-induced degradation (say, nitrendipine), what’s the penetration depth into the formulation? A few microns? What fraction of the total API is contained in this volume, and what’s the time of exposure? Less than a minute?

I have seen such procedures in protocols, though they demonstrate not only a complete absence of scientific thinking but also a lack of common sense.

**SAMPLING TIMES**

- Sample collection after 72 hours is not necessary.

Hmm, for IR yes, but for controlled release?

- The wash out period should not be excessively large compared with 5 times the largest expected half-life.

Why not? In crossover studies, subjects have to be in the same *physiological* state in later periods as in the first. Some drugs are auto-inducers or -inhibitors. The t½ of the drug after a single dose tells only half of the story. OK, what is meant by “excessively large”? Maybe the long washout was intentional and for a reason.
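To put the arithmetic behind the “5 times the half-life” rule in numbers, here is a minimal sketch. It assumes simple first-order elimination with a constant half-life, which is exactly the assumption that fails for auto-inducers or -inhibitors. The function name and the 12 h half-life are hypothetical.

```python
def residual_fraction(washout_h: float, half_life_h: float) -> float:
    """Fraction of drug still in the body after a washout,
    assuming first-order elimination with a constant half-life."""
    return 0.5 ** (washout_h / half_life_h)


half_life_h = 12.0  # hypothetical half-life in hours
for multiple in (5, 7, 10):
    washout_h = multiple * half_life_h
    pct = 100 * residual_fraction(washout_h, half_life_h)
    print(f"washout of {multiple} half-lives ({washout_h:.0f} h): {pct:.3f}% remaining")
```

The calculation covers pharmacokinetic carry-over only. It says nothing about whether the subjects’ physiological state has returned to baseline.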

**PHARMACOKINETIC ANALYSIS**

- The protocol should indicate the software to be used for pharmacokinetic calculations as well as the trapezoidal method employed for AUC calculation.

Thank you very much! It’s high time to abandon the linear trapezoidal rule.

- Any below LLOQ value(s), including those between two valid concentration values, should be reported as zero.

What the heck?

- For drugs with long half-life, AUC truncated at 72 h could be used. However, in the event of a missing sample at 72 hours, that profile should be excluded. In the case of a 2×2 design, this implies the exclusion of all AUC data for that subject.

That’s bad because information is lost. What about using the *estimated* concentration at 72 hours (*i.e.*, pAUC_{0–72}, available in software) or AUC_{0–t,common}?
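The two points above (linear versus lin-up/log-down trapezoids, and estimating the concentration at 72 hours instead of throwing the profile away) can be sketched as follows. This is an illustration under stated assumptions, not what any particular software does. The function names, the choice of the last three points for the terminal slope, and the concentration profile are all hypothetical.

```python
import math


def auc_linear(t, c):
    """Linear trapezoidal rule over all segments."""
    return sum((t[i + 1] - t[i]) * (c[i] + c[i + 1]) / 2 for i in range(len(t) - 1))


def auc_lin_up_log_down(t, c):
    """Linear trapezoids while rising or flat, logarithmic trapezoids while declining."""
    total = 0.0
    for i in range(len(t) - 1):
        dt, c1, c2 = t[i + 1] - t[i], c[i], c[i + 1]
        if 0 < c2 < c1:
            total += dt * (c1 - c2) / math.log(c1 / c2)
        else:
            # rising, flat, or a zero value (log undefined): fall back to linear
            total += dt * (c1 + c2) / 2
    return total


def lambda_z(t, c, n_points=3):
    """Terminal rate constant from an unweighted log-linear fit of the last n_points."""
    ts = t[-n_points:]
    ys = [math.log(x) for x in c[-n_points:]]
    t_mean = sum(ts) / n_points
    y_mean = sum(ys) / n_points
    sxy = sum((a - t_mean) * (b - y_mean) for a, b in zip(ts, ys))
    sxx = sum((a - t_mean) ** 2 for a in ts)
    return -sxy / sxx


def pauc_extrapolated(t, c, t_end, n_points=3):
    """AUC from 0 to t_end when the last sample was taken before t_end.
    The concentration at t_end is estimated from the terminal slope."""
    lz = lambda_z(t, c, n_points)
    c_end = c[-1] * math.exp(-lz * (t_end - t[-1]))
    return auc_lin_up_log_down(t, c) + (c[-1] - c_end) / lz


# Hypothetical profile with the 72 h sample missing (last sample at 48 h)
times = [0, 1, 2, 4, 8, 12, 24, 48]
concs = [0.0, 60.0] + [100 * math.exp(-0.05 * (x - 2)) for x in times[2:]]

print("linear AUC(0-48):", auc_linear(times, concs))
print("lin-up/log-down AUC(0-48):", auc_lin_up_log_down(times, concs))
print("pAUC(0-72), extrapolated:", pauc_extrapolated(times, concs, 72))
```

On a declining exponential the linear rule overestimates the area, which is why the choice of method belongs in the protocol. The fallback to linear trapezoids for a zero value also shows one practical consequence of reporting embedded BLQs as zero.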

**SAMPLE SIZE CALCULATION**

**ESTIMATION**, if you don’t mind.

- The sample size calculation for a replicate design with widening of the acceptance range should be calculated as described by Tothfalusi and Endrenyi in “Sample Sizes for Designing Bioequivalence Studies for Highly Variable Drugs”. J Pharm Pharmaceut Sci 15(1) 73-84, 2012.

What else?

The conventional methods for replicate designs do not take into account the impact of acceptance range widening on the sample size calculation.

Oops, that’s a “frequent deficiency”? Oh, dear! Don’t forget the upper cap of scaling and the point estimate restriction. Use `PowerTOST`’s function `sampleN.scABEL()`. Or simulate a couple of hours in SAS or MatLab.
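For readers who want to see what the upper cap and the point estimate restriction do to the decision rule, here is a minimal sketch of the expanded limits. It assumes the EMA-style constants for average bioequivalence with expanding limits (regulatory constant 0.760, scaling above a CVwR of 30%, cap at 50%). These values are my assumption and are not quoted from the WHO document. The function names are hypothetical, and this is no substitute for `sampleN.scABEL()`, which estimates the sample size by simulation.

```python
import math

K_REG = 0.760    # assumed regulatory constant (EMA-style ABEL)
CV_SWITCH = 0.30  # assumed: scaling is allowed only above this CVwR
CV_CAP = 0.50     # assumed: upper cap of scaling


def cv_to_sd(cv):
    """Convert a CV (as a ratio) to the standard deviation on the log scale."""
    return math.sqrt(math.log(cv ** 2 + 1))


def abel_limits(cv_wr):
    """Acceptance limits (as ratios) for a given within-subject CV of the reference."""
    if cv_wr <= CV_SWITCH:
        return 0.80, 1.25
    s_wr = cv_to_sd(min(cv_wr, CV_CAP))  # the cap: no further widening above 50%
    upper = math.exp(K_REG * s_wr)
    return 1 / upper, upper


def passes_abel(ci_lower, ci_upper, point_estimate, cv_wr):
    """90% CI within the expanded limits AND point estimate within 80-125%."""
    lower, upper = abel_limits(cv_wr)
    return ci_lower >= lower and ci_upper <= upper and 0.80 <= point_estimate <= 1.25


for cv in (0.25, 0.35, 0.45, 0.50, 0.80):
    lo, hi = abel_limits(cv)
    print(f"CVwR {cv:.0%}: limits {100 * lo:.2f}% to {100 * hi:.2f}%")
```

Both conditions in `passes_abel()` affect power, which is why conventional sample size methods for replicate designs fall short.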

- The sample size calculation for 2×2 cross-over designs or parallel designs are often not justified adequately or presented with sufficient detail. […] the expected inter-subject variability for parallel designs or intra-subject variability for the cross-over designs.

Picky: Should read “total variability for parallel designs”. The inter-subject variability is not accessible in a previous study in a parallel design.
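A small sketch of why only the total variability is accessible: in a parallel design each subject contributes a single observation, so the between- and within-subject variances are confounded on the log scale. The function name and the CV values are hypothetical, and a log-normal model with additive variance components is assumed.

```python
import math


def cv_total(cv_intra, cv_inter):
    """Total CV from within- and between-subject CVs,
    assuming additive variances on the log scale."""
    var_total = math.log(cv_intra ** 2 + 1) + math.log(cv_inter ** 2 + 1)
    return math.sqrt(math.exp(var_total) - 1)


# Hypothetical values: 25% within-subject, 40% between-subject
print(f"total CV: {cv_total(0.25, 0.40):.4f}")
```

Going the other way (splitting an observed total CV into its two components) is not possible without repeated measurements, hence “total variability” is the right term.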

- Sample size is sometimes calculated to detect a difference between treatments instead of being based on a calculation aimed to show equivalence.

Unbelievable.
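For contrast, a calculation aimed at showing equivalence looks roughly like this. The sketch uses a large-sample normal approximation of the power of the two one-sided tests in a balanced 2×2 crossover. Exact methods are based on the noncentral *t*-distribution (e.g., `sampleN.TOST()` in `PowerTOST`) and may give slightly different numbers. The function names and default values are hypothetical.

```python
import math
from statistics import NormalDist


def power_tost_approx(n_total, cv, theta0=0.95, alpha=0.05, theta1=0.80, theta2=1.25):
    """Approximate TOST power for a balanced 2x2 crossover (normal approximation)."""
    sigma_w = math.sqrt(math.log(cv ** 2 + 1))  # within-subject SD on the log scale
    se = sigma_w * math.sqrt(2.0 / n_total)     # SE of the treatment difference
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha)
    upper = nd.cdf((math.log(theta2) - math.log(theta0)) / se - z)
    lower = nd.cdf((math.log(theta0) - math.log(theta1)) / se - z)
    return max(upper + lower - 1.0, 0.0)


def sample_size_tost_approx(cv, theta0=0.95, target_power=0.80, alpha=0.05, n_max=1000):
    """Smallest even total sample size reaching the target power (approximation)."""
    if not 0.80 < theta0 < 1.25:
        raise ValueError("theta0 must lie within the acceptance range")
    for n in range(4, n_max + 1, 2):
        if power_tost_approx(n, cv, theta0, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within n_max")


n = sample_size_tost_approx(cv=0.25)
print(f"approximate total sample size: {n}, power: {power_tost_approx(n, 0.25):.4f}")
```

Note that the assumed ratio `theta0` and both acceptance limits enter the calculation. A sample size computed to detect a difference uses none of them in this way.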

- In case of a parallel design, it is extremely important to obtain balanced groups in all demographic characteristics that might impact the pharmacokinetics of the drugs. The methods employed to ensure balanced groups are generally not described in the protocols.

Cannot agree more. Don’t forget to pheno-/genotype subjects for drugs showing polymorphic metabolism and include extensive metabolizers only.

**STATISTICAL ANALYSIS**

- Due to the statistical complexity of the alpha level expenditure in two-stage bioequivalence cross-over trials, two stage designs are not encouraged and, if used, the design should be as simple as possible e.g., with equal sizes in both stages.

Sorry to say, but that’s just crap. Generally sponsors design TSDs in such a way that the chance to demonstrate BE already in the first stage is reasonably high and the – optional – second stage acts as a kind of “safety net”. That also implies that *n*_{1} > *n*_{2}. Maybe they had a classical group-sequential design with fixed total sample size and one interim at *N*/2 in mind. Boring.

The Applicant should demonstrate that the consumer risk is not inflated above 5% with the proposed design and alpha expenditure rule, taking into account that simulations are not considered sufficiently robust and analytical solutions are preferred.

Fine. An exact method exists for 2×2 crossovers (the ‘Inverse Normal Combination Method’ with either the ‘Standard Combination Test’ or the ‘Maximum Combination Test’). For parallel designs we have only simulation-based methods. Replicate designs? Science fiction.

- The statistical procedure should be conducted without imputing values to the missing observations.

Why not?

- In those cases where the subjects are recruited and treated in groups, it is appropriate to investigate the statistical significance of the group-by-formulation interaction e.g., with the following ANOVA model: Group, Sequence, Formulation, Period (nested within Group), Group-by-Sequence interaction, Subject (nested within Group*Sequence) and Group-by-Formulation interaction. […]

That’s copypasted from a yellowed FDA deficiency-letter. Complete nonsense and – like any pretest – inflates the Type I Error.

- The *a posteriori* power of the study does not need to be calculated.

I would prefer: “… must not be calculated.”

- It is not necessary to calculate the non-parametric 90% CI of T_{max}. A numerical comparison of the median values and its range is considered sufficient.

Using the capital letter T (SI for the absolute temperature in Kelvin) is forgiven. However, a numerical comparison of the *range* of t_{max} is crap.

**EXCLUSION OF DATA**

- In order to exclude the pharmacokinetic results of those subjects who vomit during the study, the protocol should define in hours the value of two times median T_{max} (as documented in the literature)…

I would add for controlled release: Within the entire intended dosing interval.

» I expect some others to stir some discussions here (e.g. 2-stage design)

Correct. Not only that.

—

*Dif-tor heh smusma* 🖖 Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮

Science Quotes

### Complete thread:

- WHO frequent deficiencies in protocols and reports Ohlbe 2020-12-04 12:04 [Regulatives / Guidelines]
- WHO frequent deficiencies in protocols and reports ElMaestro 2020-12-04 14:10
- CI inclusion ≠ TOST Helmut 2020-12-04 20:30
- CI inclusion ≠ TOST d_labes 2020-12-05 16:19
- CI inclusion ≠ TOST ElMaestro 2020-12-05 16:49
- CI inclusion ≠ TOST d_labes 2020-12-07 11:39

- CI inclusion ≠ TOST Helmut 2020-12-06 22:09
- CI inclusion operationally identical to TOST d_labes 2020-12-07 11:11
- WHO lamenting about terminology? Helmut 2020-12-07 13:02
- WHO lamenting about terminology? ElMaestro 2020-12-07 14:54
- WHO lamenting about terminology? d_labes 2020-12-07 15:35

- WHO frequent deficiencies in protocols and reports Helmut 2020-12-22 19:00
- "obtain balanced groups" vs randomisation ElMaestro 2020-12-23 08:12
