Bioequivalence and Bioavailability Forum

Martynkf
☆

Budapest, Hungary,
2026-05-12 15:23
(22 d 11:23 ago)

Posting: # 24616
Views: 1,208

Sample size with theta0 < 0.95: any regulatory pushback? [Regulatives / Guidelines]

Dear all,

Long time lurker, first time questioner here! Hi everybody!

A regulatory-reception question rather than a statistical one, but I hope it would still be welcome.

The convention is to drive the BE sample size with an assumed T/R-ratio of 0.95 (lower side, since 1/1.05 = 0.9524 ofc). My read of ICH M13A is that no particular value of theta0 is prescribed, only that the assumption be justified.

The adjacent thread already covers most of the statistical ground, with ElMaestro's view that one simply plugs in the GMR considered realistic plus a worst-case buffer, and d_labes' observation that he has never met a sponsor-desired sample size which failed to find a 'scientific' justification (Armani suit optional). Helmut's own articles default to theta0 = 0.90 for RSABE/ABEL without apology (reminiscent of the two Lászlós recommendations). Among practitioners the matter seems quite settled.

The question is the assessors' side. Has anyone here run into pushback from EU agencies — EMA via DCP/MRP, or national authorities — when an ABE protocol used an assumed deviation larger than 5%, say theta0 = 0.90, outside the HVD/NTID setting? Statistically that is the conservative direction (n inflated, power preserved against a less flattering outcome), and a deficiency on principle would be hard to justify. But principle and practice may differ.

The context, and the reason I am asking: some Central European authorities I deal with routinely question the sample size assumptions in ways that suggest they are overly strict in analysing the a priori assumptions post hoc (regardless of outcome). Rounded ISCVs are contested, back-calculated ISCVs are contested, occasionally one wonders whether anything would not be contested (dropout rate anyone?) We had a recent pivotal that failed with an observed GMR around 0.88, and we are now wondering whether reaching for theta0 = 0.90 in the repeat study is a sensible pre-emptive move or merely an invitation to a different flavour of a deficiency letter. EAEU experience also welcome, though I suspect the relevant adjective there is 'creative' rather than 'consistent'.

Thanks,
Marty

Helmut
★★★

Vienna, Austria,
2026-05-12 16:55
(22 d 09:52 ago)

@ Martynkf
Posting: # 24617
Views: 1,154

Sample size with theta0 < 0.95: Why not?

Hi Marty,

❝ Long time lurker, first time questioner here! Hi everybody!

Welcome to the club!

❝ A regulatory-reception question rather than a statistical one, but I hope it would still be welcome.

Of course.

❝ […] My read of ICH M13A is that no particular value of theta0 is prescribed, only that the assumption be justified.

Correct. I’m not even sure that it has to be justified; the text only asks for “an appropriate sample size determination”.
This rubber clause was pasted 1:1 from the EMA’s 2010 guideline, which in turn took it from ICH E9 of 1998.

❝ The adjacent thread already covers most of the statistical ground […] Among practitioners the matter seems quite settled.

Correct again.

❝ […] Has anyone here run into pushback from EU agencies — EMA via DCP/MRP, or national authorities — when an ABE protocol used an assumed deviation larger than 5%, say theta0 = 0.90, outside the HVD/NTID setting? Statistically that is the conservative direction (n inflated, power preserved against a less flattering outcome), and a deficiency on principle would be hard to justify. But principle and practice may differ.

Not me.

❝ […] some Central European authorities I deal with routinely question the sample size assumptions in ways that suggest they are overly strict in analysing the a priori assumptions post hoc (regardless of outcome). Rounded ISCVs are contested, back-calculated ISCVs are contested, occasionally one wonders whether anything would not be contested (dropout rate anyone?)

The dreadful post hoc (a posteriori, retrospective) power entering through the backdoor? Excuse my French – WTF?
In BE – according to all global guidelines – there is no place for Bayesian priors / posteriors. Frequentists^I define hypotheses (H₀ = inequivalence, H₁ = equivalence), test them with a pre-defined $\alpha$, and get a dichotomous outcome (pass | fail).^II That’s all, end of the story.

❝ We had a recent pivotal that failed with an observed GMR around 0.88, and we are now wondering whether reaching for theta0 = 0.90 in the repeat study is a sensible pre-emptive move or merely an invitation to a different flavour of a deficiency letter.

Did you plan the sample size of the failed study for an assumed T/R-ratio of 0.95, or for something else? This 0.88 is the ‘best’ estimate you have right now. Planning the next study for 0.90 is somewhat risky. You know that power curves (and thus the sample sizes) are most sensitive to the T/R-ratio. n for 0.88 ~~might~~ will be substantially larger than the n for 0.90. If you have to deal with an agency already notorious for questioning your assumptions, you may open a can of worms even with a passing study. Maybe of interest a sneak-preview of a presentation about repeating studies for the upcoming BioBridges. Talk to the guy in the Armani suit.
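
A minimal sketch of that sensitivity in base R, under stated assumptions: a balanced 2×2 crossover, the conventional margins 0.80–1.25, and power approximated by the noncentral t-distribution (not the exact method that PowerTOST’s sampleN.TOST() uses by default, so results may differ by a pair of subjects close to the target). The helper names power_tost_nct() and sample_size() are made up for this illustration, and the CVs of 0.18 and 0.27 are simply the ones mentioned in this thread.

```r
# Approximate TOST power for a balanced 2x2 crossover (noncentral t approximation)
power_tost_nct <- function(n, CV, theta0, theta1 = 0.80, theta2 = 1.25,
                           alpha = 0.05) {
  se    <- sqrt(log(CV^2 + 1)) * sqrt(2 / n) # SE of the log-difference T - R
  df    <- n - 2                             # residual degrees of freedom
  tcrit <- qt(1 - alpha, df)
  d1    <- (log(theta0) - log(theta1)) / se  # noncentrality, lower margin
  d2    <- (log(theta0) - log(theta2)) / se  # noncentrality, upper margin
  max(0, suppressWarnings(pt(-tcrit, df, ncp = d2) - pt(tcrit, df, ncp = d1)))
}

# Smallest even total sample size reaching the target power
sample_size <- function(CV, theta0, target = 0.80, n_max = 1000) {
  n <- 4
  while (power_tost_nct(n, CV, theta0) < target && n < n_max) n <- n + 2
  n
}

CVs     <- c(0.18, 0.27)       # illustrative, taken from this thread
theta0s <- c(0.95, 0.90, 0.88) # assumed T/R-ratios to compare
res     <- expand.grid(theta0 = theta0s, CV = CVs)
res$n   <- mapply(sample_size, CV = res$CV, theta0 = res$theta0)
print(res, row.names = FALSE)
```

For each CV, the rows show how quickly n grows as the assumed T/R-ratio moves from 0.95 towards the lower margin.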

❝ EAEU experience also welcome, though I suspect the relevant adjective there is 'creative' rather than 'consistent'.

No practical experiences but I’ve heard that they have some funky points of view.

I. “Everybody is a Bayesian. It’s just that some know it, and some don’t.” (Trivellore Raghunathan)
II. Hardcore statisticians don’t care about the alternative hypothesis H₁ (equivalence).
If H₀ is not rejected → fail and if it is rejected → pass. Although the latter implies that H₁ is accepted, they would not necessarily state it as such.
Using the confidence inclusion approach according to the guidelines (i.e., by assessing the $\small{100\left(1-2\,\alpha\right)\%}$ CI of the PE and the pre-specified BE margins), we get more information:
1. CI entirely within the margins
  → pass and equivalence proven with $\small{\alpha}$
2. At least one confidence limit outside the margins (but the CI not entirely outside)
  → fail (inconclusive, underpowered)
3. CI entirely outside the margins
  → fail and inequivalence proven with $\small{\alpha}$
Only in the second case could you consider repeating the study with a larger sample size.
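
The three cases can be sketched as a small R function. This is only an illustration: classify_ci() is a hypothetical helper, the margins default to the conventional 0.80–1.25, limits exactly on a margin are counted as inside, and the confidence limits in the calls are invented.

```r
# Classify a 90% CI of the T/R-ratio against the BE margins
classify_ci <- function(lower, upper, theta1 = 0.80, theta2 = 1.25) {
  stopifnot(lower <= upper)
  if (lower >= theta1 && upper <= theta2) {
    "pass: equivalence shown"          # case 1: CI entirely within the margins
  } else if (upper < theta1 || lower > theta2) {
    "fail: inequivalence shown"        # case 3: CI entirely outside the margins
  } else {
    "fail: inconclusive"               # case 2: CI overlaps a margin
  }
}

# Hypothetical confidence limits, for illustration only
classify_ci(0.92, 1.08)
classify_ci(0.78, 0.99)
classify_ci(0.70, 0.79)
```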

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

zizou
★

Plzeň, Czech Republic,
2026-05-22 18:11
(12 d 08:35 ago)

@ Helmut
Posting: # 24629
Views: 598

Repeating Studies

Dear Helmut,

❝ Maybe of interest a sneak-preview of a presentation about repeating studies for the upcoming BioBridges.

I sneaked a look at the preview of the presentation about repeating studies. Interesting topic.

After looking at slide 11, I also became interested in the case of formulations that are not bioequivalent (related to the Type I Error: incorrectly rejecting a true null hypothesis of bioinequivalence, i.e., incorrectly concluding bioequivalence for formulations that are not bioequivalent). Just a theoretical case with a poor test formulation, not bioequivalent to the reference, with an assumed true ratio of 0.8, so that the power is not more than 5% (the TIE).
So the (not suggested) scenario is to repeat the bioequivalence study several times until it succeeds, and to see with which probability a "success" could happen.

     n      p
     1   0.0500
     2   0.0975
     3   0.1426
     4   0.1855
     5   0.2262
     6   0.2649
     7   0.3017
     8   0.3366
     9   0.3698
    10   0.4013
    11   0.4312
    12   0.4596
    13   0.4867
    14   0.5123
    15   0.5367
    16   0.5599
    17   0.5819
    18   0.6028
    19   0.6226
    20   0.6415
    21   0.6594
    22   0.6765
    23   0.6926
    24   0.7080
    25   0.7226
    26   0.7365
    27   0.7497
    28   0.7622
    29   0.7741
    30   0.7854
    31   0.7961
    32   0.8063

n ... Try No.
p ... Probability of BE Concluded

Note: p = 1 - 0.95^n. The calculation is like rolling a die: to roll at least one six in several tries, each try has a probability of 5/6 of not showing a six; in the case above, the probability of not concluding BE in a single try is 0.95.
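
The table can be reproduced with a few lines of base R. The sketch assumes that the repeated studies are independent and that each has exactly a 5% chance of falsely concluding BE; the variable names are chosen only for this example.

```r
alpha <- 0.05                 # chance of a false "pass" in a single study
tries <- 1:32
p     <- 1 - (1 - alpha)^tries # probability of at least one "pass" in n tries
print(data.frame(n = tries, p = sprintf("%.4f", p)), row.names = FALSE)

# First try at which at least one false "pass" is more likely than not
n_half <- ceiling(log(0.5) / log(1 - alpha))
n_half # 14
```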

Best regards,
zizou

Helmut
★★★

Vienna, Austria,
2026-05-23 10:30
(11 d 16:17 ago)

@ zizou
Posting: # 24631
Views: 569

Buongiorno, signor Bonferroni!

Hi zizou,

I didn’t think about the Type I error! THX for pointing it out.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

ElMaestro
★★★

Denmark,
2026-05-25 19:13
(9 d 07:33 ago)

@ Helmut
Posting: # 24632
Views: 465

Buongiorno, signor Bonferroni!

Well, if you see Šidák, pass on my sincere greetings while you're at it.

—
Pass or fail!
ElMaestro

ElMaestro
★★★

Denmark,
2026-05-13 03:26
(21 d 23:21 ago)

@ Martynkf
Posting: # 24618
Views: 1,083

Sample size with theta0 < 0.95: any regulatory pushback?

Hi Martynkf,

❝ The question is the assessors' side. Has anyone here run into pushback from EU agencies — EMA via DCP/MRP, or national authorities — when an ABE protocol used an assumed deviation larger than 5%, say theta0 = 0.90, outside the HVD/NTID setting? Statistically that is the conservative direction (n inflated, power preserved against a less flattering outcome), and a deficiency on principle would be hard to justify. But principle and practice may differ.

Great question. And wise words from Hötzi.
I did not have sample size calcs questioned that way. Whatever is assumed can be justified, so is my empirical experience. For a tyrosine kinase inhibitor known to misbehave a deviation of 12.5% on the GMR was accepted as far as I recall.

I'd like to bring one perspective into this: Potency correction is allowed by M13A and other guidelines. The only reason it is there is that regulators are fully aware that sometimes products differ by (much) more than 5%. Obviously no IVIVC is ever a perfect 1:1, so an x% difference on the COAs can translate into an in vivo deviation (obs GMR) of more than or less than x%. The whole reason we conduct in vivo studies is that we can't predict in vivo behaviour well. So, I'd say it would be almost unethical to have to always assume a deviation of max 5%.

Add to this that in BE, by convention we use a parametric model. We do not test for normality, we just assume it. This has unknown consequences for the evaluation. It can go both ways, depending on how the data is actually distributed (for the same reason I am not a huge fan of saying that testing for XYZ in BE is useless because this or that estimator is biased). So, we have a bunch of assumptions, plus a bunch of uncertainty. A deviation of more than 5% is certainly justified in many cases.

And in real life: The guy in the Armani suit dictates that the head of clin ops can use up to -certainly not more than- $XYZ on the trial. Now sample size derivation becomes a matter of identifying the worst GMR that keeps the sample size within the budget, given some kind of CV. That is unfortunately how it often works in practice. And that part of the story is never disclosed to regulators in a dossier.

—
Pass or fail!
ElMaestro

Martynkf
☆

Budapest, Hungary,
2026-05-13 10:27
(21 d 16:20 ago)

@ ElMaestro
Posting: # 24619
Views: 1,091

Sample size with theta0 < 0.95: any regulatory pushback?

Thank you both for wise counsel!

To add some spice to the situation, this failed pivotal came after a pitch-perfect pilot with GMR ~1, ISCV ~18% (same as in the literature) on a BCS III molecule (although with a nice 2-compartmental kinetics).

The failed pivotal reported a ~27% ISCV, with this bad GMR of ~0.88. You can show outliers with Cook's distance, DFFITS, QQplot etc., but in my practice you can show these things in most studies (even ones which pass) and they should be accepted rather than relied upon in any meaningful way unless you have something damning on the subjects themselves.

The CMC people of course swear that the products from the pilot and the pivotal are practically the same.

My thought is that there is a kind of survival bias at play here. The study was powered at 80%, and we wouldn't be discussing it if it had passed. If a study fails, the reported GMR and/or ISCV tends to be bad. Sorry, this may slightly muddle the frequentist and Bayesian realms :-(
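
A small simulation sketch of that selection effect, under assumptions that are mine and not the study's: a balanced 2×2 crossover with 16 subjects, a true T/R-ratio of 0.95, a true ISCV of 18%, and normally distributed log-data, so that the point estimate and the residual variance can be drawn independently. It compares what passing and failing studies report on average.

```r
set.seed(123)               # arbitrary seed, for reproducibility
nsims  <- 1e5
n      <- 16                # total subjects, balanced 2x2 crossover (assumed)
theta0 <- 0.95              # true T/R-ratio (assumed)
CV     <- 0.18              # true within-subject CV (assumed)
s2w    <- log(CV^2 + 1)     # within-subject variance on the log scale
df     <- n - 2

# Under normality the point estimate and residual variance are independent
pe   <- rnorm(nsims, mean = log(theta0), sd = sqrt(2 * s2w / n))
s2   <- s2w * rchisq(nsims, df) / df
hw   <- qt(0.95, df) * sqrt(2 * s2 / n) # half-width of the 90% CI
pass <- (pe - hw >= log(0.80)) & (pe + hw <= log(1.25))

# Average reported GMR and CV, separately for passing and failing studies
summ <- function(idx) c(studies = sum(idx),
                        GMR     = exp(mean(pe[idx])),
                        CV      = sqrt(exp(mean(s2[idx])) - 1))
res <- rbind(passed = summ(pass), failed = summ(!pass))
print(round(res, 4))
```

Under these assumptions the failed studies report, on average, a GMR further from 1 and a larger CV than the passing ones, although all were simulated from the same truth.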

The Armani-pressure is in the other way in this case… and if I take the ISCV and the GMR as read from the failed study I'm looking at 100+ subjects for said molecule and even I start to get nervous :lookaround:

I intend to take the orig. reported ISCVs, treat the ISCV of the failed study as an outlier and dismiss it (I guess Helmut would advise me to pool them based on the chi-sq. distribution), power the new study at 90% with the standard theta0 = 0.95, and show the GMR sensitivity plot to the Armani people.
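
For comparison, a sketch of what pooling instead of dismissing would look like: CVs are converted to log-scale variances and weighted by their degrees of freedom. pool_cv() is a hypothetical helper, the pilot's sample size of 12 is invented because it is not given in this thread, and PowerTOST's CVpooled() would be the tool to use with the real numbers.

```r
# Pool within-subject CVs on the variance scale, weighted by degrees of freedom
pool_cv <- function(CV, df) {
  s2 <- log(CV^2 + 1)                   # CV -> variance on the log scale
  sqrt(exp(sum(df * s2) / sum(df)) - 1) # pooled variance -> pooled CV
}

CV <- c(pilot = 0.18, pivotal = 0.27)   # approximate values from this thread
n  <- c(pilot = 12,   pivotal = 27)     # pilot n is HYPOTHETICAL
df <- n - 2                             # 2x2 crossover: df = n - 2
CV_pooled <- pool_cv(CV, df)
CV_pooled
```

The larger study dominates the pooled estimate, which is why dismissing it changes the planned sample size so much.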

Thanks again, didn't want to leave you guys without an update!

Helmut
★★★

Vienna, Austria,
2026-05-13 10:53
(21 d 15:54 ago)

@ Martynkf
Posting: # 24620
Views: 1,090

More information, please

Hi Marty,

before diving into the details of your post, can you give further information?

1. theta0 you used in planning the pivotal study, target power (CV 0.18, right?)
2. Number of eligible subjects in the failed study (CV 0.27, PE 0.88)

Before talking to the guy in the Armani suit see the Bayesian stuff for sample size estimation based on a previous study and statistical assurance.*

* Ring A, Lang B, Kazaroho C, Labes D, Schall R, Schütz H. Sample size determination in bioequivalence studies using statistical assurance. Br J Clin Pharmacol. 2019; 85(10): 2369–77. doi:10.1111/bcp.14055. Open access.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

Martynkf
☆

Budapest, Hungary,
2026-05-13 12:27
(21 d 14:20 ago)

@ Helmut
Posting: # 24621
Views: 1,031

More information, please

❝ before diving into the details of your post, can you give further information?

❝ 1. theta0 you used in planning the pivotal study, target power (CV 0.18, right?)

theta0 = .95, confirmed. targetpower = .8, dropout rate 10% (PowerTOST_output/0.9)

❝ 2. Number of eligible subjects in the failed study (CV 0.27, PE 0.88)

27 (this is its own can of worms right?)

❝ Before talking to the guy in the Armani suit see the Bayesian stuff for sample size estimation based on a previous study and statistical assurance.*

Thanks! I don't routinely use the Bayesian stuff, and I like the assurance framework!

Helmut
★★★

Vienna, Austria,
2026-05-13 15:25
(21 d 11:21 ago)

@ Martynkf
Posting: # 24622
Views: 1,034

Still 🤔

Hi Marty,

sorry, I’m not sure whether I do understand your values.

❝ theta0 = .95, confirmed. targetpower = .8, dropout rate 10% (PowerTOST_output/0.9)

    library(PowerTOST)
    nadj <- function(n, do.rate, nseq = 2) {
      # adjusted sample size (balanced sequences)
      x <- n / (1 - do.rate)
      return(as.integer(nseq * (x %/% nseq + as.logical(x %% nseq))))
    }
    CV      <- 0.18 # assumed
    theta0  <- 0.95 # assumed
    target  <- 0.80 # target (desired) power
    do.rate <- 0.10 # anticipated dropout-rate 10%
    n       <- sampleN.TOST(CV = CV, theta0 = theta0, targetpower = target,
                            design = "2x2")[["Sample size"]]

    +++++++++++ Equivalence test - TOST +++++++++++
                Sample size estimation
    -----------------------------------------------
    Study design: 2x2 crossover
    log-transformed data (multiplicative model)

    alpha = 0.05, target power = 0.8
    BE margins = 0.8 ... 1.25
    True ratio = 0.95, CV = 0.18

    Sample size (total)
     n     power
    16   0.820357

    cat("adjusted sample size =", nadj(n, do.rate), "\n")
    adjusted sample size = 18

❝ ❝ 2. Number of eligible subjects in the failed study (CV 0.27, PE 0.88)

❝ 27 (this is its own can of worms right?)

Given the sample size estimation above, how did you end up with 27 eligible in the study? Even if you would have targeted 90% power, we get n = 22 and nadj = 26.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

Martynkf
☆

Budapest, Hungary,
2026-05-15 10:02
(19 d 16:45 ago)

@ Helmut
Posting: # 24623
Views: 965

Still 🤔

Hi Helmut!

❝ ❝ sorry, I’m not sure whether I do understand your values.

Integrity, curiosity and openness mostly :flower:

❝ Given the sample size estimation above, how did you end up with 27 eligible in the study? Even if you would have targeted 90% power, we get n = 22 and nadj = 26.

There were some divergent literature values and the samp.size calculation was based on that bigger value which ended up being ~30 subjects -> 27 evaluable. I was not involved in that stage.
