Ohlbe ★★★ France, 2020-12-04 13:04

Dear all, the WHO has published a 7-page document listing frequent deficiencies in BE protocols, reports and practices. Helmut will be happy with at least one recommendation: no need to calculate and report post hoc power. I expect some others to stir some discussions here (e.g. 2-stage design).
Regards, Ohlbe
ElMaestro ★★★ Denmark, 2020-12-04 15:10

Thanks Ohlbe, does anyone know what this sentence means: "The calculation of the 90% confidence interval (CI) of the mean test/comparator ratio for the primary PK parameters should not be confused with the two one-sided t-tests employed to reject the null hypothesis of non-equivalence. The end result is the same, but these are not the same calculations."
Pass or fail!
ElMaestro
Helmut ★★★ Vienna, Austria, 2020-12-04 21:30

Hi ElMaestro,

❝ does anyone know what this sentence means:

Yes.

❝ "The calculation of the 90% confidence interval (CI) of the mean test/comparator ratio for the primary PK parameters …

$$\small{H_0:\frac{\mu_\textrm{T}}{\mu_\textrm{R}}\notin \left [ \theta_1, \theta_2 \right]\:vs\:H_1:\theta_1<\frac{\mu_\textrm{T}}{\mu_\textrm{R}}<\theta_2}\tag{1}$$

❝ … should not be confused with the two one-sided t-tests employed to reject the null hypothesis of non-equivalence …

$$\small{H_\textrm{0L}:\frac{\mu_\textrm{T}}{\mu_\textrm{R}} \leq \theta_1\:vs\:H_\textrm{1L}:\frac{\mu_\textrm{T}}{\mu_\textrm{R}}>\theta_1}\tag{2a}$$
$$\small{H_\textrm{0U}:\frac{\mu_\textrm{T}}{\mu_\textrm{R}} \geq \theta_2\:vs\:H_\textrm{1U}:\frac{\mu_\textrm{T}}{\mu_\textrm{R}}<\theta_2}\tag{2b}$$

❝ … The end result is the same, but these are not the same calculations."

Exactly! For decades, global guidelines have asked for the confidence interval inclusion approach \(\small{(1)}\). In Schuirmann’s famous TOST procedure \(\small{(2)}\) one gets two p-values: one for \(\small{(2\textrm{a})}\) and another one for \(\small{(2\textrm{b})}\). Nothing else. BE is concluded if both Nulls are rejected.

I think that the statisticians of the WHO are fed up with reading in \(\frac{\mathfrak{protocols}}{\mathfrak{reports}}\) “… Bioequivalence \(\frac{\textrm{will be}}{\textrm{has been}}\) assessed by the Two One-Sided Tests procedure (Schuirmann 1987). …”, only to find the 90% CI in the report. BTW, only once have I seen TOST performed (in 1991). It led to a deficiency letter: “The applicant should provide the 90% CI.”

Chow and Liu^{1} erred when stating: “The two one-sided t tests procedure is operationally equivalent* to the classic (shortest) confidence interval approach; that is, if the classic (1–α)×100% confidence interval for µ_{T}–µ_{R} is within (θ_{L}, θ_{U}), then both H_{01} and H_{02} are also rejected at the α level by the two one-sided t tests procedure.”

I would not go as far as Brown et al.^{2}, who state: “This similarity [between level α TOSTs and a 1–2α CI] is somewhat of a fiction, based more on an algebraic coincidence rather than a statistical equivalence. The misconception that size-α bioequivalence tests generally correspond to 100(1–2α)% confidence sets […] lead[s] to incorrect statistical practices, and should be abandoned.”

When reviewing stuff I insist on deleting the (all too common) TOST statement as well (i.e., claiming \(\small{(2)}\) whilst performing \(\small{(1)}\)).
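For anyone who wants to see the two “operations” \(\small{(1)}\) and \(\small{(2)}\) side by side, here is a minimal sketch in Python (standard library only). Everything in it is an assumption for illustration: the numbers (point estimate, standard error, 22 degrees of freedom) are made up, the function names are mine, and the t-distribution is evaluated numerically (Simpson's rule plus bisection) instead of with a statistics library. It assumes the usual analysis of log-transformed data, limits of 80–125% and α = 0.05.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, n=2000):
    """CDF by composite Simpson's rule on [0, |x|]; n must be even."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / n
    s = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = s * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_quantile(p, df):
    """Quantile by bisection; only valid for 0.5 < p < 1."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def ci_inclusion(diff, se, df, theta1=0.80, theta2=1.25, alpha=0.05):
    """Approach (1): is the 100(1 - 2*alpha)% CI of T/R within the limits?"""
    t_crit = t_quantile(1 - alpha, df)
    lower = math.exp(diff - t_crit * se)
    upper = math.exp(diff + t_crit * se)
    return lower, upper, (theta1 <= lower and upper <= theta2)

def tost(diff, se, df, theta1=0.80, theta2=1.25, alpha=0.05):
    """Approach (2): two one-sided t-tests, one p-value per Null."""
    t_lower = (diff - math.log(theta1)) / se  # tests H0L: T/R <= theta1
    t_upper = (diff - math.log(theta2)) / se  # tests H0U: T/R >= theta2
    p_lower = 1 - t_cdf(t_lower, df)
    p_upper = t_cdf(t_upper, df)
    return p_lower, p_upper, (p_lower <= alpha and p_upper <= alpha)

# Hypothetical study: PE 95%, SE of the log-difference 0.05, 22 df.
diff, se, df = math.log(0.95), 0.05, 22
print("CI inclusion:", ci_inclusion(diff, se, df))
print("TOST        :", tost(diff, se, df))
```

The first function returns two confidence limits, the second two p-values. Only the final pass/fail flag is common to both, which is the point of the WHO's sentence.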
d_labes ★★★ Berlin, Germany, 2020-12-05 17:19

Dear Helmut, if the CI inclusion rule is really something different than TOST, why do we calculate power / sample size based on TOST?
Regards, Detlew
ElMaestro ★★★ Denmark, 2020-12-05 17:49

Haha,

❝ if the CI inclusion rule is really something different than TOST, why do we calculate power / sample size based on TOST?

d_labes, you beat me to it; I was going to ask a similar question. If I am getting it right, this isn't about whether your product passes the test for BE or not, it is purely a matter relating to what you call it. Since semantics is now of such importance, I believe PowerTOST needs to be renamed.

Now someone kindly define robust and robustness for me so that I understand it. And please tell me how to use that definition to make a simulation robust enough that it convinces the WHO. Or more generally, if I want to present an argument to a WHO regulator (not a simulation, just an argument which may or may not be based on simulation), how will I know a way to make my argument robust?
Pass or fail!
ElMaestro
d_labes ★★★ Berlin, Germany, 2020-12-07 12:39

Dear ElMaestro,

❝ If I am getting it right, this isn't about whether your product passes the test for BE or not, it is purely a matter relating to what you call it. Since semantics is now of such importance, I believe PowerTOST needs to be renamed.

Can't understand what the poet is trying to tell us with that.
Regards, Detlew
Helmut ★★★ Vienna, Austria, 2020-12-06 23:09

Dear Detlew,

❝ if the CI inclusion rule is really something different than TOST, why do we calculate power / sample size based on TOST?

Because you decided to baptize the package PowerTOST and not “Eierlegende Wollmilchsau” (the egg-laying wool-milk-sow, i.e., a jack of all trades) back in 2009. Given, Donald used the phrase “operationally identical” on p.661 (right column, 2^{nd} paragraph). However, for me (!) those are two different “operations”. Results of an example:
d_labes ★★★ Berlin, Germany, 2020-12-07 12:11

Dear Helmut,

❝ ...
❝ Given, Donald used the phrase “operationally identical” on p.661 (right column, 2^{nd} paragraph).
❝ However, for me (!) those are two different “operations”. Results of an example:

Of course the two calculations are different, no doubt about it. I have understood “operationally identical” always as the fact that TOST and CI inclusion give the same answer with regard to the BE decision. IMHO this is the meaning of the paragraph on page 661 in Donald's famous paper containing “operationally identical”:

"The two one-sided tests procedure turns out to be operationally identical to the procedure of declaring equivalence only if the ordinary 1 − 2α (not 1 − α) confidence interval for µ_{T} − µ_{R} is completely contained in the equivalence interval [θ_{1}, θ_{2}]".

Emphasis by me.
Regards, Detlew
Helmut ★★★ Vienna, Austria, 2020-12-07 14:02

Dear Detlew,

❝ Of course the two calculations are different, no doubt about it.

Like \(\small{2+2+2+2=2\times4=2^3=8}\). Different calculations, same result.

❝ I have understood “operationally identical” always as the fact that TOST and CI inclusion give the same answer with regard to the BE decision.

Acc. to Berger and Hsu I am not sure about the “always”. But that’s another story and of historical interest only.

❝ IMHO this is the meaning of the paragraph on page 661 in Donald's famous paper containing “operationally identical”:
❝ "The two one-sided tests procedure turns out to be operationally identical to the procedure of declaring equivalence only if the ordinary 1 − 2α (not 1 − α) confidence interval for µ_{T} − µ_{R} is completely contained in the equivalence interval [θ_{1}, θ_{2}]".
❝ Emphasis by me.

Correct. Also in Chow and Liu (p.98): “The two one-sided t tests procedure is operationally equivalent to the classic (shortest) confidence interval approach; that is, if the classic (1–2α)100% confidence interval for μ_{T}–μ_{R} is within (θ_{L}, θ_{U}), then both H_{01} and H_{02} are also rejected at the α level by the two one-sided t tests procedure.”

Coming back to the WHO’s rant: “The calculation of the 90% confidence interval (CI) of the mean test/comparator ratio for the primary PK parameters should not be confused with the two one-sided t-tests employed to reject the null hypothesis of non-equivalence. The end result is the same, but these are not the same calculations.”

IMHO, they are just fed up with reading “TOST” whilst the CI inclusion approach acc. to the GL was actually performed.
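To spell out “different calculations, same result”, here is a sketch of the algebra, assuming the usual analysis of log-transformed data. The symbols are my own shorthand and not taken from the WHO document or the cited papers: \(\small{\hat{\delta}}\) is the estimated difference of the log-means (T − R), \(\small{SE}\) its standard error, \(\small{\nu}\) the degrees of freedom, and \(\small{t_{1-\alpha,\nu}}\) the one-sided critical value of the t-distribution.

$$\small{\text{reject }H_\textrm{0L}\iff\frac{\hat{\delta}-\log\theta_1}{SE}>t_{1-\alpha,\nu}\iff\hat{\delta}-t_{1-\alpha,\nu}\cdot SE>\log\theta_1}$$
$$\small{\text{reject }H_\textrm{0U}\iff\frac{\hat{\delta}-\log\theta_2}{SE}<-t_{1-\alpha,\nu}\iff\hat{\delta}+t_{1-\alpha,\nu}\cdot SE<\log\theta_2}$$

The left-hand sides of the last inequalities are the lower and upper limits of the 100(1 − 2α)% CI on the log scale. Hence both Nulls are rejected at level α exactly when that CI lies within the log-transformed limits. TOST gets there via two test statistics (or p-values), the CI inclusion approach via two confidence limits.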
ElMaestro ★★★ Denmark, 2020-12-07 15:54

Will they be cracking down on spelling errors too, you think?
Pass or fail!
ElMaestro
d_labes ★★★ Berlin, Germany, 2020-12-07 16:35

Dear Helmut,

❝ Coming back to the WHO’s rant: The calculation of the 90% confidence interval (CI) of the mean test/comparator ratio for the primary PK parameters should not be confused with the two one-sided t-tests employed to reject the null hypothesis of non-equivalence. The end result is the same, but these are not the same calculations.
❝ IMHO, they are just fed up with reading “TOST” whilst the CI inclusion approach acc. to the GL was actually performed.

Totally correct to lament that fact, I think. It should be unequivocally described in the protocol or the SAP which calculations will be done. The CI approach will be the favorite, I think. It is requested in all guidelines about BE studies, if I am not mistaken.
Regards, Detlew
Helmut ★★★ Vienna, Austria, 2020-12-22 20:00

Dear Ohlbe,

❝ The WHO has published a 7-page document listing frequent deficiencies in BE protocols, reports and practices.
❝ Helmut will be happy with at least one recommendation: no need to calculate and report post hoc power.

Oh yes, indeed. More in detail (my comments in blue):

1. METHOD OF ADMINISTRATION
2. SAMPLING TIMES
3. PHARMACOKINETIC ANALYSIS
4. SAMPLE SIZE CALCULATION: ESTIMATION, if you don’t mind.
5. STATISTICAL ANALYSIS
6. EXCLUSION OF DATA

❝ I expect some others to stir some discussions here (e.g. 2-stage design)

Correct. Not only that.
ElMaestro ★★★ Denmark, 2020-12-23 09:12

Hi all, ❝
This is a very tricky way to write something for clarification. It is as if they give you a choice: do you want to randomise or do you want to "obtain balanced groups"? On average (whatever the hell that means) plain randomisation assures balance, but randomisation does not guarantee balance in any individual case. Perhaps they meant to combine stratification with randomisation. Or what? Had they written "it is extremely important to aim for balanced groups" then I'd get the point. But that would not change anything from today's practice, as all CROs to the best of my knowledge are randomising. What is it they really tried to clarify? And how do I comply?
Pass or fail!
ElMaestro
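To illustrate the difference between “balanced on average” and “balanced in every study”, here is a minimal Python sketch comparing plain randomisation with permuted-block randomisation for a 2×2 crossover. This is only one possible reading of what “obtain balanced groups” could mean. It is not taken from the WHO document, and the function names, the block size of 4, the 24 subjects and the seed are all my own assumptions.

```python
import random

def simple_randomisation(n, rng):
    """Assign each subject independently to sequence TR or RT."""
    return [rng.choice(["TR", "RT"]) for _ in range(n)]

def block_randomisation(n, rng, block_size=4):
    """Permuted blocks: every block holds equal numbers of TR and RT."""
    if block_size % 2 or n % block_size:
        raise ValueError("block_size must be even and divide n")
    sequences = []
    for _ in range(n // block_size):
        block = ["TR", "RT"] * (block_size // 2)
        rng.shuffle(block)  # random order within the block
        sequences.extend(block)
    return sequences

rng = random.Random(2020)  # arbitrary seed, for reproducibility only
n_subjects = 24            # hypothetical sample size
simple = simple_randomisation(n_subjects, rng)
blocked = block_randomisation(n_subjects, rng)
print("simple :", simple.count("TR"), "TR /", simple.count("RT"), "RT")
print("blocked:", blocked.count("TR"), "TR /", blocked.count("RT"), "RT")
```

With permuted blocks the sequences are still assigned at random, yet the two groups are equal in size by construction. With plain randomisation the split is only equal in expectation.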