PASS 2020: Outcome [Software]

posted by Helmut – Vienna, Austria, 2020-05-21 20:43 – Posting: # 21456

Dear all,

Below are my observations and conclusions about PASS 2020 v20.0.1 (released 2020-02-10). I checked only the sample size procedures relevant for ABE: CV 0.1–0.4 (Δ 0.02), 0.5, 0.75, 1.0; θ0 0.85–1.00 (Δ 0.05); AR {0.8000|1.2500}; target power 0.8 and 0.9. I compared the results of PASS with the exact method of PowerTOST and with the SAS code for the noncentral t-distribution given by Jones & Kenward (2000), ported to R. Not surprisingly, in all of my 1,152 scenarios the exact method agreed with the noncentral t. PASS not so much…

Paired samples

The design for ratios is not directly accessible in PASS (only for differences). Novices (aka “Push-the-button statisticians”) might not know how to set it up based on logs and conclude that it is not possible.
PASS reports the sample size per group. So far so good. Nobody would start a study with unequal group sizes (even if they give at least the desired power); one rounds up to the next even number anyhow. Still, there are many cases where 2×n of PASS is larger than the already even total sample size given by the exact method and the noncentral t. With a few exceptions (4 of my 128 scenarios) the sample sizes were larger than necessary (x̃ +2.11%, range –0.07 to +33.3%). In the most common area of CV 0.2–0.3, θ0 0.95, power 0.8: x̃ +6.27%. Why? Dunno.
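
Setting up a ratio problem "based on logs" amounts to a few transformations. The following is only a sketch: `log_scale_setup` and its keys are hypothetical names, not PASS or PowerTOST functions, and the CV is assumed to be the usual CV of log-normally distributed data. Whether a given dialog expects the within-subject SD or the SD of the paired differences is not stated here and has to be checked in the manual.

```python
import math

def log_scale_setup(cv, theta0, lower=0.80, upper=1.25):
    """Translate a ratio-based ABE problem into a difference problem on logs."""
    return {
        "lower": math.log(lower),                      # lower limit, log scale
        "upper": math.log(upper),                      # upper limit, log scale
        "diff": math.log(theta0),                      # assumed difference of log-means
        "sigma": math.sqrt(math.log(cv ** 2 + 1.0)),   # SD on the log scale from the CV
    }

setup = log_scale_setup(cv=0.25, theta0=0.95)
for name, value in setup.items():
    print(f"{name:6s} {value:+.4f}")
```

The limits 0.8000 and 1.2500 become symmetric around zero on the log scale, which is why a procedure for differences can be used at all.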


2×2×2 (TR|RT)

Accessible twice: under Cross-Over (Higher-Order) Designs and under Cross-Over (2×2) Design. OK, why not. However, the results differ: for θ0 0.85, CV 1, power 0.8, I got 2,334 in the former (unless I ask for the exact sample size) and only 2,333 in the latter. Likely most people use the latter and round up to the next even number to get balanced sequences. An odd sample size looks stupid if the output is part of the SAP.
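
The rounding step is trivial but worth making explicit. A minimal sketch, where `balanced_total` is a made-up helper and not part of PASS or PowerTOST:

```python
def balanced_total(n, sequences=2):
    """Smallest multiple of `sequences` that is not below n (balanced sequences)."""
    return -(-n // sequences) * sequences   # ceiling division, then scale back up

print(balanced_total(2333))  # 2334
print(balanced_total(2334))  # 2334
```

With two sequences the odd 2,333 ends up at 2,334, i.e., at the number the other procedure reports directly.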


2×2×4 (TRTR|RTRT, TRRT|RTTR, TTRR|RRTT)

Accessible under Cross-Over (Higher-Order) Designs. I’m not happy with the terminology for replicate studies used in PASS. Chen et al. (1998) used “Higher-Order”, but that’s rather unusual; generally “Higher-Order” refers to more than two treatments.
According to Chinese Whispers, a sponsor had lengthy discussions with a “statistician” of a CRO using PASS. Since only ABBA|BAAB is given in the design setup and the manual, he insisted on using this one. Well, that’s uncommon. All regulatory agencies give TRTR|RTRT in their guidelines…

  1. If you perform the study as TRRT|RTTR, statistically all is good but likely you have to deal with questions from assessors (who are rarely statisticians).
  2. If you perform the study as ABAB|BABA to make regulators happy, of course you could use the sample size estimated in PASS because
    • there are actually three 4-period 2-sequence replicate designs, namely ABAB|BABA, ABBA|BAAB and AABB|BBAA and
    • all of them have the same design constants and degrees of freedom.
    • Hence, in three hypothetical studies with the same effects one would observe exactly the same point estimates and residual variance.
    • But: Imagine a picky assessor discovering the output of PASS in the SAP stating ABBA|BAAB, whilst the study was performed as ABAB|BABA. Questions on the way, again.

As we observed before, the 2×2×4 is beyond repair. I can only assume that the SE is used instead of the CV. Possibly there are still problems with the dfs and the design constant used to calculate the SEM. In 95 of my 128 scenarios the sample size was too large. If I assess only studies with n ≥ 12: x̃ +4.35%, range ±0 to +33.3%. Example: θ0 0.95, CV 0.2, power 0.9: PASS estimates 16 though only 12 are needed.
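
How the design constant enters the calculation can be sketched with a large-sample (normal) approximation of TOST power. This is neither the exact method nor the noncentral t used in the comparison above: it ignores the degrees of freedom and therefore tends to be optimistic for small n. The design constants and step sizes in `DESIGNS` are assumptions (they correspond to the values tabulated in PowerTOST's `known.designs()` and should be verified), and all function names are hypothetical.

```python
import math
from statistics import NormalDist

# Assumed design constants (bk) and sample-size steps; verify before use.
DESIGNS = {
    "2x2x2": {"bk": 2.0, "step": 2},
    "2x2x3": {"bk": 1.5, "step": 2},
    "2x2x4": {"bk": 1.0, "step": 2},
}

def approx_power(n, cv, theta0, design, alpha=0.05, lower=0.80, upper=1.25):
    """Normal approximation of TOST power for a total sample size n."""
    sigma2 = math.log(cv ** 2 + 1.0)                    # variance on the log scale
    se = math.sqrt(DESIGNS[design]["bk"] * sigma2 / n)  # SE of the treatment difference
    std = NormalDist()
    z = std.inv_cdf(1.0 - alpha)
    p_upper = std.cdf((math.log(upper) - math.log(theta0)) / se - z)
    p_lower = std.cdf((math.log(theta0) - math.log(lower)) / se - z)
    return max(p_lower + p_upper - 1.0, 0.0)

def approx_n(cv, theta0, target, design, lower=0.80, upper=1.25, n_max=100_000):
    """Smallest balanced total n whose approximate power reaches the target."""
    if not lower < theta0 < upper:
        raise ValueError("theta0 must lie strictly within the acceptance range")
    step = DESIGNS[design]["step"]
    n = 2 * step
    while approx_power(n, cv, theta0, design, lower=lower, upper=upper) < target:
        n += step
        if n > n_max:
            raise RuntimeError("no sample size found up to n_max")
    return n

def pct_excess(n_reported, n_needed):
    """Relative difference of a reported sample size to the needed one, in percent."""
    return 100.0 * (n_reported - n_needed) / n_needed

print(approx_n(cv=0.2, theta0=0.95, target=0.9, design="2x2x4"))
print(f"{pct_excess(16, 12):+.1f}%")  # +33.3%
```

The last line reproduces the upper end of the range quoted above: 16 instead of 12 subjects is an excess of one third.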

2×2×3 (TRT|RTR, TRR|RTT)

Terminology again. Following Chen et al., only the “Three-Period, Two-Sequence Dual ABB|BAA” is given. Sigh. More or less OK. Following the EMA’s Q&A and assessing only studies with at least 24 subjects: x̃ ±0%, range –0.65 to +7.69%.


Generally OK. Assessing only scenarios with n≥12: x̃ ±0%, range ±0% to +33.3%.

2×4×2 (TR|RT|TT|RR, Balaam’s design)

Generally OK. Assessing only scenarios with n≥24: x̃ ±0%, range –0.54% to +5.26%.

Higher-Order Designs (Latin Squares and Williams’ designs) for ratios are not implemented.
Reference-scaling is not implemented either.

We must not forget that any interventional trial carries some degree of risk. Hence, ICH E9 “Statistical Principles for Clinical Trials” stated already in 1998:

The number of subjects in a clinical trial should always be large enough to provide a reliable answer to the questions addressed.

Large enough, not larger

Dif-tor heh smusma 🖖🏼 Довге життя Україна! (Long live Ukraine!)
Helmut Schütz


The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz