Helmut
Vienna, Austria,
2020-05-21 00:39

Posting: # 21455

 Sample Size Estimation in Bioequivalence? [Surveys]

Dear all,

on the occasion of recent discussions about software used for sample size estimation in BE I created a survey with ten questions.

Participation is anonymous and limited to one response per device (i.e., same IP-address). If your institution has only one IP (like Novartis redirecting its 120,000+ employees to a single one in Basel…), sorry.
The survey should take about three minutes to complete.

The questions are:
  1. How do you estimate the sample size?
    ○ Sample size tables
    ○ Software
    ○ Both
  2. Which software do you use?
    ☐ Commercial (off-the-shelf, e.g., SAS Proc Plan, NQuery Advisor, PASS, StudySize, …)
    ☐ Open source (e.g., R-packages like PowerTOST, bear, …)
    ☐ Free (e.g., FARTSSIE, EFG, …)
    ☐ Web-based
    ☐ In-house (e.g., own SAS-macros, R, C, Excel-template, …)
    ☐ Optional: Please give the software you use most (incl. version, year of release)
      ┌──────────┐
      └──────────┘
  3. How often do you update the software you use most?
    ○ Never
    ○ Occasionally
    ○ Regularly
    ○ I don’t want to disclose this information
  4. Is the software you use most validated?
    ☐ IQ (Installation Qualification acc. to procedures provided by the vendor)
    ☐ OQ Type 1 (Operational Qualification acc. to procedures provided by the vendor)
    ☐ OQ Type 2 (Operational Qualification acc. to own pre-specified procedures)
    ☐ PQ (Performance Qualification)
    ☐ Comparison with sample size tables
    ☐ Cross-validated with other software
    ☐ Partly (i.e., only some of the procedures)
    ☐ No
    ☐ I don’t want to disclose this information
    ☐ Other approach (please specify)
  5. Were you ever asked by a regulatory agency about software validation?
    ☐ Yes
    ☐ No
    ☐ I don’t want to disclose this information
    ☐ Optional: If you answered “Yes”, please give the year
      ┌─────┐
      └─────┘
  6. Do you repeat the estimation in-house if provided by an external entity (CRO, sponsor, consultant)?
    ☐ Always
    ☐ Regularly
    ☐ Sometimes
    ☐ Never
  7. Do you perform a Sensitivity Analysis in order to assess the impact on power if in the study values (e.g., T/R-ratio, CV, number of dropouts) will deviate from assumptions?
    ○ Never
    ○ Sometimes
    ○ Always
    ○ I don’t know what a Sensitivity Analysis is
    ○ I don’t want to disclose this information
  8. Do you increase the estimated sample size according to the expected dropout rate?
    ○ Yes (formula: n’ = n × (100 + dropout-rate in %) / 100)
    ○ Yes (formula: n’ = n / (100 – dropout-rate in %) × 100)
    ○ Yes (as provided by the software; I don’t know the formula)
    ○ Yes (chosen by management)
    ○ No (since the impact on power is limited)
  9. Please give general problems that you faced in sample size estimation.
    ☐ Estimated sample size was substantially smaller/larger than expected (compared to PARs / other studies)
    ☐ Result of re-assessment differed from the estimate given (by CRO, sponsor, consultant)
    ☐ Software, version, setup not given (by CRO, sponsor, consultant)
    ☐ Other (please give a short description)
      ┌───────────────────────────────┐
      └───────────────────────────────┘
  10. Did you face problems with the software you use most?
    ☐ Planned design not available
    ☐ Only one design-variant provided (although alternatives exist)
    ☐ Methods based on simulations not reproducible (e.g., for reference-scaling)
    ☐ Operation is complicated
    ☐ User manual insufficient (too short/verbose, methods not/poorly documented, lacking/outdated references, …)
    ☐ No
    ☐ Other (please specify)
      ┌───────────────────────────────┐
      └───────────────────────────────┘
Feel free to participate. It could help all of us to understand the current landscape. If you close the browser in the middle of the survey, you can come back later – from the same device – to complete it.
If sample size estimation is not your cup of tea, consider inviting a responsible colleague.

I will post the results in mid-June.

In the meantime I suggest this one.


Edit: 101 respondents as of 2020-10-23. Average time taken 3:30 minutes. THX!

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Helmut
Vienna, Austria,
2020-06-03 14:36

@ Helmut
Posting: # 21497

 Evaluation ☑️

Dear all,

here are the results of 101 respondents as of 23 October.
THX to all participants (the survey is closed; see note 1 below).

Below are the questions with the percentage of complete answers in decreasing order. Since some questions allow multiple choices, the percentages can add up to more than 100%. Undisclosed answers are excluded.

Below each question are my very personal opinions on the outcome.
[list=1][*]How do you estimate the sample size?
○ Software: 64%
○ Both: 34%
○ Sample size tables: 2%
  • If one relies solely on sample size tables, how does one deal with combinations (T/R-ratio, CV, power) that are not tabulated? Power is highly nonlinear and therefore, interpolation is difficult.

[*]Which software do you use?
☐ Open source (e.g., R-packages like PowerTOST, bear, …): 53%
☐ Free (e.g., FARTSSIE, EFG, …): 44%
☐ Commercial (off-the-shelf, e.g., SAS Proc Plan, NQuery Advisor, PASS, StudySize, …): 31%
☐ In-house (e.g., own SAS-macros, R, C, Excel-template, …): 14%
☐ Web-based: 3%
☐ Optional: Please give the software you use most (incl. version, year of release)
    PowerTOST (5), FARTSSIE (3), SAS (3), PASS (2), Julia: ClinicalTrialUtilities (1), NQuery (1), SPSS (1), Statistics101 Resampling Simulator (1), WinNonlin (1)
  • Open source and free software (see note 2 below) is in the lead. Only because it comes at no cost? No commercial software provides sample sizes for the reference-scaling methods out of the box (though it is possible to write code in SAS or MATLAB). One respondent simulated higher-order designs and incomplete block designs (>4 treatments), which are not available in any software.
    Phoenix WinNonlin provides only post hoc power. Hence, how is the sample size estimation done?
    It would be interesting to know which web-based method is used. I know only ones for parallel groups based on the large sample approximation.

[*]How often do you update the software you use most?
○ Occasionally: 41%
○ Regularly: 36%
○ Never: 20%
  • It is not a good idea to never update the software (20%). Not only might bugs have been corrected in the meantime, new methods might have been added as well (note that in Q10 20% reported that a planned design is not available).
    Given the paranoia of IT departments, in some companies updating may be a cumbersome task. However, try not to be more than two releases behind the current one. The worst I have ever seen was a fifteen-year-old one…

[*]Is the software you use most validated?
☐ Cross-validated with other software: 30%
☐ Comparison with sample size tables: 29%
☐ IQ (Installation Qualification acc. to procedures provided by the vendor): 23%
☐ No: 23%
☐ PQ (Performance Qualification): 16%
☐ OQ Type 1 (Operational Qualification acc. to procedures provided by the vendor): 14%
☐ Partly (i.e., only some of the procedures): 10%
☐ OQ Type 2 (Operational Qualification acc. to own pre-specified procedures): 6%
☐ Other approach (please specify): 5%
  • Interesting that almost ¼ of respondents reported that the software is not validated. That is only fine if you are a regulator yourself. If you are a sponsor, I would require that the CRO’s software is validated. If you are with a CRO, I recommend doing something (at least compare with sample size tables). Some respondents answered that the software passed IQ and OQ Type 1.
    I would never trust validation routines provided by the vendor alone (a.k.a. rubbish in, rubbish out).

[*]Were you ever asked by a regulatory agency about software validation?
☐ No: 82%
☐ Yes: 15%
☐ Optional: If you answered “Yes”, please give the year
    Four answers: 2019 (2), 2018, 2017
  • Seemingly regulators don’t care much. IMHO, regulators and members of IECs should care (when assessing the protocol). According to ICH E9 “The number of subjects in a clinical trial should always be large enough to provide a reliable answer to the questions addressed.” Seemingly some – flawed – software routines give a higher than required sample size. That’s nice for the applicant but ethically questionable.

[*]Do you repeat the estimation in-house if provided by an external entity (CRO, sponsor, consultant)?
☐ Always: 55%
☐ Regularly: 21%
☐ Sometimes: 17%
☐ Never: 7%
  • Trust is good, control is better. (Russian proverb)
    I suggest always repeating the estimation. It takes a minute and prevents surprises later.

[*]Do you perform a Sensitivity Analysis in order to assess the impact on power if in the study values (e.g., T/R-ratio, CV, number of dropouts) will deviate from assumptions?
○ Always: 37%
○ Sometimes: 34%
○ Never: 14%
○ I don’t know what a Sensitivity Analysis is: 11%
  • A sensitivity analysis is recommended by ICH E9 (Section 3.5) and E9(R1). The 14% who reported never performing one are possibly believers in the “carved in stone” approach (i.e., that the assumed values are the true ones and will be exactly realized in the study). That’s extremely risky, esp. if the T/R-ratio turns out to be worse than assumed. The impact on power is massive. Hence, I suggest always performing a sensitivity analysis (as 37% already do).
    If you use PowerTOST, I recommend its functions pa.ABE() and pa.scABEL() as a starter. Examples are given in the vignette.

[*]Do you increase the estimated sample size according to the expected dropout rate?
○ Yes (chosen by management): 37%
○ Yes (formula: n’ = n × (100 + dropout-rate in %) / 100): 29%
○ Yes (formula: n’ = n / (100 – dropout-rate in %) × 100): 22%
○ Yes (as provided by the software; I don’t know the formula): 9%
○ No (since the impact on power is limited): 3%
  • It is bad that management increases the sample size (based on gut feeling, reading tea leaves, budget, or what?). IMHO, FARTSSIE’s Boss Button is not a good idea. To quote Stephen Senn:
         Power. That which statisticians are always calculating but never have.
    Regrettably, 29% fell into the trap and used the wrong formula to adjust the sample size n’: if one faces exactly the anticipated dropout-rate and all other assumptions are correct, the number of eligible subjects can be too low. The dropout-rate is based on dosed subjects and hence, the 22% used the correct formula (see also this post).
    For the 9% who don’t know what the software does: At least in PASS the correct formula is implemented.

[*]Please give general problems that you faced in sample size estimation.
☐ Estimated sample size was substantially smaller/larger than expected (compared to PARs / other studies): 41%
☐ Result of re-assessment differed from the estimate given (by CRO, sponsor, consultant): 37%
☐ Software, version, setup not given (by CRO, sponsor, consultant): 28%
☐ Other (please give a short description): 10%
  • Differences from PARs are not uncommon, and neither are differences from other studies (if not your own). The clinical setup might differ, as might the bioanalytical method. Another good reason to perform a sensitivity analysis. Perhaps your own assumptions were too optimistic? Sometimes one’s own estimation gives a much larger sample size than the study in the PAR had. It might well be that it passed by sheer luck. Always check the power of the study.
    Amazing that the sample size estimation differed from the external one in so many cases. If the information (software, version, setup) is not given, one is fishing in the dark (which one is correct?). Ask for it.

[*]Did you face problems with the software you use most?
☐ No: 56%
☐ Planned design not available: 20%
☐ User manual insufficient (too short/verbose, methods not/poorly documented, lacking/outdated references, …): 16%
☐ Only one design-variant provided (although alternatives exist): 9%
☐ Methods based on simulations not reproducible (e.g., for reference-scaling): 9%
☐ Operation is complicated: 8%
☐ Other (please specify): 4%
  • Soothing that the majority is happy with the software.
    I ask myself: What did the 20% do (since, e.g., the partial replicate is not available in PASS)? Trust the external estimate? If it’s commercial software, I suggest asking the vendor for an implementation (and crossing fingers). The same holds for authors of free and open source software. In my experience they are more responsive than vendors of commercial software.
    IMHO, user manuals are a weak point of any software.
    PowerTOST’s simulation methods are reproducible since a fixed seed is issued by default (don’t change to setseed = FALSE). If you use your own code, make sure to use a fixed seed as well.
    I came across debates between sponsors and CROs about designs because, e.g., PASS provides only three setups for replicate designs: ABBA|BAAB, AABB|BBAA|ABBA, and ABB|BAA. The second one should be avoided (FDA 2001, Appendix B 1). Since the design constants and degrees of freedom are identical, the first one covers all 4-period 2-sequence replicates (ABAB|BABA, ABBA|BAAB, and AABB|BBAA) and the third one both 3-period 2-sequence replicates (ABA|BAB and ABB|BAA).
    To be clear: Simulation-based methods for reference-scaling (HVD(P)s and NTIDs) are currently not implemented in any of the commercial packages, nor in FARTSSIE (since v2.5 the code for PowerTOST is given), EFG, or the R-package bear.
    Beauty is in the eye of the beholder. It’s a matter of taste whether clicking through menus or providing arguments in the R-console is considered complicated.
[/list]
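
For readers who want to see the idea behind the sensitivity analysis of Q7 without PowerTOST, here is a minimal sketch in Python. It is only an illustration under stated assumptions: a 2×2×2 crossover, BE limits of 80.00–125.00%, α 0.05, and the large sample (normal) approximation of power, which is optimistic for small sample sizes and not a substitute for exact methods. The function names `approx_power_tost` and `sensitivity` and the example values (CV 25%, T/R-ratio 0.95, n 28) are hypothetical and not taken from any package.

```python
from math import log, sqrt
from statistics import NormalDist


def approx_power_tost(cv, theta0, n, alpha=0.05, theta1=0.80, theta2=1.25):
    """Approximate TOST power (large sample approximation, 2x2x2 crossover).

    cv     : within-subject CV as a ratio (0.25 means 25%)
    theta0 : assumed T/R-ratio
    n      : total number of eligible subjects (assumed balanced sequences)
    """
    if n < 2:
        return 0.0
    sigma_w = sqrt(log(cv ** 2 + 1.0))      # within-subject SD on the log scale
    se = sigma_w * sqrt(2.0 / n)            # SE of the log-difference
    norm = NormalDist()
    z = norm.inv_cdf(1.0 - alpha)
    power = (norm.cdf((log(theta2) - log(theta0)) / se - z)
             + norm.cdf((log(theta0) - log(theta1)) / se - z)
             - 1.0)
    return max(power, 0.0)


def sensitivity(cv, theta0, n, worse_cv, worse_theta0, dropouts):
    """Power as planned and under three single-factor deviations."""
    return {
        "as planned": approx_power_tost(cv, theta0, n),
        "higher CV": approx_power_tost(worse_cv, theta0, n),
        "worse T/R-ratio": approx_power_tost(cv, worse_theta0, n),
        "dropouts": approx_power_tost(cv, theta0, n - dropouts),
    }


if __name__ == "__main__":
    # Hypothetical scenario: planned with CV 25%, T/R 0.95, n 28;
    # deviations: CV 30%, T/R 0.90, or four dropouts.
    for scenario, power in sensitivity(0.25, 0.95, 28, 0.30, 0.90, 4).items():
        print(f"{scenario:16s} {power:.3f}")
```

Each deviation is varied on its own while the others stay at their planned values, so the output shows which assumption hurts power most. In this scenario it is the T/R-ratio, in line with the comment on Q7.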

[list=1][*]The survey is not public (I’ve sent out invitations by e-mail). Some participants answered only the first question and skipped all the others. That’s not helpful.
[*]Not like in “free beer” but like in “free speech”.[/list]
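
The two dropout formulas of Q8 can be compared with a small sketch in Python. The function names and the example (n = 24, anticipated dropout-rate 15%) are hypothetical. Rounding up to the next whole subject is an assumption; rounding up further to a multiple of the number of sequences is not handled here.

```python
from math import ceil


def adjust_added(n, dropout_pct):
    """n' = n * (100 + dropout-rate) / 100, rounded up (the formula called wrong above)."""
    return ceil(n * (100 + dropout_pct) / 100)


def adjust_dosed(n, dropout_pct):
    """n' = n / (100 - dropout-rate) * 100, rounded up (rate relative to dosed subjects)."""
    return ceil(n * 100 / (100 - dropout_pct))


def expected_eligible(n_dosed, dropout_pct):
    """Expected number of eligible subjects if exactly the anticipated rate is realized."""
    return n_dosed * (100 - dropout_pct) / 100


if __name__ == "__main__":
    n, rate = 24, 15  # hypothetical estimated sample size and dropout-rate in %
    for label, adjust in (("added", adjust_added), ("dosed", adjust_dosed)):
        dosed = adjust(n, rate)
        print(f"{label}: dose {dosed}, expect {expected_eligible(dosed, rate):.2f} eligible")
```

With these numbers the first formula asks for 28 dosed subjects, of which 23.8 are expected to be eligible (fewer than the 24 required), whereas the second asks for 29 dosed subjects with 24.65 expected to be eligible.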

The IP is not recorded, only the country:
India (27), Russia (12), Germany (10), Spain (8), Czechia (7), Jordan (5), USA (5), Austria (2), China (2), Mexico (2), The Netherlands (2), Poland (2), Ukraine (2), Australia (1), Belarus (1), Brazil (1), Denmark (1), Egypt (1), France (1), Greece (1), Italy (1), Portugal (1), Slovenia (1), South Africa (1), Taiwan (1), Tanzania (1), Turkey (1), UK (1), Uruguay (1).

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz