Helmut
Vienna, Austria, 2020-05-20 22:39
Posting: # 21455
Views: 2,204

 Sample Size Estimation in Bioequivalence 💯☑️ [Surveys]

Dear all,

on the occasion of recent discussions about software used for sample size estimation in BE I created a

Survey

with ten questions.

Attendance is anonymous and limited to one participation per device (i.e., same IP-address). If your institution has only one IP (like Novartis redirecting its 120,000+ employees to a single one in Basel…), sorry.
The survey should take about three minutes to complete.

The questions are:
  1. How do you estimate the sample size?
    ○ Sample size tables
    ○ Software
    ○ Both
  2. Which software do you use?
    ☐ Commercial (off-the-shelf, e.g., SAS Proc Plan, NQuery Advisor, PASS, StudySize, …)
    ☐ Open source (e.g., R-packages like PowerTOST, bear, …)
    ☐ Free (e.g., FARTSSIE, EFG, …)
    ☐ Web-based
    ☐ In-house (e.g., own SAS-macros, R, C, Excel-template, …)
    ☐ Optional: Please give the software you use most (incl. version, year of release)
      ┌──────────┐
      └──────────┘
  3. How often do you update the software you use most?
    ○ Never
    ○ Occasionally
    ○ Regularly
    ○ I don’t want to disclose this information
  4. Is the software you use most validated?
    ☐ IQ (Installation Qualification acc. to procedures provided by the vendor)
    ☐ OQ Type 1 (Operational Qualification acc. to procedures provided by the vendor)
    ☐ OQ Type 2 (Operational Qualification acc. to own pre-specified procedures)
    ☐ PQ (Performance Qualification)
    ☐ Comparison with sample size tables
    ☐ Cross-validated with other software
    ☐ Partly (i.e., only some of the procedures)
    ☐ No
    ☐ I don’t want to disclose this information
    ☐ Other approach (please specify)
  5. Were you ever asked by a regulatory agency about software validation?
    ☐ Yes
    ☐ No
    ☐ I don’t want to disclose this information
    ☐ Optional: If you answered “Yes”, please give the year
      ┌─────┐
      └─────┘
  6. Do you repeat the estimation in-house if provided by an external entity (CRO, sponsor, consultant)?
    ☐ Always
    ☐ Regularly
    ☐ Sometimes
    ☐ Never
  7. Do you perform a Sensitivity Analysis in order to assess the impact on power if the values in the study (e.g., T/R-ratio, CV, number of dropouts) deviate from the assumptions?
    ○ Never
    ○ Sometimes
    ○ Always
    ○ I don’t know what a Sensitivity Analysis is
    ○ I don’t want to disclose this information
  8. Do you increase the estimated sample size according to the expected dropout rate?
    ○ Yes (formula: n’ = n × (100 + dropout-rate in %) / 100)
    ○ Yes (formula: n’ = n / (100 – dropout-rate in %) × 100)
    ○ Yes (as provided by the software; I don’t know the formula)
    ○ Yes (chosen by management)
    ○ No (since the impact on power is limited)
  9. Please give general problems that you faced in sample size estimation.
    ☐ Estimated sample size was substantially smaller/larger than expected
    (compared to PARs / other studies)

    ☐ Result of re-assessment differed from the estimate given (by CRO, sponsor, consultant)
    ☐ Software, version, setup not given (by CRO, sponsor, consultant)
    ☐ Other (please give a short description)
      ┌───────────────────────────────┐
      └───────────────────────────────┘
  10. Did you face problems with the software you use most?
    ☐ Planned design not available
    ☐ Only one design-variant provided (although alternatives exist)
    ☐ Methods based on simulations not reproducible (e.g., for reference-scaling)
    ☐ Operation is complicated
    ☐ User manual insufficient
    (too short/verbose, methods not/poorly documented, lacking/outdated references, …)

    ☐ No
    ☐ Other (please specify)
      ┌───────────────────────────────┐
      └───────────────────────────────┘
Feel free to participate. It could help all of us to understand the current landscape. If you close the browser in the middle of the survey, you can come back later – from the same device – to complete it.
If sample size estimation is not your cup of tea, consider inviting a responsible colleague.

I will post the results in mid-June.

In the meantime I suggest this one.


65 respondents as of 2020-06-05 23:20 CEST. Average time taken four minutes. THX!

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Helmut
Vienna, Austria, 2020-06-03 12:36
@ Helmut
Posting: # 21497
Views: 357

 Preliminary evaluation

Dear all,

here are the results from 65 respondents as of 05 June. THX to all participants (the survey is still active).¹

Below, each question’s answers are given as the percentage of complete responses in decreasing order. Since multiple choices were possible in some questions, the percentages can add up to >100%. Undisclosed information is excluded.
Below each question I give my very personal view of the outcome.
  1. How do you estimate the sample size?
    ○ Software
        69%
    ○ Both
        29%
    ○ Sample size tables
          2%

    • If one uses solely sample size tables, how does one deal with combinations (T/R-ratio, CV, power) that are not tabulated? Power is highly nonlinear and therefore interpolation is difficult.

  2. Which software do you use?
    ☐ Open source (e.g., R-packages like PowerTOST, bear, …)
        59%
    ☐ Free (e.g., FARTSSIE, EFG, …)
        38%
    ☐ Commercial (off-the-shelf, e.g., SAS Proc Plan, NQuery Advisor, PASS, StudySize, …)
        31%
    ☐ In-house (e.g., own SAS-macros, R, C, Excel-template, …)
        17%
    ☐ Web-based
          5%
    ☐ Optional: Please give the software you use most (incl. version, year of release)
        PowerTOST (4), FARTSSIE (3), PASS (2), NQuery (1), SAS (1), SPSS (1)

      Open source and free software² is in the lead. Is that only because it comes at no cost? No commercial software provides sample sizes for the reference-scaling methods out of the box (though it is possible to write code in SAS or MATLAB). One respondent simulated higher-order designs and incomplete block designs (>4 treatments), which are not available in any software.
      It would be interesting to know which web-based tools are used. I know only of ones for parallel groups based on the large-sample approximation.

  3. How often do you update the software you use most?
    ○ Occasionally
        42%
    ○ Regularly
        41%
    ○ Never
        16%

    • It is not a good idea never to update the software (16%). Not only might bugs have been corrected, new methods might have been added in the meantime (note that in Q10, 21% reported that a planned design was not available).
      Given the paranoia of IT departments, updating may be a cumbersome task in some companies. However, try to be no more than two releases behind the current one. The worst I have ever seen was fifteen years old…

  4. Is the software you use most validated?
    ☐ Comparison with sample size tables
        39%
    ☐ Cross-validated with other software
        34%
    ☐ No
        23%
    ☐ IQ (Installation Qualification acc. to procedures provided by the vendor)
        22%
    ☐ PQ (Performance Qualification)
        18%
    ☐ OQ Type 1 (Operational Qualification acc. to procedures provided by the vendor)
        15%
    ☐ Partly (i.e., only some of the procedures)
        13%
    ☐ OQ Type 2 (Operational Qualification acc. to own pre-specified procedures)
          5%
    ☐ Other approach (please specify)
          5%

    • Interesting that 23% of respondents reported that their software is not validated. That is only fine if you are a regulator yourself. If you are a sponsor, I would require that the CRO’s software is validated. If you are with a CRO, I recommend doing at least something (e.g., comparing with sample size tables). Some respondents answered that the software passed IQ and OQ Type 1. I would not trust validation routines provided by the vendor alone (aka rubbish in, rubbish out).

  5. Were you ever asked by a regulatory agency about software validation?
    ☐ No
        88%
    ☐ Yes
          8%
    ☐ Optional: If you answered “Yes”, please give the year
        Two answers: 2018, 2019

    • Seems that regulators don’t care much. IMHO, regulators and members of IECs should (in assessing the protocol). According to ICH E9 “The number of subjects in a clinical trial should always be large enough to provide a reliable answer to the questions addressed.” Seemingly some – flawed – software routines give a higher than required sample size. That’s nice for the applicant but ethically questionable.

  6. Do you repeat the estimation in-house if provided by an external entity (CRO, sponsor, consultant)?
    ☐ Always
        60%
    ☐ Regularly
        23%
    ☐ Sometimes
        15%
    ☐ Never
          2%

    • Trust is good, control is better (Russian proverb).
      I suggest always repeating the estimation. It takes a minute and prevents surprises later.

  7. Do you perform a Sensitivity Analysis in order to assess the impact on power if the values in the study (e.g., T/R-ratio, CV, number of dropouts) deviate from the assumptions?
    ○ Always
        37%
    ○ Sometimes
        37%
    ○ Never
        15%
    ○ I don’t know what a Sensitivity Analysis is
          5%

    • A sensitivity analysis is recommended by ICH E9 (Section 3.5) and E9(R1). The 15% reporting never to perform one possibly believe in the “carved in stone” approach (i.e., that the assumed values are the true ones and will be realized exactly in the study). That’s extremely risky, especially if the T/R-ratio turns out to be worse than assumed: the impact on power is massive. Hence, I suggest always performing a sensitivity analysis (as 37% already do).
      If you use PowerTOST, I recommend its functions pa.ABE() and pa.scABEL() as a starter. Examples are given in the vignette.
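For illustration only, here is a minimal Python sketch of such a sensitivity check. It uses the large-sample normal approximation of TOST power for a 2×2×2 crossover, not PowerTOST’s exact method (which is based on Owen’s Q); the function name and the approximation are mine.

```python
from math import log, sqrt
from statistics import NormalDist

def power_tost(gmr: float, cv: float, n: int, alpha: float = 0.05,
               theta1: float = 0.80, theta2: float = 1.25) -> float:
    """Approximate power of the TOST procedure in a 2x2x2 crossover
    (large-sample normal approximation; NOT PowerTOST's exact method)."""
    z = NormalDist().inv_cdf(1 - alpha)
    sigma_w = sqrt(log(cv * cv + 1))   # within-subject SD on the log scale
    se = sigma_w * sqrt(2 / n)         # SE of the difference of log-means
    phi = NormalDist().cdf
    pw = (phi((log(theta2) - log(gmr)) / se - z)
          + phi((log(gmr) - log(theta1)) / se - z) - 1)
    return max(0.0, pw)

# Sensitivity: how fast does power erode if the T/R-ratio
# turns out worse than the assumed 0.95?
for gmr in (0.95, 0.93, 0.90, 0.87):
    print(f"GMR {gmr}: power {power_tost(gmr, cv=0.30, n=40):.3f}")
```

Running this for CV 30% and n 40 shows power dropping steeply as the GMR moves away from 1 – exactly the effect a sensitivity analysis is meant to expose before the study starts.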

  8. Do you increase the estimated sample size according to the expected dropout rate?
    ○ Yes (formula: n’ = n × (100 + dropout-rate in %) / 100)
        32%
    ○ Yes (chosen by management)
        31%
    ○ Yes (formula: n’ = n / (100 – dropout-rate in %) × 100)
        25%
    ○ Yes (as provided by the software; I don’t know the formula)
          8%
    ○ No (since the impact on power is limited)
          3%

    • Sadly, one third fell into the trap and used the wrong formula to adjust the sample size: if one faces exactly the anticipated dropout-rate and all other assumptions hold, the number of eligible subjects can be too low. The dropout-rate is based on the dosed subjects, and hence the 25% who used n’ = n / (100 – dropout-rate in %) × 100 are correct (see also this post).
      It is unfortunate that management increases the sample size (based on what: gut feeling, reading tea leaves, the budget?). FARTSSIE’s Boss Button is not a good idea.
      For the 8% who don’t know what the software does: at least in PASS the correct formula is implemented.
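The trap is easy to demonstrate with numbers. A hedged sketch (the helper names are mine) comparing both formulas for an estimated sample size of 40 and an anticipated dropout-rate of 15%:

```python
import math

def adjust_wrong(n: int, dropout_pct: float) -> int:
    # n' = n * (100 + rate) / 100  -- the common trap
    return math.ceil(n * (100 + dropout_pct) / 100)

def adjust_correct(n: int, dropout_pct: float) -> int:
    # n' = n / (100 - rate) * 100  -- rate applies to DOSED subjects
    return math.ceil(n / (100 - dropout_pct) * 100)

n, rate = 40, 15.0
for label, n_adj in (("wrong", adjust_wrong(n, rate)),
                     ("correct", adjust_correct(n, rate))):
    # eligible subjects if the dropout-rate is hit exactly
    eligible = math.floor(n_adj * (100 - rate) / 100)
    print(f"{label}: dose {n_adj}, eligible {eligible} (need {n})")
```

With the wrong formula one doses 46 subjects and ends up with only 39 eligible (below the required 40); the correct formula gives 48 dosed and 40 eligible. (In practice one would additionally round up to a multiple of the number of sequences.)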

  9. Please give general problems that you faced in sample size estimation.
    ☐ Result of re-assessment differed from the estimate given (by CRO, sponsor, consultant)
        43%
    ☐ Estimated sample size was substantially smaller/larger than expected
         (compared to PARs / other studies)

        37%
    ☐ Software, version, setup not given (by CRO, sponsor, consultant)
        31%
    ☐ Other (please give a short description)
        12%

    • Amazing that the sample size estimation differed from the external one in so many cases. If the software, version, and setup are not given, one is fishing in the dark (which estimate is correct?). Ask for them.
      Note that differences to PARs are not uncommon, and to other studies (if not your own) as well. The clinical setup might differ, as might the bioanalytical method. Another good reason to perform a sensitivity analysis. Perhaps one’s own assumptions were too optimistic?
      Sometimes one’s own estimation gives a much larger sample size than the study of the PAR had. It might well be that it passed by sheer luck. Always check the power of that study.

  10. Did you face problems with the software you use most?
    ☐ No
        59%
    ☐ Planned design not available
        21%
    ☐ User manual insufficient
        (too short/verbose, methods not/poorly documented, lacking/outdated references, …)

        14%
    ☐ Methods based on simulations not reproducible (e.g., for reference-scaling)
        11%
    ☐ Only one design-variant provided (although alternatives exist)
        11%
    ☐ Operation is complicated
          9%
    ☐ Other (please specify)
          5%

    • Soothing that the majority is happy with the software.
      I’m asking myself: what have the 21% done whose planned design was not available (since, e.g., the partial replicate is not available in PASS)? Trusted the external estimate? If it’s commercial software, I suggest asking the vendor for an implementation (and crossing fingers). The same holds for authors of free and open source software; in my experience they are more responsive.
      IMHO, user manuals are a weak point of any software.
      PowerTOST’s simulation methods are reproducible since a fixed seed is issued by default (don’t change to setseed = FALSE). If you use your own code, make sure to use a fixed seed as well.
      I came across debates between sponsors and CROs about designs because, e.g., PASS provides only three setups for replicate designs: ABBA|BAAB, AABB|BBAA|ABBA|BAAB, and ABB|BAA. The second one – and ABAB|BABA|ABBA|BAAB as well – should be avoided (FDA 2001, Appendix B 1). Since the design constants and degrees of freedom are identical, the first one covers all 4-period 2-sequence replicates (ABAB|BABA, ABBA|BAAB, and AABB|BBAA) and the third one both 3-period 2-sequence replicates (ABA|BAB and ABB|BAA).
      To be clear: simulation-based methods for reference-scaling are currently not implemented in any of the commercial packages, nor in FARTSSIE, EFG, or the R-package bear.
      Beauty is in the eye of the beholder. It is a matter of taste whether clicking through menus or providing arguments in the R-console is considered complicated.
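The point about fixed seeds can be shown in a few lines. A toy stand-in in Python (not PowerTOST’s actual algorithm, and the function name is mine): with a fixed seed, a simulation-based estimate is bit-for-bit reproducible; with the system clock as seed, it is not.

```python
import random

def simulate_power(seed: int, n_sims: int = 10_000) -> float:
    """Toy simulation-based 'power' estimate: count successes of a
    Bernoulli(0.8) event. The point is only reproducibility."""
    rng = random.Random(seed)   # fixed seed -> reproducible stream
    hits = sum(rng.random() < 0.8 for _ in range(n_sims))
    return hits / n_sims

run1 = simulate_power(seed=123456)
run2 = simulate_power(seed=123456)  # same seed, identical result
assert run1 == run2
```

The same principle applies to one’s own SAS or R simulation code: set the seed explicitly, and report it, so that an assessor can reproduce the estimated sample size exactly.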


  1. Since the survey is not public (the category is not indexed by search engines) and I’ve sent out invitations by e-mail: in the last week some people answered only the first question and skipped all the others. That’s not helpful.
  2. Not like in “free beer” but like in “free speech”.

Cheers,
Helmut Schütz

The Bioequivalence and Bioavailability Forum is hosted by BEBAC Ing. Helmut Schütz.