Sims w.o. intermediate power [Two-Stage / GS Designs]

posted by Helmut – Vienna, Austria, 2012-07-26 18:20 – Posting: # 8983

Dear Detlew & all!

❝ I've heard some rumour from regulators that you should not stop but recruit 2 more subjects for stage 2 in that case. But this lacks any scientific justification IMHO.


SCNR. Implemented your “rumour scheme”. In contrast to Potvin B, where a study might already fail in stage 1 (not BE at α 0.0294 and power ≥80%), we must continue to stage 2. Oh, only a few more subjects – presumably not an ethical concern to (some?) regulators…
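A minimal sketch of how the two schemes differ at the end of stage 1, as I read the description above. The function name, its arguments, and the returned labels are hypothetical and only serve the illustration; it is assumed that the ‘rumour scheme’ behaves like Method B whenever the stage 1 power is below 80%.

```r
# Hypothetical helper: what happens after the stage 1 analysis (alpha 0.0294)?
#   be1    : TRUE if BE was shown in stage 1
#   power1 : power estimated from the stage 1 data
stage1_decision <- function(be1, power1, scheme = c("Potvin B", "rumour")) {
  scheme <- match.arg(scheme)
  if (be1) return("stop: BE")                       # both schemes pass here
  if (power1 >= 0.80) {                             # not BE despite sufficient power
    if (scheme == "Potvin B") return("stop: fail")  # Method B gives up
    return("stage 2: 2 more subjects")              # 'rumour scheme' carries on
  }
  "stage 2: re-estimated sample size"               # assumed identical in both schemes
}

stage1_decision(be1 = FALSE, power1 = 0.836, scheme = "Potvin B")
stage1_decision(be1 = FALSE, power1 = 0.836, scheme = "rumour")
```

The only branch where the schemes diverge is ‘not BE, power ≥80%’: Method B stops with a failure, the ‘rumour scheme’ adds two subjects and tries again.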


Example 1: CV 20%, T/R 0.95, α 0.0294, expected power with n1 24 is 83.6%. Sample size for a fixed design, 80% power, α 0.05 would be 20. Run on two machines (R 2.15.1, PowerTOST 0.9-10), 10⁶ simulations each.
Potvin B
            Ratio 1.25               │             Ratio 0.95             
─────────────────────────────────────┼─────────────────────────────────────
                      % in   empiric │                       % in   empiric
  n  ( 5%, 50%, 95%) stage 2    α    │   n  ( 5%, 50%, 95%) stage 2   1-β 
26.0  24   24   34    34.5   0.0320  │ 24.6  24   24   28     8.6    0.8810

“rumour scheme”
            Ratio 1.25               │             Ratio 0.95             
─────────────────────────────────────┼─────────────────────────────────────
                      % in   empiric │                       % in   empiric
  n  ( 5%, 50%, 95%) stage 2    α    │   n  ( 5%, 50%, 95%) stage 2   1-β 
27.2  26   26   34    97.1   0.0379  │ 24.7  24   24   28    16.3    0.9037
27.2  26   26   34    97.1   0.0379  │ 24.7  24   24   28    16.4    0.9030




Example 2: CV 30%, T/R 0.95, α 0.0294, expected power with n1 24 is 41.1%. Sample size for a fixed design, 80% power, α 0.05 would be 40. Sims as above.
Potvin B
            Ratio 1.25               │             Ratio 0.95             
─────────────────────────────────────┼─────────────────────────────────────
                      % in   empiric │                       % in   empiric
  n  ( 5%, 50%, 95%) stage 2    α    │   n  ( 5%, 50%, 95%) stage 2   1-β 
46.9  24   46   72    95.0   0.0475  │ 39.9  24   38   70    58.3    0.8305

“rumour scheme”
            Ratio 1.25               │             Ratio 0.95             
─────────────────────────────────────┼─────────────────────────────────────
                      % in   empiric │                       % in   empiric
  n  ( 5%, 50%, 95%) stage 2    α    │   n  ( 5%, 50%, 95%) stage 2   1-β 
46.8  26   46   72    97.1   0.0480  │ 39.8  24   36   70    58.8    0.8303
46.8  26   46   70    97.1   0.0482  │ 39.9  24   36   70    58.9    0.8303
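The expected powers and fixed-design sample sizes quoted for both examples can be checked roughly in base R. This is only a sketch: it uses the noncentral-t approximation of TOST power for a balanced 2×2 crossover, not PowerTOST’s exact method, so results may differ in the last digit. The function names power_nct and n_fixed are made up for the illustration.

```r
# Noncentral-t approximation of TOST power, balanced 2x2 crossover, n = total subjects
power_nct <- function(alpha, CV, theta0, n, theta1 = 0.80, theta2 = 1.25) {
  se    <- sqrt(log(CV^2 + 1)) * sqrt(2 / n)  # SE of the difference of log-means
  df    <- n - 2
  tcrit <- qt(1 - alpha, df)
  ncp1  <- (log(theta0) - log(theta1)) / se   # noncentrality, lower limit
  ncp2  <- (log(theta0) - log(theta2)) / se   # noncentrality, upper limit
  max(0, pt(-tcrit, df, ncp2) - pt(tcrit, df, ncp1))
}

# Smallest even n reaching the target power in a fixed design
n_fixed <- function(CV, theta0 = 0.95, alpha = 0.05, target = 0.80) {
  n <- 4
  while (power_nct(alpha, CV, theta0, n) < target) n <- n + 2  # keep sequences balanced
  n
}

# Stage 1 power at the adjusted alpha (compare with the quoted 83.6% and 41.1%)
round(100 * power_nct(0.0294, CV = 0.20, theta0 = 0.95, n = 24), 1)
round(100 * power_nct(0.0294, CV = 0.30, theta0 = 0.95, n = 24), 1)

# Fixed-design sample sizes at alpha 0.05 (compare with the quoted 20 and 40)
n_fixed(CV = 0.20)
n_fixed(CV = 0.30)

# With PowerTOST installed, the exact counterparts would be e.g.
# PowerTOST::power.TOST(alpha = 0.0294, CV = 0.20, theta0 = 0.95, n = 24)
# PowerTOST::sampleN.TOST(alpha = 0.05, CV = 0.20, theta0 = 0.95, targetpower = 0.80)
```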



Good news: In both examples the patient’s risk is preserved. But note that the empiric α is slightly lower for Method B than for the ‘powerless method’.¹ In other words, EMA’s ‘method’ is more liberal. In the first example twice as many studies are sent to the second stage (caused by the ‘n2=n1+2 rule’). Studies which would already fail in stage 1 according to Method B are forced into the second stage.

Differences in the second example are less pronounced, because n1 is likely too small to claim BE already in the first stage.

Lesson learned: I always thought of Two-Stage designs as offering an opportunity to ‘save’ a study if assumptions about the variance turn out to be incorrect. I was never a friend of playing chances and going with an interim analysis at a sample size where you would presumably fail and have to proceed to the second stage anyway.* That doubles the study time, and even if your assumptions were correct you have to pay the penalty in the total sample size (10–20% more subjects as compared to a fixed design). Since exhaustive simulations of the ‘rumour scheme’ are not published (has EMA performed any at all, or is this just a ‘gut-feeling approach’?), IMHO it would require simulations for every single study [sic] to demonstrate that αemp. ≤0.05.²

P.S.: Thanks to Detlew for finding a bug in my code. ;-)


  1. Instead of ‘powerless’ can we introduce the term ‘weak’?
  2. For 10⁶ simulations the critical value of the one-sided exact binomial test at the 0.05 level is 0.05035995. Only an αemp. larger than this critical value is significantly >0.05. This is also mentioned in the note below Table I of Potvin’s paper. If you are patient and opt for 10⁷ sims: 0.05011351… For R-freaks:
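    sims <- 1E6   # number of simulations
    x    <- 0.05  # nominal level of significance
    # upper limit of the one-sided 95% CI = critical value for the empiric alpha
    binom.test(x*sims, sims, alternative='less')$conf.int[2]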
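As a usage sketch, the same critical value can be wrapped in a small function and compared with the empiric α values from the tables above. The function name alpha_crit and the labels of the vector are hypothetical; the α values are copied from the Ratio 1.25 columns (the larger of the two runs where they differ).

```r
# Critical value: upper limit of the one-sided 95% Clopper-Pearson interval
alpha_crit <- function(sims, alpha = 0.05) {
  binom.test(round(alpha * sims), sims, alternative = "less")$conf.int[2]
}

crit <- alpha_crit(1e6)

# Empiric alphas from the simulations above (10^6 sims each)
alpha_emp <- c(PotvinB_CV20 = 0.0320, rumour_CV20 = 0.0379,
               PotvinB_CV30 = 0.0475, rumour_CV30 = 0.0482)

# TRUE would flag a significant inflation of the type I error
alpha_emp > crit
```

None of the four values exceeds the critical value, which is the numerical backing for ‘the patient’s risk is preserved’.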


Dif-tor heh smusma 🖖🏼 Long live Ukraine!
Helmut Schütz