Sims w.o. intermediate power [Two-Stage / GS Designs]

posted by Helmut – Vienna, Austria, 2012-07-26 18:20 – Posting: # 8983

Dear Detlew & all!

❝ I've heard some rumour from regulators that you should not stop but recruit 2 more subjects for stage 2 in that case. But this lacks any scientific justification IMHO.


SCNR. I implemented your “rumour scheme”. In contrast to Potvin B, where a study might already fail in stage 1 (not BE at α 0.0294 and power ≥80%), here we must continue to stage 2. Oh, only a few more subjects – presumably not an ethical concern to (some?) regulators… :not really:
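To put the two decision rules side by side, here is a minimal sketch in R. The function name `stage1_decision` and its arguments are hypothetical (not from any package). For the ‘rumour scheme’ I assume that everything except the stop-for-futility branch is identical to Method B.

```r
# Hypothetical helper: what happens after the interim analysis (alpha 0.0294)?
#   be     : TRUE if BE was shown in stage 1
#   power  : power estimated from the stage 1 variance (GMR 0.95)
#   scheme : "B" (Potvin Method B) or "rumour"
stage1_decision <- function(be, power, scheme = c("B", "rumour")) {
  scheme <- match.arg(scheme)
  if (be) return("stop: pass")
  if (power >= 0.80) {
    # Method B stops here and the study fails;
    # the 'rumour scheme' recruits 2 more subjects instead
    if (scheme == "B") return("stop: fail")
    return("stage 2: n2 = 2")
  }
  # power < 80%: sample size re-estimation (ASSUMED identical in both schemes)
  "stage 2: n2 re-estimated"
}

stage1_decision(be = FALSE, power = 0.836, scheme = "B")
stage1_decision(be = FALSE, power = 0.836, scheme = "rumour")
```

Only the branch ‘not BE and power ≥80%’ differs between the two schemes, so that is the only place where the additional stage 2 studies can come from.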


Example 1: CV 20%, T/R 0.95, α 0.0294, expected power with n1 24 is 83.6%. Sample size for a fixed design, 80% power, α 0.05 would be 20. Run on two machines (R 2.15.1, PowerTOST 0.9-10), 10^6 simulations each.
Potvin B
            Ratio 1.25               │             Ratio 0.95             
─────────────────────────────────────┼─────────────────────────────────────
                      % in   empiric │                       % in   empiric
  n  ( 5%, 50%, 95%) stage 2    α    │   n  ( 5%, 50%, 95%) stage 2   1-β 
26.0  24   24   34    34.5   0.0320  │ 24.6  24   24   28     8.6    0.8810

“rumour scheme”
            Ratio 1.25               │             Ratio 0.95             
─────────────────────────────────────┼─────────────────────────────────────
                      % in   empiric │                       % in   empiric
  n  ( 5%, 50%, 95%) stage 2    α    │   n  ( 5%, 50%, 95%) stage 2   1-β 
27.2  26   26   34    97.1   0.0379  │ 24.7  24   24   28    16.3    0.9037
27.2  26   26   34    97.1   0.0379  │ 24.7  24   24   28    16.4    0.9030




Example 2: CV 30%, T/R 0.95, α 0.0294, expected power with n1 24 is 41.1%. Sample size for a fixed design, 80% power, α 0.05 would be 40. Sims as above.
Potvin B
            Ratio 1.25               │             Ratio 0.95             
─────────────────────────────────────┼─────────────────────────────────────
                      % in   empiric │                       % in   empiric
  n  ( 5%, 50%, 95%) stage 2    α    │   n  ( 5%, 50%, 95%) stage 2   1-β 
46.9  24   46   72    95.0   0.0475  │ 39.9  24   38   70    58.3    0.8305

“rumour scheme”
            Ratio 1.25               │             Ratio 0.95             
─────────────────────────────────────┼─────────────────────────────────────
                      % in   empiric │                       % in   empiric
  n  ( 5%, 50%, 95%) stage 2    α    │   n  ( 5%, 50%, 95%) stage 2   1-β 
46.8  26   46   72    97.1   0.0480  │ 39.8  24   36   70    58.8    0.8303
46.8  26   46   70    97.1   0.0482  │ 39.9  24   36   70    58.9    0.8303
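
The expected power at the interim and the fixed-design sample sizes quoted for both examples can be roughly reproduced in base R. This is only a sketch. It uses the noncentral t approximation of TOST power for a 2×2 crossover, not the exact method behind PowerTOST's `power.TOST()` and `sampleN.TOST()`, so the last digit may differ. The function names `power_nct` and `n_fixed` are made up for this illustration.

```r
# Approximate power of TOST in a 2x2 crossover (noncentral t approximation;
# NOT the exact method, so small deviations from PowerTOST are possible)
power_nct <- function(alpha, CV, theta0, n, theta1 = 0.80, theta2 = 1.25) {
  se    <- sqrt(log(CV^2 + 1)) * sqrt(2 / n)  # SE of the difference (log scale)
  df    <- n - 2
  tcrit <- qt(1 - alpha, df)
  ncp1  <- (log(theta0) - log(theta1)) / se
  ncp2  <- (log(theta0) - log(theta2)) / se
  pwr   <- pt(-tcrit, df, ncp2) - pt(tcrit, df, ncp1)
  max(0, pwr)
}

# Smallest even n giving at least the target power in a fixed design
n_fixed <- function(CV, theta0 = 0.95, alpha = 0.05, target = 0.80) {
  n <- 4
  while (power_nct(alpha, CV, theta0, n) < target) n <- n + 2
  n
}

# Interim power (%) with n1 = 24 at alpha 0.0294 (the post gives 83.6 and 41.1)
round(100 * sapply(c(0.20, 0.30), function(cv) power_nct(0.0294, cv, 0.95, 24)), 1)
# Fixed-design sample sizes at alpha 0.05 (the post gives 20 and 40)
sapply(c(0.20, 0.30), n_fixed)
```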



Good news: In both examples the patient’s risk is preserved. But note that the empiric α of Method B is slightly lower than that of the ‘powerless method’.¹ In other words, EMA’s ‘method’ is more liberal than Method B. In the first example about twice as many studies are sent to the second stage (caused by the ‘n2=n1+2 rule’). Studies which would already fail in stage 1 according to Method B are forced into the second stage. :angry:
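
Whether an empiric α is significantly above 0.05 can be checked against the upper limit of the one-sided exact binomial confidence interval. A small sketch, using the empiric α values at T/R 1.25 from the tables above (for the ‘rumour scheme’ the larger of the two runs) and 10^6 simulations per scenario; the variable names are mine.

```r
sims  <- 1e6    # simulations per scenario (as stated in the post)
alpha <- 0.05   # nominal level
# upper limit of the one-sided 95% CI: only empiric alphas above it
# are significantly > 0.05
crit  <- binom.test(alpha * sims, sims, alternative = "less")$conf.int[2]

# empiric alpha at T/R 1.25, copied from the tables
emp <- c("Ex. 1, Method B" = 0.0320, "Ex. 1, rumour" = 0.0379,
         "Ex. 2, Method B" = 0.0475, "Ex. 2, rumour" = 0.0482)
data.frame(empiric = emp, inflated = emp > crit)
```

None of the four values exceeds the limit, which is all that ‘preserved’ means here; it says nothing about other CVs or stage 1 sample sizes.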

Differences in the second example are less pronounced, because n1 is likely too small to claim BE already in the first stage.

Lesson learned: I always thought of Two-Stage designs as an opportunity to ‘save’ a study if assumptions about the variance turn out to be incorrect. I was never a friend of gambling and going for an interim analysis at a sample size where you would presumably fail and have to proceed to the second stage anyway.* That doubles the study time, and even if your assumptions were correct you pay a penalty in the total sample size (10–20% more subjects compared to a fixed design). Since exhaustive simulations of the ‘rumour scheme’ are not published (has EMA performed any at all, or is this just a ‘gut-feeling approach’?), IMHO it would require simulations for every single study [sic] to demonstrate that the empiric α is ≤0.05.²

P.S.: Thanks to Detlew for finding a bug in my code. ;-)


  1. Instead of ‘powerless’ can we introduce the term ‘weak’?
  2. For 10^6 simulations the one-sided significance limit based on the exact binomial test is 0.05035995. Only an empiric α larger than this critical value is significantly >0.05. This is also mentioned in the note below Table I of Potvin’s paper. If you are patient and opt for 10^7 sims: 0.05011351… For R-freaks:
    sims <- 1E6   # number of simulations
    x    <- 0.05  # nominal level
    # upper limit of the one-sided 95% CI = critical value for the empiric alpha
    binom.test(x*sims, sims, alternative='less')$conf.int[2]


Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz
The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz