Bioequivalence and Bioavailability Forum • Sims w.o. intermediate power

Sims w.o. intermediate power [Two-Stage / GS Designs]

posted by Helmut – Vienna, Austria, 2012-07-26 18:20 – Posting: # 8983

Dear Detlew & all!

❝ I've heard some rumour from regulators that you should not stop but recruit 2 more subjects for stage 2 in that case. But this lacks any scientific justification IMHO.

SCNR. I implemented your “rumour scheme”. In contrast to Potvin B, where a study might already fail in stage 1 (not BE at α 0.0294 and power ≥80%), here we must always continue to stage 2. Oh, only a few more subjects – presumably not an ethical concern to (some?) regulators… :not really:
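
To make the difference between the two schemes explicit, here is a minimal sketch of the stage 1 decision as I read it from the description above. The function name `stage1_decision` and its arguments are hypothetical, and the rule that stage 2 gets at least two subjects even if the re-estimated total does not exceed n₁ is my assumption about the “rumour scheme”, not a published algorithm.

```r
# Stage 1 decision for Potvin's Method B and for the "rumour scheme" (sketch).
# be1   : TRUE if BE is shown in stage 1 at alpha 0.0294
# pwr1  : power at alpha 0.0294 based on the stage 1 variance estimate
# n1    : stage 1 sample size
# n_est : total sample size re-estimated from stage 1 (alpha 0.0294, 80% power)
stage1_decision <- function(be1, pwr1, n1, n_est,
                            scheme = c("B", "rumour"), target = 0.80) {
  scheme <- match.arg(scheme)
  if (be1) return(list(decision = "pass", n2 = 0))
  # Only Method B may stop for failure when power is already sufficient
  if (scheme == "B" && pwr1 >= target)
    return(list(decision = "fail", n2 = 0))
  # Assumption: at least two more subjects (one per sequence) in stage 2
  list(decision = "stage 2", n2 = max(n_est - n1, 2))
}

# Stage 1 not BE although power was 83.6% with n1 = 24:
d_B      <- stage1_decision(FALSE, 0.836, 24, 24, scheme = "B")
d_rumour <- stage1_decision(FALSE, 0.836, 24, 24, scheme = "rumour")
```

With these inputs Method B stops with a failed study, whereas the “rumour scheme” continues with two more subjects.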

Example 1: CV 20%, T/R 0.95, α 0.0294, expected power with n₁ 24 is 83.6%. Sample size for a fixed design, 80% power, α 0.05 would be 20. Run on two machines (R 2.15.1, PowerTOST 0.9-10), 10⁶ simulations each.
Potvin B

            Ratio 1.25               │             Ratio 0.95
─────────────────────────────────────┼─────────────────────────────────────
                      % in   empiric │                       % in   empiric
  n  ( 5%, 50%, 95%) stage 2    α    │   n  ( 5%, 50%, 95%) stage 2   1-β
26.0  24   24   34    34.5   0.0320  │ 24.6  24   24   28     8.6    0.8810

“rumour scheme”

            Ratio 1.25               │             Ratio 0.95
─────────────────────────────────────┼─────────────────────────────────────
                      % in   empiric │                       % in   empiric
  n  ( 5%, 50%, 95%) stage 2    α    │   n  ( 5%, 50%, 95%) stage 2   1-β
27.2  26   26   34    97.1   0.0379  │ 24.7  24   24   28    16.3    0.9037
27.2  26   26   34    97.1   0.0379  │ 24.7  24   24   28    16.4    0.9030

Example 2: CV 30%, T/R 0.95, α 0.0294, expected power with n₁ 24 is 41.1%. Sample size for a fixed design, 80% power, α 0.05 would be 40. Sims as above.
Potvin B

            Ratio 1.25               │             Ratio 0.95
─────────────────────────────────────┼─────────────────────────────────────
                      % in   empiric │                       % in   empiric
  n  ( 5%, 50%, 95%) stage 2    α    │   n  ( 5%, 50%, 95%) stage 2   1-β
46.9  24   46   72    95.0   0.0475  │ 39.9  24   38   70    58.3    0.8305

“rumour scheme”

            Ratio 1.25               │             Ratio 0.95
─────────────────────────────────────┼─────────────────────────────────────
                      % in   empiric │                       % in   empiric
  n  ( 5%, 50%, 95%) stage 2    α    │   n  ( 5%, 50%, 95%) stage 2   1-β
46.8  26   46   72    97.1   0.0480  │ 39.8  24   36   70    58.8    0.8303
46.8  26   46   70    97.1   0.0482  │ 39.9  24   36   70    58.9    0.8303
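
The expected power with n₁ 24 and the fixed-design sample sizes quoted for both examples can be roughly reproduced in base R. This is only a sketch: it uses the noncentral t approximation of TOST power for a balanced 2×2 crossover with BE limits 0.80–1.25, not the exact method of PowerTOST, and the function names `power_tost_nct` and `sample_size_nct` are made up for illustration.

```r
# Approximate TOST power (2x2 crossover, log scale) via the noncentral t-distribution.
# Assumptions: balanced sequences, n = total sample size, BE limits theta1..theta2.
power_tost_nct <- function(alpha, CV, theta0, n, theta1 = 0.80, theta2 = 1.25) {
  s2    <- log(CV^2 + 1)            # within-subject variance on the log scale
  se    <- sqrt(2 * s2 / n)         # standard error of the T-R difference
  df    <- n - 2
  tcrit <- qt(1 - alpha, df)
  ncp1  <- (log(theta0) - log(theta1)) / se
  ncp2  <- (log(theta0) - log(theta2)) / se
  max(pt(-tcrit, df, ncp = ncp2) - pt(tcrit, df, ncp = ncp1), 0)
}

# Smallest even total n reaching the target power in a fixed-sample design
sample_size_nct <- function(alpha, CV, theta0, target = 0.80) {
  n <- 4
  while (power_tost_nct(alpha, CV, theta0, n) < target) n <- n + 2
  n
}

# Example 1 (CV 20%) and Example 2 (CV 30%), T/R 0.95
p1 <- power_tost_nct(alpha = 0.0294, CV = 0.20, theta0 = 0.95, n = 24)
p2 <- power_tost_nct(alpha = 0.0294, CV = 0.30, theta0 = 0.95, n = 24)
n1_fixed <- sample_size_nct(alpha = 0.05, CV = 0.20, theta0 = 0.95)
n2_fixed <- sample_size_nct(alpha = 0.05, CV = 0.30, theta0 = 0.95)
```

The results should land close to the 83.6% and 41.1% power and the fixed sample sizes of 20 and 40 given in the examples. For real work use `power.TOST()` and `sampleN.TOST()` of PowerTOST instead.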

Good news: In both examples the patient’s risk is preserved. But note that the empiric α is slightly lower for Method B than for the ‘powerless method’.¹ In other words, EMA’s ‘method’ is more liberal. In the first example about twice as many studies are sent to the second stage (at T/R 0.95: 16.3% vs. 8.6%; caused by the ‘n₂=n₁+2 rule’). Studies that would already fail in stage 1 according to Method B are forced into the second stage. :angry:

Differences in the second example are less pronounced, because n₁ is likely too small to claim BE already in the first stage.

Lesson learned: I always thought of Two-Stage designs as an opportunity to ‘save’ a study if assumptions about the variance turn out to be incorrect. I was never a friend of taking chances and running an interim analysis at a sample size where you would presumably fail and have to proceed to the second stage anyway.* That doubles the study time, and even if your assumptions were correct you have to pay the penalty in the total sample size (10–20% more subjects as compared to a fixed design). Since exhaustive simulations of the ‘rumour scheme’ are not published (has EMA performed any at all, or is this just a ‘gut-feeling approach’?), IMHO it would require simulations for every single study [sic] to demonstrate that α_emp. ≤0.05.²

P.S.: Thanks to Detlew for finding a bug in my code. ;-)

¹ Instead of ‘powerless’ can we introduce the term ‘weak’?
² For 10⁶ simulations the significance limit of the one-sided exact binomial test is 0.05035995. Only an α_emp. larger than this critical value is significantly >0.05. This is also mentioned in the note below Table I of Potvin’s paper. If you are patient and opt for 10⁷ sims: 0.05011351… For R-freaks:
sims <- 1E6
x    <- 0.05
binom.test(x*sims, sims, alternative='less')$conf.int[2] # upper confidence limit = critical value

* For similar reasons I fail to understand the application of O’Brien/Fleming in BE.
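
The critical value from the binomial test above can be applied directly to the empiric α values reported in the tables (Ratio 1.25). A small sketch; the helper `crit` and the vector names are my own, and for the ‘rumour scheme’ at CV 30% I took the larger of the two runs.

```r
# Upper limit of the one-sided 95% CI of the exact binomial test
# for a true alpha of 0.05 and a given number of simulations
crit <- function(nsims, alpha = 0.05)
  binom.test(round(alpha * nsims), nsims, alternative = "less")$conf.int[2]

# Empiric alpha values as reported in this post (10^6 sims each)
alpha_emp <- c(B_CV20 = 0.0320, rumour_CV20 = 0.0379,
               B_CV30 = 0.0475, rumour_CV30 = 0.0482)

limit    <- crit(1e6)
inflated <- alpha_emp > limit   # TRUE would flag a significant inflation
```

None of the four values exceeds the limit, which is just another way of saying that the patient’s risk is preserved in these two examples.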

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

