Helmut
★★★

Vienna, Austria,
2014-05-31 18:20

Posting: # 13024
Views: 11,078

## How to find a suit­able adjusted α? [Two-Stage / GS Designs]

Dear all,

have you ever wondered how to find a suitable adjusted α if the desired combination of target power, expected T/R-ratio (and even the acceptance range) is not given in any of the many publications? In such a case we have to validate the framework within the desired range of n1/CV-combinations in order to demonstrate that the overall type I error is preserved.
Here is my recipe – inspired by Fuglsang’s¹ ideas:
1. Explore the desired grid of n1/CV-combinations.
Use Pocock’s α 0.0294 as a starter; run 10⁵ simulations for speed reasons.
Find the location of maximum inflation. For variants of “Method C” the maximum inflation is generally seen at the minimum n1 and CV ~10–25%; for variants of “Method B” it can be anywhere…
2. At this location simulate a vector of adjusted alphas (10⁶ sim’s). Lower limit 0.025 (Bonferroni), upper limit α_max/2 + 10% (a feasible guess in most cases).
Perform a linear regression of type I errors vs. α_adj (THX to Anders!); the intersection of the regression line with 0.05 is the “best” α_adj.
3. Validate the grid with the chosen α_adj. I suggest a narrower grid of n1/CV-combinations (at least around the planned n1 and expected CV). 10⁶ sim’s per grid-point are mandatory.
In R by means of package Power2Stage:

```r
library(Power2Stage)

##############
# First step #
##############
n1.min  <- 12
n1.max  <- 60
n1.step <- 12
CV.min  <- 0.1
CV.max  <- 0.8
CV.step <- 0.05
method  <- "B"
GMR     <- 0.95 # assumed T/R-ratio
theta0  <- 1.25 # upper BE-limit for simulating alpha
target  <- 0.80 # target power (generally 0.8–0.9)
usePE   <- FALSE
alpha   <- c(0.0294, 0.0294) # Pocock's alpha.
                             # Change only if you know what you are doing!
Nmax    <- Inf               # No upper sample size limit (futility).
                             # Change only if you know what you are doing!
                             # Generally alpha is not affected, but have a
                             # close look at power!
# Potvin's defaults: n1 12-60 (step 12), CV 0.1-1.0 (step 0.1),
# GMR 0.95, target 0.8
n1    <- seq(n1.min, n1.max, by=n1.step)
CV    <- seq(CV.min, CV.max, by=CV.step)
nsims <- 1e5
pBE   <- NULL
for (j in 1:length(CV)) {
  for (k in 1:length(n1)) {
    pBE <- c(pBE, power.2stage(method=method, alpha=alpha,
             n1=n1[k], GMR=GMR, CV=CV[j], targetpower=target,
             pmethod="nct", usePE=usePE, Nmax=Nmax, theta0=theta0,
             nsims=nsims, print=F, details=F)$pBE)
  }
}
grid <- matrix(data=pBE, nrow=length(CV), ncol=length(n1), byrow=T,
               dimnames=list(CV, n1))
pos       <- as.numeric(which(grid == max(grid), arr.ind=TRUE))
CV.pos    <- as.numeric(rownames(grid)[pos[1]])
n1.pos    <- as.numeric(colnames(grid)[pos[2]])
alpha.max <- grid[pos[1], pos[2]]
grid
cat(paste0("\nFor adj. \u03b1 of ", alpha[1], " a maximum inflation of ",
    alpha.max, " is detected at CV=", CV.pos*100, "% and n1=", n1.pos, ".\n"))
LL <- 0.025           # Bonferroni (most conservative)
UL <- alpha.max/2*1.1 # guess!
if (UL <= alpha[1]) UL <- alpha[1]*1.1 # modification for small inflation

###############
# Second step #
###############
nsims <- 1e6
x     <- seq(LL, UL, length.out=10)
pBE   <- NULL
for (j in 1:length(x)) {
  pBE <- c(pBE, power.2stage(method=method, alpha=rep(x[j], 2),
           n1=n1.pos, GMR=GMR, CV=CV.pos, targetpower=target,
           pmethod="nct", usePE=usePE, Nmax=Nmax, theta0=theta0,
           nsims=nsims, print=F, details=F)$pBE)
}
mod      <- lm(pBE ~ x)
alpha.ad <- rep(round(as.numeric((0.05-coef(mod)[1])/coef(mod)[2]), 4), 2)
sig      <- binom.test(0.05*nsims, nsims, alternative="less")$conf.int[2]
plot(x, pBE, xlab="adj. alpha", ylab="emp. alpha",
     main=paste0("\u201cMethod ", method, "\u201d, Inflation at CV = ",
                 CV.pos*100, "% and n1 = ", n1.pos))
abline(h=0.05, col="red")              # target alpha
abline(h=sig, lty="dotted", col="red") # significance limit
abline(mod)                            # regression line
abline(v=alpha[1], lty="dotted")       # first guess (e.g., 0.0294)
abline(v=alpha.ad[1], col="blue")      # estimated adjusted alpha
cat("estimated adjusted \u03b1:", alpha.ad[1], "\n")

##############
# Third step #
##############
pBE <- NULL
for (j in 1:length(CV)) {
  for (k in 1:length(n1)) {
    pBE <- c(pBE, power.2stage(method=method, alpha=alpha.ad,
             n1=n1[k], GMR=GMR, CV=CV[j], targetpower=target,
             pmethod="nct", usePE=usePE, Nmax=Nmax, theta0=theta0,
             nsims=nsims, print=F, details=F)$pBE)
  }
}
grid <- matrix(data=pBE, nrow=length(CV), ncol=length(n1), byrow=T,
               dimnames=list(CV, n1))
pos       <- as.numeric(which(grid == max(grid), arr.ind=TRUE))
CV.pos    <- as.numeric(rownames(grid)[pos[1]])
n1.pos    <- as.numeric(colnames(grid)[pos[2]])
alpha.max <- grid[pos[1], pos[2]]
cat(paste0("\nValidation of chosen adjusted \u03b1 (", alpha.ad[1], "):\n",
    "\u201cMethod ", method, "\u201d, assumed T/R ", GMR*100, "%, ",
    "target power ", target*100, "%\n"))
grid
cat(paste0("\nWith an adj. \u03b1 of ", alpha.ad[1], " a maximum inflation of ",
    alpha.max, " is seen at CV=", CV.pos*100, "% and n1=", n1.pos, ".\n"))
if (alpha.max <= sig) {
  cat(paste0("\nSignificance limit for ", nsims, " simulations: ",
      round(sig, 5), "; no significant inflation seen.",
      "\nThe chosen adjusted \u03b1 is justified within the assessed range of ",
      "stage 1 sample sizes (", n1.min, "\u002d", n1.max, ") and CVs (",
      CV.min*100, "\u002d", CV.max*100, "%).\n"))
} else {
  cat(paste0("\nSignificance limit for ", nsims, " simulations: ",
      round(sig, 5), "; significant inflation seen.\n"))
  if (alpha.max < 0.052) {
    cat(paste0("However, the maximum inflation of ", alpha.max,
        " does not exceed Potvin\u2019s \u201cnegligible inflation\u201d of 0.052. ",
        "Therefore, the chosen adjusted \u03b1 is justified within the assessed ",
        "range of stage 1 sample sizes (", n1.min, "\u002d", n1.max,
        ") and CVs (", CV.min*100, "\u002d", CV.max*100, "%).\n"))
  } else {
    cat(paste("The maximum inflation of", alpha.max,
        "exceeds Potvin\u2019s \u201cnegligible inflation\u201d of 0.052.",
        "Therefore, a smaller adjusted \u03b1 should be tried!\n"))
  }
}
```

I got:
```
       12      24      36      48      60
0.1  0.03995 0.02985 0.02976 0.02978 0.02988
0.15 0.05234 0.04226 0.03234 0.02987 0.02988
0.2  0.05408 0.05068 0.04561 0.03912 0.03279
0.25 0.05011 0.05369 0.05047 0.04814 0.04406
0.3  0.04262 0.05398 0.05317 0.05117 0.04947
0.35 0.03668 0.05042 0.05272 0.05364 0.05152
0.4  0.03228 0.04438 0.05266 0.05389 0.05275
0.45 0.03112 0.03727 0.04858 0.05410 0.05367
0.5  0.02959 0.03287 0.04243 0.05190 0.05319
0.55 0.02940 0.03146 0.03642 0.04705 0.05209
0.6  0.02901 0.03020 0.03251 0.04075 0.04876
0.65 0.02888 0.03017 0.03036 0.03586 0.04347
0.7  0.02849 0.03006 0.02983 0.03249 0.03820
0.75 0.02839 0.02964 0.02867 0.03095 0.03348
0.8  0.02847 0.03000 0.02953 0.03005 0.03117
```

For adj. α of 0.0294 a maximum inflation of 0.0541 is detected at CV=45% and n1=48.

estimated adjusted α: 0.0274

```
        12       24       36       48       60
0.1  0.037296 0.027386 0.027369 0.027309 0.027285
0.15 0.049151 0.039538 0.030016 0.027398 0.027285
0.2  0.051255 0.047733 0.042950 0.036368 0.030491
0.25 0.047249 0.049676 0.047458 0.044682 0.041669
0.3  0.040369 0.050304 0.049422 0.047641 0.045841
0.35 0.034418 0.047104 0.050399 0.049285 0.048116
0.4  0.031017 0.040444 0.049617 0.049995 0.049238
0.45 0.029327 0.034318 0.045297 0.049996 0.050201
0.5  0.028601 0.030571 0.039118 0.047798 0.050117
0.55 0.028188 0.029017 0.033528 0.042823 0.048769
0.6  0.027837 0.028173 0.030263 0.036796 0.045379
0.65 0.027771 0.027985 0.028479 0.032260 0.039910
0.7  0.027626 0.027742 0.027986 0.029644 0.034807
0.75 0.027706 0.027803 0.027488 0.028406 0.031237
0.8  0.027586 0.027720 0.027595 0.027846 0.028965
```

With an adj. α of 0.0274 a maximum inflation of 0.051255 is seen at CV=20% and n1=12. Significance limit for 1e+06 simulations: 0.05036; significant inflation seen. However, the maximum inflation of 0.051255 does not exceed Potvin’s “negligible inflation” of 0.052. Therefore, the chosen adjusted α is justified within the assessed range of stage 1 sample sizes (12–60) and CVs (10–80%).

Get a decent cup of coffee – it takes a while (on my machine 11 min for Step 1 and 10 min for Step 2). Depending on the chosen grid expect up to a couple of daysᵃ for Step 3. The simple grid of 75·10⁶ sim’s took six hours to complete.
For “Method B”, T/R-ratio 0.9, 90% power I got 0.0274. Slightly larger than the 0.0269 Fuglsang reported² for “Method C”. Not surprising, since B is always more conservative than C. In other words, a slightly larger α is expected to lead to a similar inflation.
BTW, for “Method C” I got 0.0268 (10 & 17 min). IMHO, a nice agreementᵇ (different software: C vs. R, different seeds of the pseudo-random generator, different power-methods: shifted t vs. noncentral t).

Don’t forget the third step – it is the only thing regulators want to see (EMA: “appropriate steps must be taken to preserve the overall type I error of the experiment” and “the choice of how much alpha to spend at the interim analysis is at the company’s discretion”). If you want to introduce a futility criterion (e.g., an upper total sample size) or even fiddle with usePE=TRUE, simulating power is crucial in order to avoid a nasty surprise.
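Simulating power instead of the type I error needs only one change to the calls in the script above: set theta0 to the assumed T/R-ratio rather than to the BE-limit. A minimal sketch – the n1/CV/alpha values here are purely illustrative, not a recommendation:

```r
library(Power2Stage)
# Simulating at theta0 = 1.25 (a BE-limit) yields the empiric type I error;
# simulating at theta0 = GMR (the assumed T/R-ratio) yields the empiric power.
# Illustrative setup: "Method B", n1 = 24, CV = 0.3, Pocock's 0.0294.
res <- power.2stage(method="B", alpha=c(0.0294, 0.0294), n1=24,
                    GMR=0.95, CV=0.3, targetpower=0.80, pmethod="nct",
                    theta0=0.95,   # <- T/R-ratio, not the BE-limit
                    nsims=1e5, print=FALSE, details=FALSE)
res$pBE                            # empiric overall power
```

With a futility criterion (Nmax < Inf, or fiddling with usePE=TRUE) compare this number against the target power – that is where the nasty surprises show up.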

1. Fuglsang A. Controlling type I errors for two-stage bioequivalence study designs. Clin Res Regul Aff. 2011;28(4):100–5. doi:10.3109/10601333.2011.631547
2. Fuglsang A. Sequential Bioequivalence Trial Designs with Increased Power and Controlled Type I Error Rates. AAPS J. 2013;15(3):659–61. doi:10.1208/s12248-013-9475-5

a. Hint: Split the grid and run more than one R-session simultaneously. Two sessions on my machine suck up only 25% of the CPU’s resources.

b. A comparison of “Method C”, 40·10⁶ sim’s. HB = my homebrew (0.0268), AF = Anders Fuglsang (0.0269):
```
       12             24             36             48             60
CV     HB     AF      HB     AF      HB     AF      HB     AF      HB     AF
0.1  0.0488 0.0484  0.0496 0.0498  0.0499 0.0498  0.0498 0.0500  0.0497 0.0499
0.2  0.0500 0.0501  0.0468 0.0472  0.0444 0.0445  0.0456 0.0459  0.0487 0.0486
0.3  0.0392 0.0394  0.0492 0.0496  0.0485 0.0486  0.0468 0.0474  0.0451 0.0452
0.4  0.0304 0.0307  0.0394 0.0399  0.0482 0.0486  0.0491 0.0494  0.0484 0.0485
0.5  0.0279 0.0282  0.0301 0.0296  0.0379 0.0382  0.0466 0.0469  0.0489 0.0492
0.6  0.0272 0.0273  0.0276 0.0275  0.0293 0.0296  0.0357 0.0362  0.0441 0.0442
0.7  0.0271 0.0271  0.0271 0.0273  0.0273 0.0272  0.0287 0.0290  0.0339 0.0339
0.8  0.0269 0.0267  0.0269 0.0272  0.0268 0.0272  0.0271 0.0273  0.0283 0.0284
```
31 (78%) smaller, 2 (5%) identical, 7 (18%) larger.

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
d_labes
★★★

Berlin, Germany,
2014-06-02 08:28

@ Helmut
Posting: # 13025
Views: 9,083

## Suitable code for suit­able adjusted α

Dear Helmut!

Thanks for sharing this code!

Just one minor comment or better question:
How is your experience in regulatory acceptance of Potvin's 'acceptable' alpha-inflation of 0.052? If I remember correctly there was some rumour that even a smaller value of the empirical alpha has to be seen as inflation. That would mean that we had to down weight the adj. alpha to some extent.

Regards,

Detlew
Helmut
★★★

Vienna, Austria,
2014-06-02 14:44

@ d_labes
Posting: # 13026
Views: 9,291

## Handling inflation

Dear Detlew!

» Thanks for sharing this code!

My pleasure! I think that we still need some improvements:
1. In my example the largest inflation was seen at 48/0.45 with 0.05410 and the second largest at 12/0.20 with 0.05408. However, in step 3 the largest inflation showed up at 12/0.20. It might well be that the result of step 1 was an artifact due to only 10⁵ sim’s performed. With 10⁶ I got 0.054605 for 12/0.20 and 0.05374 for 48/0.45, confirming that the location of the maximum picked by my code in step 1 was due to chance.
2. In most methods (except Karalis/Macheras) the “ridge” of the “type I error-surface” follows roughly the diagonal of the grid – from low n1/low CV to high n1/high CV (see the contour-plots of Potvin’s Methods B and C to the right).
Maybe it would make sense to split the first step:
1. Get preliminary estimates for all combos obtained from 10⁵ sim’s like before.*
2. Repeat for the largest inflation observed in each column (i.e., the n1); this time with 10⁶ sim’s. Should be closer to the “true” maximum and avoid the trap mentioned above.
3. If I repeat the second step at 12/0.20 I get an adjusted α of 0.0268 – which is substantially smaller than 0.0274 (and identical to what I got for “Method C”, which is counterintuitive, IMHO).

Overall α (step 3) for this combo decreases from 0.051255 to 0.049795… If we follow this track I expect lower alphas than in any of the published papers. Does that make sense?
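Point 2 of the proposed split could be sketched along these lines (a toy matrix stands in for the step-1 grid here; in practice use the one produced by the script above):

```r
# Toy stand-in for the step-1 grid of empiric alphas (rows = CV, cols = n1).
grid <- matrix(c(0.0540, 0.0300,
                 0.0480, 0.0535,
                 0.0310, 0.0520),
               nrow=3, byrow=TRUE,
               dimnames=list(c(0.2, 0.3, 0.4), c(12, 24)))
col.max <- apply(grid, 2, which.max)          # row of the maximum per column
CV.cand <- as.numeric(rownames(grid)[col.max])
n1.cand <- as.numeric(colnames(grid))
cbind(n1=n1.cand, CV=CV.cand)                 # candidates for 1e6-sim re-runs
# Each (n1, CV) pair would then be re-simulated with nsims=1e6 via
# power.2stage() before picking the overall maximum.
```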

» How is your experience in regulatory acceptance of Potvin's 'acceptable' alpha-inflation of 0.052?

Mixed. Some European (!) regulators don’t like (‼) TSDs at all, but accept them if following “Method B” (quote: “according to the guideline…”). In one case Austria’s AGES asked for a posteriori confirmation of “lacking inflation” of “Method C” based on the actual sample size and CV in the study (was 0.0494 with a 95% CI of 0.0490–0.0498). Dunno what might have happened if 0.05 < α ≤ 0.052. According to Chinese whispers ≤0.051 is considered acceptable. Why? Dunno. The maximum inflation in Potvin’s paper for “Method B” is 0.0485 and for “Method C” 0.051. Maybe someone read the wrong column.
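The quoted CI is simply the binomial confidence interval of the empiric α. A sketch, assuming the AGES case was based on 10⁶ simulated studies (the actual number isn’t stated above):

```r
nsims <- 1e6                     # assumed number of simulated studies
TIE   <- 0.0494                  # observed empiric type I error
# Exact (Clopper-Pearson) 95% CI of the empiric type I error:
ci    <- binom.test(round(TIE*nsims), nsims)$conf.int
round(ci, 4)                     # roughly 0.0490 ... 0.0498
```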

» If I remember correctly there was some rumour that even a smaller value of the empirical alpha has to be seen as inflation.

If I recall correctly that’s the personal opinion of the Austrian member of EMA’s Biostatistics Working Party. Recently a member of the PKWP told me how he made peace with TSDs – after years of lurking doubt: “The inflation would be relevant only if the CI in the study covers exactly 80–125%. Since in real life the CI is narrower, the actual patient’s risk – even if there were a small inflation due to the method – is likely ≪5%. So I don’t bother any more.” A pragmatic approach.

» That would mean that we had to down weight the adj. alpha to some extent.

By throwing away all published papers and increasing the downloads of Power2Stage? Of course PQRI’s Sequential Design Working Group’s “negligible inflation” (≤0.052) is arbitrary – as are many other rules we have to observe in BE. BTW, only Montague’s “Method D” scratches 0.052. Results of the publications:

```
                  αadj    αmax   sign. >0.05?
Potvin 2008 B     0.0294  0.0485
Potvin 2008 C     0.0294  0.0510  yes
Montague 2011 D   0.0280  0.0518  yes
Fuglsang 2013 B   0.0284  0.0501  no
Fuglsang 2013 D1  0.0274  0.0503  no
Fuglsang 2013 D2  0.0269  0.0501  no
Fuglsang 2014 B   0.0294  0.0478  no
Fuglsang 2014 C   0.0294  0.0506  yes
```

Unless the EMA makes a clear statement about their preferred maximum inflation (or at least someone receives a deficiency letter and hopefully reports here) I would not change anything.

* Running my original code with the setup of Potvin’s paper I “detect” a maximum inflation of 0.04882 at 36/0.40 (B) and 0.0513 at 12/0.20 (C) in the first step.
Good news: These locations agree with the ones reported.
Bad news: My guess of the upper limit for the regression is crap if there is only a small inflation with the α employed in the first step. I suggest this modification:

```r
UL <- alpha.max/2*1.1
if (UL <= alpha[1]) UL <- alpha[1]*1.1
```
I got:

Anders’ algo suggests 0.0304 for B and 0.0282 for C. Interesting. Validation under way.

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Helmut
★★★

Vienna, Austria,
2014-06-02 22:07

@ d_labes
Posting: # 13028
Views: 9,292

## Potvin revis(it)ed

Dear Detlew!

Below the results of my simulations (I used α 0.0304 for “Method B” and 0.0282 for “Method C”). Power2Stage is an amazing piece – 73 minutes for Methods B and C! It took the PQRI 1½ years to come up with their simulations in Compaq Visual Fortran.

Method B: Empiric type I error; Pot = Potvin’s 0.0294, HB = homebrew’s 0.0304.

```
       12             24             36             48             60
CV     Pot    HB      Pot    HB      Pot    HB      Pot    HB      Pot    HB
0.1  0.0297 0.0304  0.0294 0.0302  0.0294 0.0303  0.0292 0.0303  0.0294 0.0302
0.2  0.0463 0.0474  0.0320 0.0324  0.0294 0.0303  0.0292 0.0303  0.0297 0.0302
0.3  0.0437 0.0453  0.0475 0.0488  0.0397 0.0405  0.0324 0.0328  0.0296 0.0304
0.4  0.0344 0.0358  0.0433 0.0448  0.0485 0.0501  0.0458 0.0468  0.0409 0.0416
0.5  0.0309 0.0321  0.0338 0.0353  0.0420 0.0434  0.0484 0.0500  0.0483 0.0494
0.6  0.0297 0.0310  0.0307 0.0320  0.0333 0.0343  0.0399 0.0419  0.0466 0.0485
0.7  0.0294 0.0304  0.0299 0.0315  0.0306 0.0316  0.0328 0.0337  0.0381 0.0397
0.8  0.0292 0.0302  0.0298 0.0310  0.0303 0.0309  0.0303 0.0313  0.0318 0.0330
0.9  0.0289 0.0300  0.0298 0.0309  0.0296 0.0309  0.0297 0.0308  0.0300 0.0311
1.0  0.0291 0.0300  0.0298 0.0308  0.0298 0.0307  0.0297 0.0307  0.0301 0.0307
```

Method C: Empiric type I error; Pot = Potvin’s 0.0294, HB = homebrew’s 0.0282.

```
       12             24             36             48             60
CV     Pot    HB      Pot    HB      Pot    HB      Pot    HB      Pot    HB
0.1  0.0496 0.0499  0.0500 0.0496  0.0500 0.0499  0.0501 0.0498  0.0504 0.0497
0.2  0.0510 0.0500  0.0490 0.0491  0.0499 0.0499  0.0495 0.0498  0.0500 0.0497
0.3  0.0441 0.0421  0.0492 0.0475  0.0477 0.0471  0.0494 0.0492  0.0502 0.0496
0.4  0.0346 0.0330  0.0435 0.0410  0.0489 0.0469  0.0469 0.0457  0.0470 0.0457
0.5  0.0311 0.0299  0.0339 0.0322  0.0418 0.0395  0.0480 0.0458  0.0483 0.0461
0.6  0.0299 0.0288  0.0307 0.0297  0.0331 0.0314  0.0399 0.0379  0.0472 0.0445
0.7  0.0294 0.0280  0.0298 0.0290  0.0308 0.0291  0.0325 0.0310  0.0380 0.0357
0.8  0.0292 0.0279  0.0301 0.0287  0.0299 0.0287  0.0302 0.0291  0.0319 0.0304
0.9  0.0285 0.0279  0.0298 0.0286  0.0296 0.0284  0.0298 0.0285  0.0301 0.0288
1.0  0.0290 0.0276  0.0295 0.0287  0.0297 0.0285  0.0297 0.0285  0.0297 0.0284
```

Maximum inflation in red.
With the new alphas no (!) significant inflation for either method. Largest observed in “Method B” 0.050111 (at 36/0.4) and in “Method C” 0.049984 (at 12/0.2).

I’m getting the impression that if the PQRI had taken a closer look right from the start (instead of coming up with a ‘one size fits all’ α and playing with a “negligible inflation”), maybe we could have avoided all those fruitless discussions of the last years.

BTW, in Montague’s paper I read “The simulations were performed using R […].” Nice to know. Then “A different randomly selected seed was used for each scenario.” Why? Shall we switch to setseed=FALSE in Power2Stage?
Anders’ algo suggests 0.027 (instead of 0.028) for “Method D”. Sim’s running.

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
d_labes
★★★

Berlin, Germany,
2014-06-03 08:47
(edited by d_labes on 2014-06-03 09:28)

@ Helmut
Posting: # 13030
Views: 9,058

## Potvin revis(it)ed

Dear Helmut!

Again!
That's worth a paper.

» I’m getting the impression that if PQRI would have had a closer look right from the start (instead of coming up with a ‘one size fits all’ α and playing with a “negligible inflation”), maybe we could have avoided all those effectless discussions we had the last years.

I think their approach was driven by the ability to perform only a limited number of study simulations per day, especially as the number of subjects rises.
Thus they took Pocock's 'universal' constant and simply looked at what happened – instead of taking the approach for what it is: these nominal alphas lack any theoretical justification, are used on a purely empirical basis for the BE decision in a crossover with interim sample-size adaptation, and thus should be adapted to this problem.
As they did later with Method D and in Montague’s paper.
And as Anders did in his epoch-making paper: Fuglsang A. Controlling type I errors for two-stage bioequivalence study designs. Clin Res Regul Aff. 2011;28(4):100–5.

» BTW, in Montague’s paper I read “The simulations were performed using R […].” Nice to know.

R rulez !

» Then “A different randomly selected seed was used for each scenario.” Why? Shall we switch to setseed=FALSE in Power2Stage?

As I read this sentence, they used a different seed – randomly selected, but fixed – for each scenario (CV, n1), to be protected against artefacts resulting from the starting point of the pseudo-random number generator.
IMHO using a different but fixed seed or a single fixed seed doesn't make a big difference if we simulate with 1E6 sims.
Setting setseed=FALSE in Power2Stage would have the drawback of differing results even for one scenario (CV, n1) if run again.
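The point about seeds is ordinary PRNG behaviour and easy to demonstrate in base R (a generic illustration, not Power2Stage-specific):

```r
set.seed(123456)            # fixed seed: identical stream on every run
x1 <- rnorm(5)
set.seed(123456)
x2 <- rnorm(5)
identical(x1, x2)           # TRUE - a re-run reproduces the simulations
x3 <- rnorm(5)              # no reset: a different stream, as with setseed=FALSE
identical(x1, x3)           # FALSE - results differ between runs
```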

BTW: What is Anders’ algo? His C programs?

Regards,

Detlew
Helmut
★★★

Vienna, Austria,
2014-06-03 13:49

@ d_labes
Posting: # 13032
Views: 9,004

## Montague revis(it)ed

Dear Detlew!

» That's worth a paper.

Who’s going to write one?

» I think their approach was driven by […]

Agree. Maybe their choice was (partly?) strategic. Since Pocock’s 0.0294 has a tradition of decades of accepted use in phase III, they opted for a smooth introduction: “We used it for such a long time and it seems to work here as well…” In Section 5 they wrote: “It is our understanding that the FDA has accepted studies with designs like those considered here.” If I recall it correctly they were referring to an NDA (bonus question: Guess the α!). Donald Schuirmann wrote the biostatistical assessment based on simulations he performed and accepted the study. I have to dig out the reference.

» As they have done later in method D and Montague’s paper.

Yep – but not consistently enough. Their maximum inflation of 0.0518 is the largest of all papers published so far. Again I assume they tried to keep it simple: “We have shown in our first paper that for a T/R of 0.95 D’s 0.0280 is more conservative than C’s 0.0294. Let’s give it a try with a T/R of 0.9. Hey, <0.052 – let’s publish.”

Method D: Empiric type I error; Mont = Montague’s 0.0280, HB = homebrew’s 0.0270 (runtime one hour).

```
       12             24             36             48             60
CV     Mont   HB      Mont   HB      Mont   HB      Mont   HB      Mont   HB
0.1  0.0498 0.0494  0.0501 0.0496  0.0504 0.0499  0.0503 0.0498  0.0498 0.0497
0.2  0.0518 0.0500  0.0475 0.0463  0.0477 0.0469  0.0499 0.0494  0.0498 0.0497
0.3  0.0414 0.0396  0.0506 0.0490  0.0489 0.0473  0.0470 0.0451  0.0449 0.0439
0.4  0.0322 0.0307  0.0414 0.0395  0.0502 0.0483  0.0500 0.0484  0.0492 0.0469
0.5  0.0293 0.0283  0.0316 0.0304  0.0401 0.0382  0.0484 0.0465  0.0509 0.0487
0.6  0.0286 0.0277  0.0292 0.0282  0.0313 0.0296  0.0381 0.0360  0.0462 0.0441
0.7  0.0286 0.0273  0.0281 0.0276  0.0288 0.0275  0.0307 0.0292  0.0358 0.0340
0.8  0.0280 0.0273  0.0283 0.0273  0.0287 0.0271  0.0284 0.0273  0.0301 0.0286
0.9  0.0281 0.0271  0.0281 0.0275  0.0282 0.0270  0.0281 0.0271  0.0285 0.0273
1.0  0.0282 0.0271  0.0282 0.0273  0.0282 0.0271  0.0282 0.0270  0.0279 0.0273
```

No significant inflation with 0.0270. Largest observed 0.049994 (at 12/0.2; same location as reported by Montague).

» IMHO using a different but fixed seed or a single fixed seed doesn't make a big difference if we simulate with 1E6 sims.
» Setting setseed=FALSE in Power2Stage would have the drawback of differing results even for one scenario (CV, n1) if run again.

Agree.

» BTW: What is Anders algo?

The linear regression of type I errors vs. α_adj in order to find the α_adj leading to a TIE of 0.05 (implemented in the second step of my code). I borrowed the idea from his 2011 paper you quoted. I’m afraid regulators likely will not accept his method of data-driven adjustment of α in the interim step (EMA: “The plan to use a two-stage approach must be pre-specified in the protocol along with the adjusted significance levels to be used for each of the analyses”).

» His C programs?

I have just a few of them. Given your last improvements Power2Stage is almost as fast as his compiled stuff.

PS: Seems to be a popular topic. ~100 visits/day so far…

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Helmut
★★★

Vienna, Austria,
2014-10-13 14:53

@ d_labes
Posting: # 13692
Views: 8,502

## Pocock’s “natural constant”

Dear Detlew,

have you ever wondered where the magick 0.0294 comes from? In R we can do better, of course:

```r
require(mvtnorm)
alpha  <- 0.05
mu     <- c(0, 0)
sigma  <- diag(2)
sigma[sigma == 0] <- 1/sqrt(2)
z      <- qmvnorm(1-0.05002333, tail="both.tails", mean=mu,
                  sigma=sigma)$quantile
p0     <- pmvnorm(lower=rep(-z, 2), upper=rep(z, 2), mean=mu, sigma=sigma)[1]
C      <- qmvnorm(1-alpha, tail="both.tails", mean=mu, sigma=sigma)$quantile
p1     <- pmvnorm(lower=rep(-C, 2), upper=rep(C, 2), mean=mu, sigma=sigma)[1]
alpha1 <- 2*(1-pnorm(C))
head   <- "%s"
body   <- "%s  %-10.7g  %-8.7g  %-8.7g  %+.7f%s"
title1 <- "\nPocock (1977) Table 1. \u03b1 = 0.05??\n"
title2 <- "N    \u03b1\u2019          z         \u03b1            RSE\n"
cat(paste0(
  sprintf(head, title1),
  sprintf(head, title2),
  sprintf(body, "2", 0.0294, 2.178, 1-p0, 100*(1-p0-0.05)/0.05, "%\n"),
  sprintf(body, "2", alpha1, C, 1-p1, 100*(1-p1-0.05)/0.05, "%\n")))
```

```
Pocock (1977) Table 1. α = 0.05??
N    α’          z         α            RSE
2  0.0294      2.178     0.05002322  +0.0464379%
2  0.02938572  2.178273  0.04999989  -0.0002221%
```

Nitpicking as usual. Compare at the location of maximum inflation …

```
Method C: alpha0= 0.05, alpha (s1/s2)= 0.02938572 0.02938572
Futility criterion Nmax= Inf
CV= 0.2; n(stage 1)= 12; GMR= 0.95
BE margins = 0.8 ... 1.25
GMR= 0.95 and mse of stage 1 in sample size est. used
1e+06 sims at theta0= 1.25 (p(BE)='alpha').
p(BE)   = 0.051252
p(BE) s1= 0.035768
pct studies in stage 2= 78.86%
Distribution of n(total)
- mean (range)= 22.9 (12 ... 98)
- percentiles
 5% 50% 95%
 12  22  40
```

… with the usual stuff:

```
Method C: alpha0= 0.05, alpha (s1/s2)= 0.0294 0.0294
Futility criterion Nmax= Inf
CV= 0.2; n(stage 1)= 12; GMR= 0.95
BE margins = 0.8 ... 1.25
GMR= 0.95 and mse of stage 1 in sample size est. used
1e+06 sims at theta0= 1.25 (p(BE)='alpha').
p(BE)   = 0.051247
p(BE) s1= 0.035777
pct studies in stage 2= 78.86%
Distribution of n(total)
- mean (range)= 22.9 (12 ... 98)
- percentiles
 5% 50% 95%
 12  22  40
```

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
ElMaestro
★★★

Belgium?,
2014-10-13 15:30

@ Helmut
Posting: # 13693
Views: 8,243

## Pocock’s “natural constant”

Hi Hötzi,

» N    α’          z         α            RSE
» 2  0.0294      2.178     0.05002322  +0.0464379%
» 2  0.02938572  2.178273  0.04999989  -0.0002221%

It's a scandal!!

Le tits now.

Best regards,
ElMaestro
d_labes
★★★

Berlin, Germany,
2014-10-14 08:56

@ Helmut
Posting: # 13695
Views: 8,202

## Another “natural constant”?

Dear Helmut,

» require(mvtnorm)
Require or not require, that's the question.

» alpha   <- 0.05
» mu      <- c(0, 0)
» sigma   <- diag(2)
» sigma[sigma == 0] <- 1/sqrt(2)
» z       <- qmvnorm(1-0.05002333, tail="both.tails", mean=mu, sigma=sigma)$quantile

Another magick number?

Regards,

Detlew
Helmut
★★★

Vienna, Austria,
2014-10-14 13:36

@ d_labes
Posting: # 13701
Views: 8,132

## Λ

Dear Detlew,

» » require(mvtnorm)
» Require or not require, that's the question.

Oh, I didn’t know that. I assumed that require() attaches a package only if it is not already attached. Should have RTFM, which states: “require is designed for use inside other functions …”

» » z <- qmvnorm(1-0.05002333, tail="both.tails", mean=mu, sigma=sigma)$quantile
»
» Another magick number?

Exactly. In Pocock’s paper both z and α’ are rounded (2.178, 0.0294). I had to introduce my magick number in order to end up with 2.178.
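The effect of the rounding can be checked in base R (a sketch; the unrounded pair is taken from the mvtnorm code above):

```r
# Pocock's published values are rounded to z = 2.178 and alpha' = 0.0294.
# Recomputing alpha' from the rounded z:
round(2*(1 - pnorm(2.178)), 4)   # 0.0294 - consistent with the paper
# ... whereas the unrounded constant is
2*(1 - pnorm(2.178273))          # approx. 0.02938572
```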

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes