Bioequivalence and Bioavailability Forum

Helmut
★★★

Vienna, Austria,
2014-05-31 20:20
(3975 d 19:04 ago)

Posting: # 13024
Views: 14,549

How to find a suitable adjusted α? [Two-Stage / GS Designs]

Dear all,

have you ever wondered how to find a suitable adjusted α if the desired combination of target power, expected T/R-ratio (and even the acceptance range) is not given in any of the many publications? In such a case we have to validate the framework within the desired range of n₁/CV-combinations in order to demonstrate that the overall type I error is preserved.
Here is my recipe – inspired by Fuglsang’s¹ ideas:

Explore the desired grid of n₁/CV-combinations.
Use Pocock’s α 0.0294 as a starter; run 10⁵ simulations for speed reasons.
Find the location of maximum inflation. For variants of “Method C” the maximum inflation is generally seen at the minimum n₁ and CV ~10–25%; for variants of “Method B” it can be anywhere…
At this location simulate a vector of adjusted alphas (10⁶ sim’s). Lower limit 0.025 (Bonferroni), upper limit α_max/2+10% (a feasible guess in most cases).
Perform a linear regression of type I errors vs. α_adj (THX to Anders!); the intersection of the regression line with 0.05 is the “best” α_adj.
Validate the grid with the chosen α_adj. I suggest a narrower grid of n₁/CV-combinations (at least around the planned n₁ and expected CV). 10⁶ sim’s per grid-point are mandatory.

In R by means of package Power2Stage:

library(Power2Stage)

##############

# First step #

##############

n1.min  <- 12

n1.max  <- 60

n1.step <- 12

CV.min  <- 0.1

CV.max  <- 0.8

CV.step <- 0.05

method  <- "B"

GMR     <- 0.95 # assumed T/R-ratio

theta0  <- 1.25 # upper BE-limit for simulating alpha

target  <- 0.80 # target power (generally 0.8–0.9)

usePE   <- FALSE

alpha   <- c(0.0294, 0.0294)

           # Pocock's alpha.

           # Change only if you know what you are doing!

Nmax    <- Inf

           # No upper sample size limit (futility).

           # Change only if you know what you are doing!

           # Generally alpha is not affected, but have a

           # close look at power!

# Potvin's defaults: n1 12-60 (step 12), CV 0.1-1.0 (step 0.1), GMR 0.95, target 0.8

n1 <- seq(n1.min, n1.max, by=n1.step)

CV <- seq(CV.min, CV.max, by=CV.step)

nsims <- 1e5

pBE <- NULL

for(j in 1:length(CV)){

  for(k in 1:length(n1)){

    pBE <- c(pBE, power.2stage(method=method, alpha=alpha,

            n1=n1[k], GMR=GMR, CV=CV[j], targetpower=target,

            pmethod="nct", usePE=usePE, Nmax=Nmax, theta0=theta0,

            nsims=nsims, print=F, details=F)$pBE)

  }

}

grid <- matrix(data=pBE, nrow=length(CV), ncol=length(n1), byrow=T,

  dimnames=list(CV, n1))

pos <- as.numeric(which(grid == max(grid), arr.ind=TRUE))

CV.pos <- as.numeric(rownames(grid)[pos[1]])

n1.pos <- as.numeric(colnames(grid)[pos[2]])

alpha.max <- grid[pos[1], pos[2]]

grid

cat(paste0("\nFor adj. \u03b1 of ", alpha[1], " a maximum inflation of ",

  alpha.max, " is detected at CV=", CV.pos*100, "% and n1=", n1.pos, ".\n"))

LL <- 0.025           # Bonferroni (most conservative)

UL <- alpha.max/2*1.1 # guess!

if(UL <= alpha[1]) UL <- alpha[1]*1.1 # modification for small inflation

###############

# Second step #

###############

nsims <- 1e6

x <- seq(LL, UL, length.out=10)

pBE <- NULL

for(j in 1:length(x)){

    pBE <- c(pBE, power.2stage(method=method, alpha=rep(x[j], 2),

            n1=n1.pos, GMR=GMR, CV=CV.pos, targetpower=target,

            pmethod="nct", usePE=usePE, Nmax=Nmax, theta0=theta0,

            nsims=nsims, print=F, details=F)$pBE)

}

mod <- lm(pBE ~ x)

alpha.ad <- rep(round(as.numeric((0.05-coef(mod)[1])/coef(mod)[2]), 4), 2)

sig <- binom.test(0.05*nsims, nsims, alternative="less")$conf.int[2]


plot(x, pBE, xlab="adj. alpha", ylab="emp. alpha",

  main=paste0("“Method ", method, "”, Inflation at CV = ",

  CV.pos*100, "% and n1 = ", n1.pos))

  abline(h=0.05, col="red")         # target alpha

  abline(h=sig, lty="dotted", col="red") # significance limit

  abline(mod)                       # regression line

  abline(v=alpha[1], lty="dotted")  # first guess (e.g., 0.0294)

  abline(v=alpha.ad[1], col="blue") # estimated adjusted alpha

cat("estimated adjusted \u03b1:", alpha.ad[1], "\n")

##############

# Third step #

##############

pBE <- NULL

for(j in 1:length(CV)){

  for(k in 1:length(n1)){

    pBE <- c(pBE, power.2stage(method=method, alpha=alpha.ad,

            n1=n1[k], GMR=GMR, CV=CV[j], targetpower=target,

            pmethod="nct", usePE=usePE, Nmax=Nmax, theta0=theta0,

            nsims=nsims, print=F, details=F)$pBE)

  }

}

grid <- matrix(data=pBE, nrow=length(CV), ncol=length(n1), byrow=T,

  dimnames=list(CV, n1))

pos <- as.numeric(which(grid == max(grid), arr.ind=TRUE))

CV.pos <- as.numeric(rownames(grid)[pos[1]])

n1.pos <- as.numeric(colnames(grid)[pos[2]])

alpha.max <- grid[pos[1], pos[2]]

cat(paste0("\nValidation of chosen adjusted \u03b1 (", alpha.ad[1], "):\n",

  "\u201cMethod ", method, "\u201d, assumed T/R ", GMR*100, "%, ",

  "target power ", target*100, "%\n"))

grid

cat(paste0("\nWith an adj. \u03b1 of ", alpha.ad[1], " a maximum inflation of ",

  alpha.max, " is seen at CV=", CV.pos*100, "% and n1=", n1.pos, ".\n"))

if(alpha.max <= sig){

  cat(paste0("\nSignificance limit for ", nsims, " simulations: ",

  round(sig, 5), "; no significant inflation seen.",

  "\nThe chosen adjusted \u03b1 is justified within the assessed range of ",

  "stage 1 sample sizes (", n1.min, "\u002d", n1.max,") and CVs (", CV.min*100,

  "\u002d", CV.max*100, "%).\n"))

}else{

  cat(paste0("\nSignificance limit for ", nsims, " simulations: ",

  round(sig, 5), "; significant inflation seen.\n"))

  if(alpha.max < 0.052){

    cat(paste0("However, the maximum inflation of ", alpha.max,

    " does not exceed Potvin\u2019s \u201cnegligible inflation\u201d of 0.052. ",

    "Therefore, the chosen adjusted \u03b1 is justified within the assessed range of ",

    "stage 1 sample sizes (", n1.min, "\u002d", n1.max,") and CVs (", CV.min*100,

    "\u002d", CV.max*100, "%).\n"))

  }else{

    cat(paste("The maximum inflation of", alpha.max,

    "exceeds Potvin\u2019s \u201cnegligible inflation\u201d of 0.052.",

    "Therefore, a smaller adjusted \u03b1 should be tried!\n"))

  }

}

I got:

          12      24      36      48      60

0.1  0.03995 0.02985 0.02976 0.02978 0.02988

0.15 0.05234 0.04226 0.03234 0.02987 0.02988

0.2  0.05408 0.05068 0.04561 0.03912 0.03279

0.25 0.05011 0.05369 0.05047 0.04814 0.04406

0.3  0.04262 0.05398 0.05317 0.05117 0.04947

0.35 0.03668 0.05042 0.05272 0.05364 0.05152

0.4  0.03228 0.04438 0.05266 0.05389 0.05275

0.45 0.03112 0.03727 0.04858 0.05410 0.05367

0.5  0.02959 0.03287 0.04243 0.05190 0.05319

0.55 0.02940 0.03146 0.03642 0.04705 0.05209

0.6  0.02901 0.03020 0.03251 0.04075 0.04876

0.65 0.02888 0.03017 0.03036 0.03586 0.04347

0.7  0.02849 0.03006 0.02983 0.03249 0.03820

0.75 0.02839 0.02964 0.02867 0.03095 0.03348

0.8  0.02847 0.03000 0.02953 0.03005 0.03117



For adj. α of 0.0294 a maximum inflation of 0.0541 is detected at CV=45% and n1=48.

estimated adjusted α: 0.0274

           12       24       36       48       60

0.1  0.037296 0.027386 0.027369 0.027309 0.027285

0.15 0.049151 0.039538 0.030016 0.027398 0.027285

0.2  0.051255 0.047733 0.042950 0.036368 0.030491

0.25 0.047249 0.049676 0.047458 0.044682 0.041669

0.3  0.040369 0.050304 0.049422 0.047641 0.045841

0.35 0.034418 0.047104 0.050399 0.049285 0.048116

0.4  0.031017 0.040444 0.049617 0.049995 0.049238

0.45 0.029327 0.034318 0.045297 0.049996 0.050201

0.5  0.028601 0.030571 0.039118 0.047798 0.050117

0.55 0.028188 0.029017 0.033528 0.042823 0.048769

0.6  0.027837 0.028173 0.030263 0.036796 0.045379

0.65 0.027771 0.027985 0.028479 0.032260 0.039910

0.7  0.027626 0.027742 0.027986 0.029644 0.034807

0.75 0.027706 0.027803 0.027488 0.028406 0.031237

0.8  0.027586 0.027720 0.027595 0.027846 0.028965



With an adj. α of 0.0274 a maximum inflation of 0.051255 is seen at CV=20% and n1=12.



Significance limit for 1e+06 simulations: 0.05036; significant inflation seen.

However, the maximum inflation of 0.051255 does not exceed Potvin’s “negligible inflation” of 0.052. Therefore, the chosen adjusted α is justified within the assessed range of stage 1 sample sizes (12-60) and CVs (10-80%).

Get a decent cup of coffee – it takes a while (on my machine 11 min for Step 1 and 10 min for Step 2) Depending on the chosen grid expect up to a couple of days^a for Step 3. The simple grid of 75·10⁶ sim’s took six hours to complete. :smoke:

For “Method B”, T/R-ratio 0.9, 90% power I got 0.0274. Slightly larger than the 0.0269 Fuglsang reported² for “Method C”. Not surprising, since B is always more conservative than C. In other words a slightly larger α is expected to lead to a similar inflation.
BTW, for “Method C” I got 0.0268 (10 & 17 min). IMHO, a nice agreement^b (different software: C vs. R, different seeds of the pseudo-random generator, different power-methods: shifted t vs. noncentral t).

Don’t forget the third step – regulators want to see only that (EMA: “appropriate steps must be taken to preserve the overall type I error of the experiment” and “the choice of how much alpha to spend at the interim analysis is at the company’s discretion”). If you want to introduce a futility criterion (e.g., an upper total sample size or even fiddle with usePE=TRUE), simulating power is crucial in order to avoid a nasty surprise.

Fuglsang A. Controlling type I errors for two-stage bioequivalence study designs. Clin Res Regul Aff. 2011;28(4):100–5. doi:10.3109/10601333.2011.631547
Fuglsang A. Sequential Bioequivalence Trial Designs with Increased Power and Controlled Type I Error Rates. AAPS J. 2013;15(3):659–61. doi:10.1208/s12248-013-9475-5

Hint: Split the grid and run more than one R-session simultaneously. Two sessions on my machine suck up only 25% of the CPU’s resources.
A comparison of “Method C”, 40·10⁶ sim’s. HB = my homebrew (0.0268), AF = Anders Fuglsang (0.0269):
12 24 36 48 60 CV HB AF HB AF HB AF HB AF HB AF 0.1 0.0488 0.0484 0.0496 0.0498 0.0499 0.0498 0.0498 0.0500 0.0497 0.0499 0.2 0.0500 0.0501 0.0468 0.0472 0.0444 0.0445 0.0456 0.0459 0.0487 0.0486 0.3 0.0392 0.0394 0.0492 0.0496 0.0485 0.0486 0.0468 0.0474 0.0451 0.0452 0.4 0.0304 0.0307 0.0394 0.0399 0.0482 0.0486 0.0491 0.0494 0.0484 0.0485 0.5 0.0279 0.0282 0.0301 0.0296 0.0379 0.0382 0.0466 0.0469 0.0489 0.0492 0.6 0.0272 0.0273 0.0276 0.0275 0.0293 0.0296 0.0357 0.0362 0.0441 0.0442 0.7 0.0271 0.0271 0.0271 0.0273 0.0273 0.0272 0.0287 0.0290 0.0339 0.0339 0.8 0.0269 0.0267 0.0269 0.0272 0.0268 0.0272 0.0271 0.0273 0.0283 0.0284
31 (78%) smaller, 2 (5%) identical, 7 (18%) larger.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

d_labes
★★★

Berlin, Germany,
2014-06-02 10:28
(3974 d 04:56 ago)

@ Helmut
Posting: # 13025
Views: 11,964

Suitable code for suitable adjusted α

Post reply

Dear Helmut!

:ok:

Thanks for sharing this code!

Just one minor comment or better question:
How is your experience in regulatory acceptance of Potvin's 'acceptable' alpha-inflation of 0.052? If I remember correctly there was some rumour that even a smaller value of the empirical alpha has to be seen as inflation. That would mean that we had to down weight the adj. alpha to some extent.

—
Regards,

Detlew

Helmut
★★★

Vienna, Austria,
2014-06-02 16:44
(3973 d 22:40 ago)

@ d_labes
Posting: # 13026
Views: 12,167

Handling inflation

Post reply

Dear Detlew!

❝ Thanks for sharing this code!

My pleasure! I think that we still need some improvements:

In my example the largest inflation was seen at 48/0.45 with 0.05410 and the second largest at 12/0.20 with 0.05408. However, in step 3 the largest inflation showed up at 12/0.20. It might well be that the result of step 1 was an artifact due to only 10⁵ sim’s performed. With 10⁶ I got for 12/0.20 0.054605 (↑) and for 48/0.45 0.05374 (↓), confirming the location of the maximum picked by my code was due to chance.
In most methods (except Karalis/Macheras) the “ridge” of the “type I error-surface” follows roughly the diagonal of the grid – from low n₁/low CV to high n₁/high CV (see the contour-plots of Potvin’s Methods B and C to the right).
Maybe it would make sense to split the first step:
1. Get preliminary estimates for all combos obtained from 10⁵ sim’s like before.*
2. Repeat for the largest inflation observed in each column (i.e., the n₁); this time with 10⁶ sim’s. Should be closer to the “true” maximum and avoid the trap mentioned above.
If I repeat the second step at 12/0.20 I get an adjusted α of 0.0268 – which is substantially smaller than 0.0274 (and identical to what I got for “Method C”, which is counterintuitive, IMHO).

Overall α (step 3) for this combo decreases from 0.051255 to 0.049795… If we follow this track I expect lower alphas than in any of the published papers. Does that make sense?

❝ How is your experience in regulatory acceptance of Potvin's 'acceptable' alpha-inflation of 0.052?

Mixed. Some European (!) regulators don’t like (‼) TSDs at all, but accept them if following “Method B” (quote: “according to the guideline…”). In one case Austria’s AGES asked for a posteriori confirmation of “lacking inflation” of “Method C” based on the actual sample size and CV in the study (was 0.0494 with a 95% CI of 0.0490–0.0498). Duno what might have happened if 0.05 <α ≤0.052. According to Chinese whispers ≤0.051 is considered acceptable. Why? Duno. The maximum inflation in Potvin’s paper for “Method B” is 0.0485 and for “Method C” 0.051. Maybe someone read the wrong column.

❝ If I remember correctly there was some rumour that even a smaller value of the empirical alpha has to be seen as inflation.

If I recall correctly that’s the personal opinion of the Austrian member or EMA’s Biostatistics Working Party. Recently a member of the PKWP told me how he made peace with TSDs – after years of lurking doubt: “The inflation would be relevant only if the CI in the study covers exactly 80–125%. Since in real life the CI is narrower, the actual patient’s risk – even if there would be a small inflation due to the method – likely is ≪5%. So I don’t bother any more.” Pragmatic approach. :-D

❝ That would mean that we had to down weight the adj. alpha to some extent.

By throwing away all published papers and increasing the downloads of Power2Stage? Of course PQRI’s Sequential Design Working Group’s “negligible inflation” (≤0.052) is arbitrary – as are many other rules we have to observe in BE. BTW, only Montague’s “Method D” scratches 0.052. Results of the publications:

α_adj α_max sign. >0.05? Potvin 2008 B 0.0294 0.0485 Potvin 2008 C 0.0294 0.0510 yes Montague 2011 D 0.0280 0.0518 yes Fuglsang 2013 B 0.0284 0.0501 no Fuglsang 2013 D1 0.0274 0.0503 no Fuglsang 2013 D2 0.0269 0.0501 no Fuglsang 2014 B 0.0294 0.0478 no Fuglsang 2014 C 0.0294 0.0506 yes

Unless the EMA makes a clear statement about their preferred maximum inflation (or at least someone receives a deficiency letter and hopefully reports here) I would not change anything.

Running my original code with the setup of Potvin’s paper I “detect” a maximum inflation of 0.04882 at 36/0.40 (B) and 0.0513 at 12/0.20 (C) in the first step.
Good news: These locations agree with the ones reported.
Bad news: My guess of the upper limit for the regression is crap if there is only a small inflation with the α employed in the first step. I suggest this modification:
UL <- alpha.max/2*1.1 if(UL <= alpha[1]) UL <- alpha[1]*1.1
I got:

Anders’ algo suggests ↑0.0304 for B and ↓0.0282 for C. Interesting. Validation under way.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Helmut
★★★

Vienna, Austria,
2014-06-03 00:07
(3973 d 15:17 ago)

@ d_labes
Posting: # 13028
Views: 12,273

Potvin revis(it)ed

Post reply

Dear Detlew!

Below the results of my simulations (I used α 0.0304 for “Method B” and 0.0282 for “Method C”). Power2Stage is an amazing piece – 73 minutes for Methods B and C! It took the PQRI 1½ years to come up with their simulations in Compaq Visual Fortran. ;-)

Method B: Empiric type I error; Pot = Potvin’s 0.0294, HB = homebrew’s 0.0304.

12 24 36 48 60 CV Pot HB Pot HB Pot HB Pot HB Pot HB 0.1 0.0297 0.0304 0.0294 0.0302 0.0294 0.0303 0.0292 0.0303 0.0294 0.0302 0.2 0.0463 0.0474 0.0320 0.0324 0.0294 0.0303 0.0292 0.0303 0.0297 0.0302 0.3 0.0437 0.0453 0.0475 0.0488 0.0397 0.0405 0.0324 0.0328 0.0296 0.0304 0.4 0.0344 0.0358 0.0433 0.0448 0.0485 0.0501 0.0458 0.0468 0.0409 0.0416 0.5 0.0309 0.0321 0.0338 0.0353 0.0420 0.0434 0.0484 0.0500 0.0483 0.0494 0.6 0.0297 0.0310 0.0307 0.0320 0.0333 0.0343 0.0399 0.0419 0.0466 0.0485 0.7 0.0294 0.0304 0.0299 0.0315 0.0306 0.0316 0.0328 0.0337 0.0381 0.0397 0.8 0.0292 0.0302 0.0298 0.0310 0.0303 0.0309 0.0303 0.0313 0.0318 0.0330 0.9 0.0289 0.0300 0.0298 0.0309 0.0296 0.0309 0.0297 0.0308 0.0300 0.0311 1.0 0.0291 0.0300 0.0298 0.0308 0.0298 0.0307 0.0297 0.0307 0.0301 0.0307

Method C: Empiric type I error; Pot = Potvin’s 0.0294, HB = homebrew’s 0.0282.

12 24 36 48 60 CV Pot HB Pot HB Pot HB Pot HB Pot HB 0.1 0.0496 0.0499 0.0500 0.0496 0.0500 0.0499 0.0501 0.0498 0.0504 0.0497 0.2 0.0510 0.0500 0.0490 0.0491 0.0499 0.0499 0.0495 0.0498 0.0500 0.0497 0.3 0.0441 0.0421 0.0492 0.0475 0.0477 0.0471 0.0494 0.0492 0.0502 0.0496 0.4 0.0346 0.0330 0.0435 0.0410 0.0489 0.0469 0.0469 0.0457 0.0470 0.0457 0.5 0.0311 0.0299 0.0339 0.0322 0.0418 0.0395 0.0480 0.0458 0.0483 0.0461 0.6 0.0299 0.0288 0.0307 0.0297 0.0331 0.0314 0.0399 0.0379 0.0472 0.0445 0.7 0.0294 0.0280 0.0298 0.0290 0.0308 0.0291 0.0325 0.0310 0.0380 0.0357 0.8 0.0292 0.0279 0.0301 0.0287 0.0299 0.0287 0.0302 0.0291 0.0319 0.0304 0.9 0.0285 0.0279 0.0298 0.0286 0.0296 0.0284 0.0298 0.0285 0.0301 0.0288 1.0 0.0290 0.0276 0.0295 0.0287 0.0297 0.0285 0.0297 0.0285 0.0297 0.0284

Maximum inflation in red.
With the new alphas no (!) significant inflation for both methods. Largest observed in “Method B” 0.050111 (at 36/0.4) and in “Method C” 0.049984 (at 12/0.2).

I’m getting the impression that if PQRI would have had a closer look right from the start (instead of coming up with a ‘one size fits all’ α and playing with a “negligible inflation”), maybe we could have avoided all those effectless discussions we had the last years. :crying:

BTW, in Montague’s paper I read “The simulations were performed using R […].” Nice to know. Then “A different randomly selected seed was used for each scenario.” Why? Shall we switch to setseed=FALSE in Power2Stage?
Anders’ algo suggests 0.027 (instead of 0.028) for “Method D”. Sim’s running.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

d_labes
★★★

Berlin, Germany,
2014-06-03 10:47
(3973 d 04:37 ago)

@ Helmut
Posting: # 13030
Views: 11,921

Potvin revis(it)ed

Post reply

Dear Helmut!

Again :clap:

!
That's worth a paper.

❝ I’m getting the impression that if PQRI would have had a closer look right from the start (instead of coming up with a ‘one size fits all’ α and playing with a “negligible inflation”), maybe we could have avoided all those effectless discussions we had the last years. :crying:

I think their approach was driven by the ability to perform only a limited number of study simulations per day, especially if the number of subjects rises.
Thus they have taken Pocock's 'universal' constant :-D

and simply looked what happened. Instead of taking the approach as what it is: These nominal alpha's are lacking any theoretical justification and are used on a purely empirical basis for the BE decision in a crossover with interim sample size adaption. And thus should be adapted to this problem.
As they have done later in method D and Montague’s paper.
And as Anders had done in his epoch-making paper
A. Fuglsang
"Controlling type I errors for two-stage bioequivalence study designs"
Clinical Research and Regulatory Affairs, 2011; 28(4): 100–105

❝ BTW, in Montague’s paper I read “The simulations were performed using R […].” Nice to know.

R rulez

❝ Then “A different randomly selected seed was used for each scenario.” Why? Shall we switch to setseed=FALSE in Power2Stage?

As I read this sentence they used a different seed, randomly selected, but fixed for each scenario (CV, n1) to be protected against artefacts resulting from the starting point of the pseudo-random number generator.
IMHO using a different but fixed seed or a single fixed seed doesn't make a big difference if we simulate with 1E6 sims.
Setting setseed=FALSE in Power2Stage would have the drawback of differing results even for one scenario (CV, n1) if run again.

BTW: What is Anders algo? His C programs?

—
Regards,

Detlew

Helmut
★★★

Vienna, Austria,
2014-06-03 15:49
(3972 d 23:35 ago)

@ d_labes
Posting: # 13032
Views: 11,889

Montague revis(it)ed

Post reply

Dear Detlew!

❝ That's worth a paper.

Who’s going to write one? ;-)

❝ I think their approach was driven by […]

Agree. Maybe their choice was (partly?) strategic. Since Pocock’s 0.0294 has a tradition of decades of accepted use in phase III, they opted for a smooth introduction: “We used it for such a long time and it seems to work here as well…” In Section 5 they wrote: “It is our understanding that the FDA has accepted studies with designs like those considered here.” If I recall it correctly they were referring to an NDA (bonus question: Guess the α!). Donald Schuirmann wrote the biostatistical assessment based on simulations he performed and accepted the study. I have to dig out the reference.

❝ As they have done later in method D and Montague’s paper.

Yep – but not consequently enough. Their maximum inflation of 0.0518 is the largest one of all papers published so far. Again I assume they tried to keep it simple: “We have shown in our first paper that for a T/R of 0.95 D’s 0.0280 is more conservative than C’s 0.0294. Let’s give it a try with a T/R of 0.9. Hey, <0.052 – let’s publish.”

[image]

Method D: Empiric type I error; Mont = Montague’s 0.0280, HB = homebrew’s 0.0270 (runtime one hour).

12 24 36 48 60 CV Mont HB Mont HB Mont HB Mont HB Mont HB 0.1 0.0498 0.0494 0.0501 0.0496 0.0504 0.0499 0.0503 0.0498 0.0498 0.0497 0.2 0.0518 0.0500 0.0475 0.0463 0.0477 0.0469 0.0499 0.0494 0.0498 0.0497 0.3 0.0414 0.0396 0.0506 0.0490 0.0489 0.0473 0.0470 0.0451 0.0449 0.0439 0.4 0.0322 0.0307 0.0414 0.0395 0.0502 0.0483 0.0500 0.0484 0.0492 0.0469 0.5 0.0293 0.0283 0.0316 0.0304 0.0401 0.0382 0.0484 0.0465 0.0509 0.0487 0.6 0.0286 0.0277 0.0292 0.0282 0.0313 0.0296 0.0381 0.0360 0.0462 0.0441 0.7 0.0286 0.0273 0.0281 0.0276 0.0288 0.0275 0.0307 0.0292 0.0358 0.0340 0.8 0.0280 0.0273 0.0283 0.0273 0.0287 0.0271 0.0284 0.0273 0.0301 0.0286 0.9 0.0281 0.0271 0.0281 0.0275 0.0282 0.0270 0.0281 0.0271 0.0285 0.0273 1.0 0.0282 0.0271 0.0282 0.0273 0.0282 0.0271 0.0282 0.0270 0.0279 0.0273

No significant inflation with 0.0270. Largest observed 0.049994 (at 12/0.2; same location as reported by Montague).

❝ IMHO using a different but fixed seed or a single fixed seed doesn't make a big difference if we simulate with 1E6 sims.

❝ Setting setseed=FALSE in Power2Stage would have the drawback of differing results even for one scenario (CV, n1) if run again.

Agree.

❝ BTW: What is Anders algo?

The linear regression of type I errors vs. α_adj in order to find α_adj leading to TIE 0.05 (implemented in the second step of my code). I borrowed the idea from his 2011 paper you quoted. I’m afraid regulators likely will not accept his method of data-driven adjusting α in the interim step (EMA: “The plan to use a two-stage approach must be pre-specified in the protocol along with the adjusted significance levels to be used for each of the analyses”).

❝ His C programs?

I have just a few of them. Given your last improvements Power2Stage is almost as fast as his compiled stuff. :smoke:

PS: Seems to be a popular topic. ~100 visits / day so far.…

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Helmut
★★★

Vienna, Austria,
2014-10-13 16:53
(3840 d 22:31 ago)

@ d_labes
Posting: # 13692
Views: 11,450

Pocock’s “natural constant”

Post reply

Dear Detlew,

have you every wondered where the magick 0.0294 comes from? In R we can do better, of course:

require(mvtnorm)

alpha   <- 0.05

mu      <- c(0, 0)

sigma   <- diag(2)

sigma[sigma == 0] <- 1/sqrt(2)

z       <- qmvnorm(1-0.05002333, tail="both.tails", mean=mu, sigma=sigma)$quantile

p0      <- pmvnorm(lower=rep(-z, 2), upper=rep(z, 2), mean=mu, sigma=sigma)[1]


C       <- qmvnorm(1-alpha, tail="both.tails", mean=mu, sigma=sigma)$quantile

p1      <- pmvnorm(lower=rep(-C, 2), upper=rep(C, 2), mean=mu, sigma=sigma)[1]


alpha1 <- 2*(1-pnorm(C))

head   <- "%s"

body   <- "%s  %-10.7g  %-8.7g  %-8.7g  %+.7f%s"

title1 <- "\nPocock (1977) Table 1. \u03b1 = 0.05??\n"

title2 <- "N    \u03b1\u2019          z         \u03b1            RSE\n"

cat(paste0(

  sprintf(head, title1),

  sprintf(head, title2),

  sprintf(body, "2", 0.0294, 2.178, 1-p0, 100*(1-p0-0.05)/0.05, "%\n"),

  sprintf(body, "2", alpha1, C, 1-p1, 100*(1-p1-0.05)/0.05, "%\n")))



Pocock (1977) Table 1. α = 0.05??

N    α’          z         α            RSE

2  0.0294      2.178     0.05002322  +0.0464379%

2  0.02938572  2.178273  0.04999989  -0.0002221%

Nitpicking as usual. Compare at the location of maximum inflation …

Method C: alpha0= 0.05, alpha (s1/s2)= 0.02938572 0.02938572 

Futility criterion Nmax= Inf

CV= 0.2; n(stage 1)= 12; GMR= 0.95

BE margins = 0.8 ... 1.25

GMR= 0.95 and mse of stage 1 in sample size est. used



1e+06 sims at theta0= 1.25 (p(BE)='alpha').

p(BE)   = 0.051252

p(BE) s1= 0.035768

pct studies in stage 2= 78.86%



Distribution of n(total)

- mean (range)= 22.9 (12 ... 98)

- percentiles

 5% 50% 95%

 12  22  40

… with the usual stuff:

Method C: alpha0= 0.05, alpha (s1/s2)= 0.0294 0.0294 

Futility criterion Nmax= Inf

CV= 0.2; n(stage 1)= 12; GMR= 0.95

BE margins = 0.8 ... 1.25

GMR= 0.95 and mse of stage 1 in sample size est. used



1e+06 sims at theta0= 1.25 (p(BE)='alpha').

p(BE)   = 0.051247

p(BE) s1= 0.035777

pct studies in stage 2= 78.86%



Distribution of n(total)

- mean (range)= 22.9 (12 ... 98)

- percentiles

 5% 50% 95% 

 12  22  40

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

ElMaestro ★★★ Denmark, 2014-10-13 17:30 (3840 d 21:54 ago) @ Helmut Posting: # 13693 Views: 11,110	Pocock’s “natural constant” Post reply
	Hi Hötzi, ❝ `N α’ z α RSE` ❝ `2 0.0294 2.178 0.05002322 +0.0464379%` ❝ `2 0.02938572 2.178273 0.04999989 -0.0002221%` It's a scandal!! Please contact the editor in chief and demand an immediate erratum — Pass or fail! ElMaestro

d_labes ★★★ Berlin, Germany, 2014-10-14 10:56 (3840 d 04:28 ago) @ Helmut Posting: # 13695 Views: 11,070	Another “natural constant”? Post reply
	Dear Helmut, ❝ `require(mvtnorm)` Require or not require, that's the question . ❝ `alpha <- 0.05` ❝ `mu <- c(0, 0)` ❝ `sigma <- diag(2)` ❝ `sigma[sigma == 0] <- 1/sqrt(2)` ❝ `z <- qmvnorm(1-0.05002333, tail="both.tails", mean=mu, sigma=sigma)$quantile` Another magick number? — Regards, Detlew

Helmut
★★★

Vienna, Austria,
2014-10-14 15:36
(3839 d 23:48 ago)

@ d_labes
Posting: # 13701
Views: 11,136

Λ

Post reply

Dear Detlew,

❝ ❝ require(mvtnorm)

❝ Require or not require, that's the question :cool: .

Oh, I didn’t know that. I assumed that require() attachs a package only if is not already attached.
Should have RTFM, which states: “require is designed for use inside other functions …”

❝ ❝ z <- qmvnorm(1-0.05002333, tail="both.tails", mean=mu, sigma=sigma)$quantile

❝

❝ Another magick number? :-D

Exactly. In Pocock’s paper both z and α’ are rounded (2.178, 0.0294). I had to introduce my magick number in order to end up with 2.178.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

How to find a suit­able adjusted α? [Two-Stage / GS Designs]

Suitable code for suit­able adjusted α

Handling inflation

Potvin revis(it)ed

Potvin revis(it)ed

Montague revis(it)ed

Pocock’s “natural constant”

Pocock’s “natural constant”

Another “natural constant”?

Λ

How to find a suitable adjusted α? [Two-Stage / GS Designs]

Suitable code for suitable adjusted α