Old beliefs die hard [Two-Stage / GS Designs]

posted by Helmut – Vienna, Austria, 2015-05-29 20:08 – Posting: # 14886

Hi nobody,

as a deckhand I’m not qualified to speak for our Capt’n. Only some remarks.

❝ ❝ And equally troubling, no one has published the proof that using the observed GMR doesn't work. I know at least three different groups of researchers (or two groups and one individual) have been looking at it.


I would say we don’t need a proof here. There is an abundance of literature about fully adaptive methods in superiority testing (i.e., adjusting for both the effect size and the variance). Many papers warn about a “too early” interim analysis because the estimates are not yet reliable enough. Adaptive methods in Phase III are still valuable tools. Nevertheless, adjusting for two estimates has its price. There is no free lunch.

In the BE context the story is a little bit different. Given the limited sample sizes in the first stage, the estimated GMR (and yes, the CV as well…) is not that reliable. All adaptive methods published so far contain at least one futility rule (papers by Karalis/Macheras, Kieser/Rauch 2015, and our wacky poster, doi 10.13140/RG.2.1.5190.0967). A natural one is stopping after the first stage if the GMR is outside the acceptance range [1]. Another one is to set an upper limit on the estimated total sample size. Note that adding futility rule(s) to any method decreases the type I error (TIE), since studies are more likely to stop. But: The impact on power may be unacceptable (see Anders’ 2013 paper and my 2015 review where I explored Karalis’/Macheras’ methods). In other words, regulators don’t have a problem (since the TIE is preserved), but such methods might be problematic for economic and ethical reasons.
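To get a feeling for how shaky the stage-1 GMR is, here is a minimal Python sketch (not taken from any of the cited papers). It assumes a balanced 2×2 crossover with log-normal data, where the log of the point estimate is normally distributed with variance 2·σw²/n1 and σw² = ln(CV² + 1). The function names, the true GMR of 0.95, the CV of 30% and the stage-1 sizes are assumptions chosen only for illustration.

```python
import math
from statistics import NormalDist


def sigma_w(cv):
    """Within-subject SD on the log scale; cv is a ratio (0.30 means 30%)."""
    return math.sqrt(math.log(cv * cv + 1.0))


def se_log_gmr(cv, n1):
    """Standard error of log(GMR) in a balanced 2x2 crossover with n1 subjects."""
    return sigma_w(cv) * math.sqrt(2.0 / n1)


def prob_gmr_outside(true_gmr, cv, n1, lower=0.80, upper=1.25):
    """Chance that the stage-1 point estimate lands outside [lower, upper].

    This is the chance of stopping under the 'GMR outside the acceptance
    range' futility rule, given the assumed true GMR and CV.
    """
    dist = NormalDist(mu=math.log(true_gmr), sigma=se_log_gmr(cv, n1))
    inside = dist.cdf(math.log(upper)) - dist.cdf(math.log(lower))
    return 1.0 - inside


# Illustrative values only (assumptions, not from the posting).
for n1 in (12, 24, 48):
    p = prob_gmr_outside(true_gmr=0.95, cv=0.30, n1=n1)
    print(f"n1={n1:3d}  SE(log GMR)={se_log_gmr(0.30, n1):.3f}  "
          f"P(stop for futility)={p:.3f}")
```

With a small first stage, a true GMR well inside the limits can still trip the rule in a non-negligible share of studies, and every such stop is a study that can no longer pass.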

Example: You state a maximum total sample size of 120 in the protocol.
  a. You run some simulations beforehand and know that – contrary to what you expected from the paper(s) – the actual power might be (far) below 60%. Would you give this information to the EC, risking that the protocol is rejected – or hide it and cross your fingers?
  b. You estimate the total sample size (based on the GMR and CV) as 124. According to the protocol you would have to stop and throw the first stage’s data into the waste bin. Or would you let the protocol walk the plank, write some :blahblah: like “… in order to compensate for a potentially higher dropout rate …” and continue?
  c. You look at the extensive tables of the publications or perform your own simulations. You don’t state any futility rule but are aware that there is some chance that the total sample size will exceed your budget by far (in our Capt’n’s words: going through the roof). Internally (i.e., following the “Guy in the Armani Suit”) you know that in such a case you will stop the study (every protocol contains such a clause somewhere). Essentially you make the EC believe that the power is 80%, whereas in reality it is <60% (like in #a).
In my experience #c is quite common.
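
The second scenario above (re-estimated total sample size against a protocol maximum of 120) can be sketched in Python as follows. This is only a rough illustration: it uses the large-sample normal approximation for TOST in a 2×2 crossover, not the exact method that dedicated software applies, and it assumes symmetric limits of 0.80–1.25. The function names are hypothetical, and `alpha=0.05` is a placeholder; in a real TSD the adjusted α of the chosen method would go there.

```python
import math
from statistics import NormalDist


def approx_total_n(gmr, cv, alpha=0.05, power=0.80, theta2=1.25):
    """Approximate total sample size for TOST (normal approximation).

    Assumes a 2x2 crossover, log-normal data and limits 1/theta2 ... theta2.
    Returns an even integer, or math.inf if the GMR is at or beyond a limit.
    """
    sigma2 = math.log(cv * cv + 1.0)           # within-subject variance (log scale)
    margin = math.log(theta2) - abs(math.log(gmr))
    if margin <= 0.0:
        return math.inf
    z = NormalDist()
    beta = 1.0 - power
    z_alpha = z.inv_cdf(1.0 - alpha)
    # At GMR = 1 the type II error is split between both one-sided tests.
    z_beta = z.inv_cdf(1.0 - beta / 2.0) if gmr == 1.0 else z.inv_cdf(1.0 - beta)
    n = math.ceil(2.0 * sigma2 * (z_alpha + z_beta) ** 2 / margin ** 2)
    return n + (n % 2)                          # round up to balanced sequences


def interim_decision(gmr_hat, cv_hat, n_max, **kwargs):
    """Apply the 'maximum total sample size' futility rule at the interim."""
    n_est = approx_total_n(gmr_hat, cv_hat, **kwargs)
    return ("stop" if n_est > n_max else "continue", n_est)


# Illustrative interim estimates (assumptions); n_max = 120 as in the example.
for gmr_hat in (0.95, 0.90, 0.87):
    decision, n_est = interim_decision(gmr_hat, cv_hat=0.30, n_max=120)
    print(f"GMR={gmr_hat:.2f}  estimated N={n_est}  ->  {decision}")
```

A modest shift in the observed GMR is enough to push the estimate across the protocol’s limit, which is exactly the situation where the temptation to bend the protocol arises.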

I don’t say that fully adaptive methods are futile, only that they don’t work “out of the box”. If one is able to come up with an educated guess about both the “most likely” and the “worst case” CV, it makes sense to perform one’s own simulations. Since the chance to stop in the first stage is higher, less adjustment of α (i.e., a narrower CI) is necessary. Whether that gain outweighs the loss in power [2] has to be assessed.
It is amazing that many people are so interested in the GMR in the interim. The same people had no problem throwing away failed fixed-sample studies in the last 30+ years (“Too bad that our assumptions were wrong. Let’s perform another study.”)… TSDs are not a jack-of-all-trades device.


  1. Mandatory in fully adaptive methods. Otherwise any software you are using for the sample size estimation will show you the finger.
  2. In my experience rarely (if ever).

Dif-tor heh smusma 🖖🏼 Long live Ukraine!
Helmut Schütz


The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz