d_labes
★★★

Berlin, Germany,
2010-10-05 13:09

Posting: # 5997
Views: 11,428

## 2-stage design with interim sample size estimation [Two-Stage / GS Designs]

Dear all,

I have been questioned about the following design:

The sponsor/experimenter has insufficient information about the CV and the point estimate ('true' ratio) to make an educated guess of the sample size for a BE study. A distinct shift in the point estimate is expected.

To save resources (it's a fixed combination drug with 3 constituents, analytically expensive) he is not willing to run a pilot study and, based on it, a pivotal study with the BE decision not using the data from the pilot.

Thus the design shall be a 2-stage design. A first group of subjects (the conception is 8-12) shall be dosed, analyzed and statistically evaluated. Based on those data (CV and point estimate), a final sample size shall be derived which will be utilized for the second stage.

The idea is to spend a formally extremely small alpha of 0.001 after the first stage (to conform to the EMA guideline). This ensures with high probability that the second stage will be reached; practically no stopping will occur because of it.

The only stopping rule is a maximum sample size constraint Nmax the sponsor is willing to fund. If the estimated sample size after stage 1 is greater, the second stage will not be performed.

Anybody out there who knows how to handle such a design?
Is there something to do to protect the overall alpha (a lower nominal alpha at the second stage)?
Is the sample size estimation done as usual?
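For concreteness, the intended decision flow might be sketched like this (illustrative only, not a validated procedure; the stage-1 BE check corresponds to the 99.8% CI implied by alpha = 0.001, and the function name is made up):

```python
def two_stage_decision(ci_stage1, n_est, n_max=120):
    """Decision flow of the described design (illustrative sketch only).

    ci_stage1 : (lower, upper) of the stage-1 confidence interval for T/R,
                computed at the tiny adjusted alpha (0.001 -> 99.8% CI)
    n_est     : total sample size re-estimated from the stage-1 CV and PE
    n_max     : maximum sample size the sponsor is willing to fund
    """
    lower, upper = ci_stage1
    if 0.80 <= lower and upper <= 1.25:
        return "BE, stop after stage 1"          # practically never happens
    if n_est > n_max:
        return "stop for futility: estimated n exceeds Nmax"
    return "continue to stage 2 with total n = {}".format(n_est)
```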

Any input will be highly appreciated.

Regards,

Detlew
Jack
☆

Lancaster, United Kingdom,
2010-10-25 16:20

@ d_labes
Posting: # 6077
Views: 10,356

## 2-stage design with interim sample size estimation

Dear d_labes,

This is quite a complex question, to which I will add some more questions.

The first and most important question in my mind is: why not consider a single-stage design with a sample size review at interim? So basically not allowing the study to stop early (besides the sponsor's constraint).

The second choice one needs to make is whether the sample size review is going to be blinded or not (note that I am ignoring any knowledge about a possible shift in point estimates). Ekkehard Glimm has (very recently) done some work on the type-I error inflation in blinded sample size reviews for non-inferiority trials, and from memory of a presentation I saw, the conclusions were that:

a) the inflation is negligible for all practical purposes.
b) the usual sample size review does in fact make use of the treatment allocation internally.

The trouble is that the results are not yet published as far as I know.

As for how blinded sample size re-estimation is done in non-inferiority trials, see for example Biom J. 2007 Dec;49(6):903–16.

Hope that helps
Helmut
★★★

Vienna, Austria,
2010-10-25 17:16

@ Jack
Posting: # 6078
Views: 10,517

## Null = bioINequivalence

Dear Jack!

» this is quite a complex question…

True.

» … to which I will add some more questions.

Well, I think the applicability of commonly used methods is questionable, because of the different way the Null is formulated in BE studies.

Just a quote from Potvin et al.,* after reviewing different approaches:

Regardless, none of the methods are validated for crossover studies and two one-sided t-tests, so there is a need to start from the beginning in considering these approaches.

• Potvin D, Diliberti CE, Hauck WW, Parr AF, Schuirmann DJ, Smith RA. Sequential design approaches for bioequivalence studies with crossover designs. Pharm Stat. 2008;7(4):245–62. doi:10.1002/pst.294.

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
d_labes
★★★

Berlin, Germany,
2010-10-26 11:28
(edited by d_labes on 2010-10-26 11:45)

@ Jack
Posting: # 6080
Views: 10,371

## 2-stage design with interim sample size estimation

Dear Jack,

» this is quite a complex question ...

» The first and most important question in my mind is: why not consider a single-stage design with a sample size review at interim? So basically not allowing the study to stop early (besides the sponsor's constraint).

This was the intention of the design described. But why do you call it single-stage?

The reason behind it was that for one constituent (it's an FDC) the variability is not known at all. Any assumption about it is only a delusion. In order not to waste resources, the pilot study usually performed in such circumstances should be included in the study (sometimes called an internal pilot).

The 'extremely' small alpha to spend at stage 1 is only for compliance with the new EMA guidance, which definitely calls for it, statistically sound or not. Just to cite pages 16 to 17:
"Two-stage design
... If this approach is adopted appropriate steps must be taken to preserve the overall type I error of the experiment and the stopping criteria should be clearly defined prior to the study. The analysis of the first stage data should be treated as an interim analysis and both analyses conducted at adjusted significance levels (with the confidence intervals accordingly using an adjusted coverage probability which will be higher than 90%) ..."
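The "adjusted coverage probability" in the quote is purely mechanical given the 100(1 − 2α) convention for BE confidence intervals; a trivial illustration (function name made up):

```python
def ci_coverage_pct(alpha_adj):
    """Coverage of the two-sided CI matching an adjusted one-sided alpha,
    per the usual 100*(1 - 2*alpha) convention in bioequivalence."""
    return 100 * (1 - 2 * alpha_adj)

# unadjusted:                ci_coverage_pct(0.05)   -> 90% CI
# Pocock-type adjustment:    ci_coverage_pct(0.0294) -> 94.12% CI
# the 'extreme' stage-1 alpha: ci_coverage_pct(0.001) -> 99.8% CI
```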

» The second choice one needs to make is if the sample size review is going to be blinded or not (note that I am ignoring any knowledge about a possible shift in point estimates).

I'm not sure about this. Usually BE studies are performed with administration of the study medications in an open fashion. Why then make a blinded evaluation at interim?

The interim sample size estimation is highly influenced by a shift in the point estimate. Especially the futility bound: can we expect to prove BE with an Nmax of 120 subjects with some pre-specified power?
An a-priori difference in Cmax is expected due to the nature of the study products. Not considering it is like "Mit offenen Augen gegen eine Wand fahren" (driving into a wall with your eyes open).

I share Helmut's point of view above. All I've read in the meantime concerns group-sequential or adaptive designs for superiority trials with parallel groups, or at best some rare papers on non-inferiority.
Since the roles of the Null and the Alternative are reversed in equivalence trials compared to superiority, it is likely that the results obtained do not apply.

Regards,

Detlew
Jack
☆

Lancaster, United Kingdom,
2010-10-26 16:26

@ d_labes
Posting: # 6083
Views: 10,280

## 2-stage design with interim sample size estimation

Dear d_labes,

The reason I call this a one-stage design is that I would not formally test for equivalence at the time when one does the sample size re-estimation. Hence one would not need to spend any alpha at this point (as no testing is done, the EMA guideline is not relevant), and there is no risk of the awkward situation that one could have to stop the trial early.

This is also the reason why doing the sample size re-estimation in a blinded fashion can make sense, as using unblinded data would call the one-stage design into question.

As for the question: "Can we expect to prove BE with a Nmax of 120 subjects with some pre-specified power?"

I don't really see a problem there. You are not arriving at a test decision about BE if you decide to stop because you would need more resources than you have, and hence it does not impact your type-I error. Having such a rule therefore has no bearing on the formal testing.

Finally, I agree that results from superiority trials will not apply directly; they do, however, often contain useful ideas on how one can tackle a problem in equivalence from first principles.
d_labes
★★★

Berlin, Germany,
2010-10-27 14:01

@ Jack
Posting: # 6090
Views: 10,270

## Blind or not, Indecisions or Decisions

Dear Jack,

» The reason I call this a one-stage design is that I would not formally test for equivalence at the time when one does the sample size re-estimation. Hence one would not need to spend any alpha at this point (as no testing is done, the EMA guideline is not relevant), and there is no risk of the awkward situation that one could have to stop the trial early.

I wonder if the EMA will see this the same way, especially the claim that the EMA guideline is not relevant.

» This is also the reason why doing the sample size re-estimation in a blinded fashion can make sense as using unblinded data will question the one-stage design.

The question is: why? Can you explain why an unblinded evaluation would have any influence?
All personnel involved in an open study are unblinded. But the statistician should wear a blindfold (this or better this one)?

» As for the question: "Can we expect to prove BE with a Nmax of 120 subjects with some pre-specified power?"
»
» I dont really see the problem there. You are not arriving at a test decision about BE if you decide to stop because you would need more resources than you have and hence it is not impacting your type-I-error.

Do you mean that stopping the study due to exceeding Nmax is not a decision about BE? Up to now I had thought that this is equivalent to the decision: given the data of the 'internal' pilot, we cannot expect to reject the Null (bioinequivalence) with reasonable resources. Thus we have to stay with the Null (aka accept H0). That is how I had understood hitting futility bounds in group-sequential trials.
Or am I totally wrong here?

This is crucial for implementing a Monte Carlo simulation in which we would count 'BE' / 'not BE' to establish the type I and type II errors. As what shall I count hitting the futility criterion: 'not BE', or a third answer? If you prefer a third, how should they be counted with respect to the type I and type II errors (numerator/denominator)?
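To make the counting question concrete, a minimal Monte Carlo sketch (normal approximation instead of the exact t-based crossover analysis, so only a rough stand-in for a real simulation; here futility hits would simply stay in the denominator, i.e. count as 'not BE'):

```python
import math
import random
from statistics import NormalDist

def simulate_tie(n_sims=2000, n=24, cv=0.20, theta0=1.25, alpha=0.05, seed=1):
    """Fraction of simulated studies passing BE when the true ratio sits at
    the upper acceptance limit (= empirical type I error).
    Normal approximation instead of the exact t-based TOST; a real run
    would use the 2x2 crossover analysis (cf. Power2Stage in R)."""
    random.seed(seed)
    sigma_w = math.sqrt(math.log(cv ** 2 + 1))      # within-subject SD (log scale)
    se = sigma_w * math.sqrt(2.0 / n)               # SE of the log point estimate
    z = NormalDist().inv_cdf(1 - alpha)
    passes = 0
    for _ in range(n_sims):
        pe = random.gauss(math.log(theta0), se)     # simulated log PE
        lower, upper = pe - z * se, pe + z * se     # 90% CI (normal approx.)
        if math.log(0.80) <= lower and upper <= math.log(1.25):
            passes += 1
    return passes / n_sims
```

With the true ratio at 1.25 this should come out near the nominal alpha; counting futility hits as 'BE' would obviously be wrong, and dropping them from the denominator would change the error definition.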

Regards,

Detlew
Jack
☆

Lancaster, United Kingdom,
2010-10-27 16:14

@ d_labes
Posting: # 6091
Views: 10,284

## Blind or not, Indecisions or Decisions

Dear d_labes,

Whether "exceeding Nmax" has any influence on the type-I error is solely a question of how that rule was specified when designing the trial. If it is interpreted as "the test statistic at interim is too small", then you have used the statistical test to come to that conclusion, and hence it will have an influence on your type-I error. More precisely, you have used a statistical test that gauges how likely it is that you will get a "positive" result in the end. Note that this is how futility bounds in sequential designs are usually derived and used.

If, alternatively, you have a situation where you decide to stop because of factors independent of the test statistic, then it will not have an impact on the type-I error, as you have not performed a test. Examples include stopping a study because a competitor released a product you don't think you can compete with, but also the senior manager who all of a sudden says: are you insane to think that we can do a trial with >Nmax patients?

This is also the reason why doing a sample size review in a blinded fashion can be sensible. Note that by "in a blinded fashion" I mean using techniques that do not use unblinded information, not blinding the statistician. If you use unblinded techniques, most likely the test statistic comes into the mix in some form, making it very tricky to argue that you have not also formally tested.

Ohlbe
★★★

France,
2010-10-27 21:37

@ Jack
Posting: # 6092
Views: 10,244

## Blind ?

Dear Jack,

What's probably confusing to a number of people following this thread is what exactly you mean by "blind". In clinical trials this usually means not knowing which treatment is administered to which patient/subject. In a BE trial this usually only applies to the bioanalytical lab. You can't run the statistical analysis without unblinding (even if you call the products A and B and don't tell the statistician which is the test and which is the reference, it won't make any difference).

So to keep it short, a question from a non-statistician: what exactly do you mean by blinded in this particular context? Blinded to what?

Regards
Ohlbe

Jack
☆

Lancaster, United Kingdom,
2010-10-28 08:24

@ Ohlbe
Posting: # 6095
Views: 10,208

## Blind ?

Dear Ohlbe,

Using a blinded method means that the statistical method does not use information on treatment groups (even if it may be available). As a simplified example, the method can use the mean across groups, but not the means within groups.

The reason for using such a method has only to do with the formal test decision and therefore the type-I error. This is in contrast to "traditional" trials, where blinding (meaning no one, or as few people as possible, knows which subject receives which treatment) is used to ensure that no bias arises from knowing the treatment. No attempt to prevent bias is made here (since the people involved know the treatment).
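The simplified example above can be made concrete (hypothetical numbers; `blinded_variance` and `pooled_variance` are just descriptive names, not any package's API):

```python
from statistics import mean

def blinded_variance(all_values):
    """Uses only the overall mean: no treatment allocation needed."""
    m = mean(all_values)
    return sum((x - m) ** 2 for x in all_values) / (len(all_values) - 1)

def pooled_variance(group_t, group_r):
    """Uses the within-group means: requires the (unblinded) allocation."""
    def ss(g):
        m = mean(g)
        return sum((x - m) ** 2 for x in g)
    return (ss(group_t) + ss(group_r)) / (len(group_t) + len(group_r) - 2)

# When the group means differ, the blinded estimate exceeds the pooled one.
```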
Astea
★★

Russia,
2019-05-30 08:37

@ Jack
Posting: # 20305
Views: 3,422

## Blind 2Stage

Dear Smart People!

Digging through forum posts I've found this 9-year-old question from Detlew. It turns out that the question can now be easily answered by the author himself, after constructing Power2Stage.
But can you clarify for me the meaning of blinding in BE trials?

I naively thought that blind interim analyses should not influence the type I error. But comparing power.tsd.ssr(n1=10, CV=0.1, blind=FALSE, theta0=1.25) and power.tsd.ssr(n1=10, CV=0.1, blind=TRUE, theta0=1.25) it turns out that blinding may even worsen the situation (5.01% vs 7.36% TIE). It is connected with the unknown PE (because for blind=TRUE, s20s <- mses), but isn't it counterintuitive?

Can it be true for a parallel design also? Suppose we want to make a blind interim analysis after the first stage with N subjects and recalculate the sample size based on a fixed GMR = 0.95 if the total CV turns out greater than the initial assumption. Will this cause any inflation? How can it be estimated?
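The mechanism behind the worse blinded result can be seen in a rough simulation: the blinded (one-sample) variance estimate of the period differences absorbs the squared treatment effect. This is only a sketch in Python rather than R, with sequence means placed at ±delta, equal sequence sizes, and period effects ignored:

```python
import math
import random
from statistics import variance

def blinded_vs_unblinded_var(delta, sigma, n=20000, seed=7):
    """Blinded vs unblinded variance of 2x2 crossover period differences.
    Sketch: the two sequences get means +delta and -delta (delta = log GMR),
    equal sizes, period effects ignored. The blinded estimate, taken around
    the single overall mean, absorbs roughly delta**2 extra variance."""
    random.seed(seed)
    seq_tr = [random.gauss(+delta, sigma) for _ in range(n)]
    seq_rt = [random.gauss(-delta, sigma) for _ in range(n)]
    blinded = variance(seq_tr + seq_rt)                    # labels ignored
    unblinded = (variance(seq_tr) + variance(seq_rt)) / 2  # labels used
    return blinded, unblinded
```

At theta0 = 1.25 the blinded estimate is inflated by about (log 1.25)², which then feeds into the re-estimated sample size.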

"Being in minority, even a minority of one, did not make you mad"
Helmut
★★★

Vienna, Austria,
2019-05-30 10:56

@ Astea
Posting: # 20306
Views: 3,406

## Blind 2Stage

Hi Nastia,

» But can you clarify me the meaning of blindness in BE trials?

You know its meaning.
I have never seen a blinded one, except for Health Canada:

2.4.2 Blinding
To avoid study bias, comparative bioavailability studies should be conducted in such a way that the subjects are not aware of which product (test or reference) is being administered.

That’s a funny idea if the products don’t look the same.

» […] it turns out that blinding may even worsen the situation (5.01% vs 7.36% TIE). It is connected with the unknown PE (because for blind=TRUE, s20s <- mses), but isn't it counterintuitive?

Yes and yes. Have a look at Figure 1 of Golkowski et al.*

» Can it be true for a parallel design also? Suppose we want to make a blind interim analysis after the first stage with N subjects and recalculate the sample size based on a fixed GMR = 0.95 if the total CV turns out greater than the initial assumption. Will this cause any inflation?

Possibly.

» How to estimate it?

Simulations… Might be tricky because we have to think about unequal group sizes and/or variances.

BTW, do you remember Paola Coppola’s presentation at last year’s BioBridges?

• Golkowski D, Friede T, Kieser M. Blinded sample size re-estimation in crossover bioequivalence trials. Pharm Stat. 2014;13(3):157–62. doi:10.1002/pst.1617.

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Astea
★★

Russia,
2019-05-31 12:36
(edited by Astea on 2019-05-31 20:51)

@ Helmut
Posting: # 20308
Views: 3,371

## Blind 2Stage and an elephant

Dear Helmut!

Thank you for pointing out the article! I've tried to draw a graph similar to Figure 1 of Golkowski et al. using power.tsd.ssr. It uses CV instead of the stage 2 sample size, but the tendency is quite similar: a low first-stage sample size can significantly inflate the TIE.

» Simulations… Might be tricky cause we have to think about unequal group sizes and/or variances.

Is it possible to do it with Power2Stage or some modification? (As I understand it, power.tsd.p deals with the unblinded scheme?) The co-authors of the aforementioned article also have an article dedicated to parallel groups, but relating to non-inferiority trials…

"Being in minority, even a minority of one, did not make you mad"
Helmut
★★★

Vienna, Austria,
2019-06-02 18:14

@ Astea
Posting: # 20311
Views: 3,327

## Which elephant?

Hi Nastia,

I don’t get your subject line. Do you mean that Golkowski’s Figure 4 resembles Saint-Exupéry’s Drawing № 2 in Le Petit Prince (a cut-away view of a boa constrictor showing its last meal, an elephant)?

Or are you referring to this one?

With four parameters I can fit an elephant,
and with five I can make him wiggle his trunk.
John von Neumann

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Astea
★★

Russia,
2019-06-02 23:47

@ Helmut
Posting: # 20312
Views: 3,286

## Blind interim in parallel design?

Dear Helmut!

» I don’t get your subject line. Do you mean that...

I am very sorry for the confusion! (Neither of them, but I really love your way of thinking!!) It refers to "The Blind Men and the Elephant", a parable about how difficult it is to find the truth if you have only limited data. That was about parallel blind studies.

» Simulations… Might be tricky cause we have to think about unequal group sizes and/or variances.

As far as I understood from an article by A. Fuglsang*, heteroscedasticity should not cause problems. But for blind studies CVaverage (i.e., not distinguishing treatments) is not equal to CVpooled; how should one deal with that? Is there a way to prove the possibility of inflation in blind parallel studies?
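The CVaverage vs CVpooled gap can be shown with a small numerical sketch (hypothetical log-scale data, equal group sizes, lognormal back-transformation; function names made up):

```python
import math
from statistics import mean

def log_cv(s2):
    """Back-transform a log-scale variance into a CV (lognormal model)."""
    return math.sqrt(math.exp(s2) - 1)

def cvs_parallel(log_t, log_r):
    """CVpooled (within-group) vs the blinded 'CVaverage' computed from the
    combined sample without treatment labels (equal group sizes assumed)."""
    def var(g):
        m = mean(g)
        return sum((x - m) ** 2 for x in g) / (len(g) - 1)
    pooled = (var(log_t) + var(log_r)) / 2   # needs the allocation
    blinded = var(log_t + log_r)             # label-free; inflated when means differ
    return log_cv(pooled), log_cv(blinded)
```

Whenever the group means differ, the blinded CV overshoots the pooled one, which is exactly what a blinded interim look would feed into the sample size recalculation.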

d_labes
★★★

Berlin, Germany,
2019-06-05 10:23

@ Astea
Posting: # 20317
Views: 3,214

## Blind men

Dear Astea,

» ... But for blind studies CVaverage (that is not distinguishing treatments) is not equal to CVpooled, how to deal with it? Is there a way to prove a possibility of inflation in blind parallel studies?

Not with the tools available in Power2Stage.
But others have done the job for you.

• Friede T, Kieser M. Sample size adjustment in clinical trials for proving equivalence. Drug Inf J. 2001;35:1401–8.

Citing page 1407:
... the computation of the actual level of the one-sided tests can be done following the lines of Kieser and Friede (unpublished data; 2001). The basic idea is the same as in the unblinded case. However, calculations are more complex and will, therefore, be omitted here. In all situations considered in Kieser and Friede (unpublished data, 2001), the inflation of the significance level was smaller than 0.0001. Therefore, no adjustment of the nominal significance level is necessary for the two one-sided tests procedure in case of blinded sample size adjustment.

It's a little bit annoying that they cite themselves, and with "unpublished data", but nevertheless…

Regards,

Detlew
Astea
★★

Russia,
2019-06-05 22:55

@ d_labes
Posting: # 20318
Views: 3,178

## The Blind Leading the Blind

Dear Detlew!

» It's a little bit annoying that they reference themselves and with "unpublished data" but nevertheless ...

Thank you very much for the article! OK, let us trust the authors until someone proves it independently!

By the way, I've found another article in this series, dedicated not to equivalence but to equality, as I understand it: Bristol DR, Shurzinske L. Blinded Sample Size Adjustment. Drug Inf J. 2001;35:1123–30. doi:10.1177/009286150103500409.

Disclaimer: the subject line of this post isn't intended to confuse anyone; it refers to Pieter Bruegel the Elder and another parable about blindness and the truth :)