Bioequivalence and Bioavailability Forum • Data for 2nd stage of Potvin’s designs

BE-proff
●

2017-02-18 07:40
(2623 d 03:29 ago)

Posting: # 17078
Views: 14,071

Data for 2nd stage of Potvin’s designs [Two-Stage / GS Designs]

Hi All,

I have some questions on Potvin's designs: ;-)

1) Is it correct that GMR and CV are required for the sample size calculation for the 2nd stage in all of Potvin's designs?

2) Why is Method C considered "better" for sponsors than Method B?

The 1st question arose because Potvin's article dated 2007 says that only CV is needed for A and B while Method C uses both GMR and CV.

But on the other hand there are also opinions that all methods require GMR and CV.

Unclear.... :confused:

ElMaestro
★★★

Denmark,
2017-02-18 11:20
(2622 d 23:49 ago)

@ BE-proff
Posting: # 17079
Views: 13,224

Data for 2nd stage of Potvin’s designs

Hi BEproff,

❝ 1) Is it correct that GMR and CV are required for the sample size calculation for the 2nd stage in all of Potvin's designs?

Every sample size calculation requires that a CV and a GMR be plugged in. You have a choice between using the observed GMR or a fixed GMR like 0.95. There is a lot of confusion about it, but the short version is: Two-stage methods in all known forms only behave well if the true GMR is controlled. Therefore methods using GMR=0.95 (Potvin B and C and more) behave well when that criterion is met. Performance easily becomes abysmally bad if you are not in control of the GMR.

❝ 2) Why is Method C considered "better" for sponsors than Method B?

It is in the eye of the beholder. B has a bit lower observed alpha inflation than C. Asking why an authority prefers C to B, or why an airline passenger prefers beef over chicken, is not productive. :-D

❝ The 1st question arose because Potvin's article dated 2007 says that only CV is needed for A and B while Method C uses both GMR and CV.

Both use GMR=0.95 regardless of observed GMR.
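To make the roles of the two inputs concrete, here is a minimal base-R sketch of the stage 2 planning step in a Method-B-like setting. It assumes a 2×2×2 crossover, the noncentral t approximation of TOST power, and made-up stage 1 results (n1 = 24, observed CV 30%). `power_tost` and `n_total` are illustrative helpers written for this example, not functions of Power2Stage or PowerTOST, so the numbers may differ slightly from what those packages report.

```r
# Approximate TOST power (noncentral t) for a 2x2x2 crossover.
# n = total sample size, CV = within-subject CV (ratio), theta0 = assumed GMR.
power_tost <- function(n, CV, theta0, alpha, theta1 = 0.80, theta2 = 1.25) {
  df   <- n - 2
  se   <- sqrt(2 * log(CV^2 + 1) / n)
  tval <- qt(1 - alpha, df)
  d1   <- (log(theta0) - log(theta1)) / se
  d2   <- (log(theta0) - log(theta2)) / se
  max(0, pt(-tval, df, ncp = d2) - pt(tval, df, ncp = d1))
}

# Smallest even total sample size reaching the target power.
n_total <- function(CV, GMR, alpha, target = 0.80) {
  stopifnot(GMR > 0.80, GMR < 1.25)
  n <- 4
  while (power_tost(n, CV, GMR, alpha) < target) n <- n + 2
  n
}

n1      <- 24      # hypothetical stage 1 sample size
cv_obs  <- 0.30    # hypothetical CV observed in stage 1
gmr_fix <- 0.95    # fixed GMR, NOT the GMR observed in stage 1
alpha   <- 0.0294  # adjusted alpha of Potvin's Method B

interim_power <- power_tost(n1, cv_obs, gmr_fix, alpha)
n_tot <- n_total(cv_obs, gmr_fix, alpha)
n2    <- max(0, n_tot - n1)   # subjects to add in stage 2
cat("interim power:", round(interim_power, 3), " n2:", n2, "\n")
```

Both the CV (taken from stage 1) and a GMR (fixed at 0.95) enter the calculation; only the source of the GMR differs between the fixed and the fully adaptive approaches.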

Yes, all this is bloody confusing and it is easy to make the wrong decisions in this area. If you just remember the sentence in red above and base your decisions on it, you'll be more or less fine.

—
Pass or fail!
ElMaestro

BE-proff
●

2017-02-18 14:29
(2622 d 20:40 ago)

@ ElMaestro
Posting: # 17083
Views: 13,114

Data for 2nd stage of Potvin’s designs

Hi ElMaestro,

What do you mean by controlled GMR - being within 0.95-1.05? Correct?

So, if stage 1 of any method shows a GMR of 1.19, which figure should be taken for stage 2: 0.95 or 1.19?

Helmut
★★★

Vienna, Austria,
2017-02-18 17:23
(2622 d 17:47 ago)

@ BE-proff
Posting: # 17085
Views: 13,364

GMR = fixed!

Hi BE-proff,

❝ What do you mean by controlled GMR […]

Let me answer for our ol’ Capt’n: The GMR in the estimation of interim power (stage 1) and in the sample size estimation for the second stage is fixed (and not the observed one).

❝ - being within 0.95-1.05? Correct? ;-)

Nope. The observed one can be anything. For Potvin’s B and C it is 0.95 (or 1/0.95 if you prefer). Other methods use other fixed GMRs (the A in my figures below). Look them up in the publications (AFAIK, only methods with 0.95 and 0.90 are published).

❝ So, if stage 1 of any method shows a GMR of 1.19, which figure should be taken for stage 2: 0.95 or 1.19? :confused:

For most methods 0.95. If you already expect a “bad” GMR, you could opt for Montague’s or one of Anders’ methods which use a fixed GMR of 0.90 (or 1/0.90). Alternatively you could work with one of the methods with a futility criterion (stopping in stage 1).
Only fully adaptive methods (e.g., by Karalis and Macheras) would use the observed GMR 1.19. See my paper mentioned below why this might not be a good idea…*

For beginners… Of course, the original methods can be tweaked in such a way that the power is sufficient. But that requires exhaustive simulations.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Helmut
★★★

Vienna, Austria,
2017-02-18 12:51
(2622 d 22:19 ago)

@ BE-proff
Posting: # 17081
Views: 13,636

“Type 1” slightly higher power than “Type 2” for the same adj. α

Hi BE-proff,

I agree with ElMaestro.

❝ 2) Why is Method C considered "better" for sponsors than Method B?

Unfortunately there is an “inflation” of letters denoting methods.
Therefore, I suggested* to use “Type 1” (B, E, …) and “Type 2” (C, D, C/D, F, …) instead.

“Type 1”

[image]

“Type 2”

[image]

In “Type 2” TSDs the conventional (unadjusted) α 0.05 may be used in the first stage (dependent on interim power). Hence, under certain conditions you have a decent chance to stop already in the first stage with no sample size penalty (due to the mandatory adjusted α in “Type 1” TSDs).
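The stage 1 logic of the two scheme types can be sketched as a small base-R function, following the decision trees published by Potvin et al. for Methods B and C. The function name and its inputs are hypothetical: the interim power and the two pass/fail flags are assumed to have been computed beforehand (interim power with the adjusted α in “Type 1” and with 0.05 in “Type 2”).

```r
# Stage 1 decision of the two scheme types (sketch).
# interim_power : power at stage 1 for the fixed GMR
# be_adj        : TRUE if the CI at the adjusted alpha lies within the limits
# be_unadj      : TRUE if the conventional 90% CI lies within the limits
stage1_decision <- function(type, interim_power, be_adj, be_unadj = NA,
                            target = 0.80) {
  if (type == 1) {                        # e.g. Potvin's Method B
    if (be_adj) return("stop: BE")
    if (interim_power >= target) return("stop: not BE")
    return("continue to stage 2")
  }
  if (type == 2) {                        # e.g. Potvin's Method C
    if (interim_power >= target) {
      # enough power: evaluate once with the unadjusted alpha 0.05 and stop
      return(if (isTRUE(be_unadj)) "stop: BE" else "stop: not BE")
    }
    if (be_adj) return("stop: BE")
    return("continue to stage 2")
  }
  stop("type must be 1 or 2")
}

# Study with high interim power whose 90% CI passes, while the CI at the
# adjusted alpha does not (same interim power used for both types to keep
# the example simple):
stage1_decision(1, interim_power = 0.85, be_adj = FALSE, be_unadj = TRUE)
# "stop: not BE"
stage1_decision(2, interim_power = 0.85, be_adj = FALSE, be_unadj = TRUE)
# "stop: BE"
```

The last two calls show the case described above: with sufficient interim power a “Type 2” design may pass in stage 1 at the conventional α, whereas a “Type 1” design always pays the adjusted α.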

Potvin et al. recommended Method C over B due to its higher power. Examples (power by the noncentral t-approximation):

n1   CV (%)   B         C
12   10       0.97697   0.98858
24   20       0.88046   0.90882
36   30       0.83704   0.84676
48   40       0.82901   0.82838
60   50       0.82477   0.82405

* Schütz H. Two-stage designs in bioequivalence trials. Eur J Clin Pharmacol. 2015;71(3):271–81. doi:10.1007/s00228-015-1806-2

—
Helmut Schütz

BE-proff
●

2017-02-18 14:31
(2622 d 20:39 ago)

@ Helmut
Posting: # 17084
Views: 13,094

“Type 1” slightly higher power than “Type 2” for the same adj. α

Hi Helmut,

Rather risky - such methods must be present in our guides, otherwise our designs will be rejected...

Helmut
★★★

Vienna, Austria,
2017-02-18 17:27
(2622 d 17:43 ago)

@ BE-proff
Posting: # 17086
Views: 13,013

Terminology

Hi BE-proff,

❝ Rather risky - such methods must be present in our guides otherwise our designs will be rejected...

“Type 1” or “Type 2” was my proposal to introduce an unambiguous terminology. Get the paper at sci-hub. ;-)

On the contrary, the Russian guideline (copy-pasted from the EMA’s) is ambiguous (“For example, using 94.12% confidence intervals […] would be acceptable, but there are many acceptable alternatives and the choice of how much alpha to spend at the interim analysis is at the company's discretion”).

—
Helmut Schütz

Yura
★

Belarus,
2017-02-20 12:28
(2620 d 22:41 ago)

@ Helmut
Posting: # 17088
Views: 12,930

Terminology

Hi All,

If the T/R ratio of AUC and Cmax is outside 0.95–1.05, is that a violation of the conditions underlying the calculations in the adaptive design algorithm?

ElMaestro
★★★

Denmark,
2017-02-20 12:46
(2620 d 22:23 ago)

@ Yura
Posting: # 17089
Views: 13,108

Which GMR to plug in

Hi Yura,

there are not that many people working with these designs. But those who do have all tried to plug in the observed GMR from stage 1 for sample size calculation, rather than 0.95 as in Potvin B & C.

The result is strikingly bad news: The chance that you have a greater departure from 0.95 goes up as the sample size in stage 1 goes down, so you easily end up in scenarios where you need 800 subjects in stage 2 if you apply the observed GMR, and this may happen even if the true GMR is 0.95 or better. Of course you can put a cap on the maximum sample size, but then you are punished on power.
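A rough base-R sketch of how fast the total sample size grows once the plugged-in GMR approaches a limit. It assumes a 2×2×2 crossover, a CV of 30%, the adjusted α 0.0294, 80% target power, and the noncentral t approximation; `power_tost` and `n_total` are illustrative helpers, not package functions, and the GMR values are arbitrary examples.

```r
# Approximate TOST power (noncentral t) for a 2x2x2 crossover.
power_tost <- function(n, CV, theta0, alpha, theta1 = 0.80, theta2 = 1.25) {
  df   <- n - 2
  se   <- sqrt(2 * log(CV^2 + 1) / n)
  tval <- qt(1 - alpha, df)
  d1   <- (log(theta0) - log(theta1)) / se
  d2   <- (log(theta0) - log(theta2)) / se
  max(0, pt(-tval, df, ncp = d2) - pt(tval, df, ncp = d1))
}

# Smallest even total sample size reaching the target power.
n_total <- function(CV, GMR, alpha, target = 0.80) {
  stopifnot(GMR > 0.80, GMR < 1.25)
  n <- 4
  while (power_tost(n, CV, GMR, alpha) < target) n <- n + 2
  n
}

# GMRs that might be observed in a small stage 1 (hypothetical values)
gmr <- c(0.95, 0.90, 0.85, 1.19)
n   <- sapply(gmr, function(g) n_total(CV = 0.30, GMR = g, alpha = 0.0294))
print(data.frame(GMR = gmr, n_total = n))
```

With an observed GMR such as 1.19 the required total runs into the hundreds, although the very same stage 1 data evaluated with the fixed GMR of 0.95 would call for a far smaller study.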

Heartbreaking, really!! :crying:

As stated before: At a time when you do not know the true GMR very well (such as after the first stage) it is not a particularly good idea to base decisions on it (such as final sample size).

That is why Potvin's methods are great for formulations with a known and controlled GMR, but not great for new formulations where you don't know how well they match. Two-stage designs are useful for unknown CVs and not much else, at least in their present form.

Pilot trials suffer from the exact same issue. "They are better than nothing" is a sentence I have heard a few times, but it is often not the case. Depending on how you use the available info, you may well decide wrongly and be punished.

—
Pass or fail!
ElMaestro

BE-proff
●

2017-02-21 12:35
(2619 d 22:35 ago)

@ ElMaestro
Posting: # 17094
Views: 12,911

Which GMR to plug in

Hi ElMaestro,

I suppose you are talking about futility studies as an alternative to adaptive designs.

Why are futility designs more popular if the risks are similar?

ElMaestro
★★★

Denmark,
2017-02-21 12:47
(2619 d 22:22 ago)

@ BE-proff
Posting: # 17095
Views: 12,893

Which GMR to plug in

Hi BE-proff,

❝ I suppose you are talking about futility studies as an alternative to adaptive designs.

Errr.... what????

❝ Why are futility designs more popular if risks are similar? :confused:

Come again, what does this mean?
What is a futility design, futility study, how are they alternatives?

I meant to say that plugging in the observed GMR has a bunch of drawbacks, which often are plain showstoppers. The sample size explosion can be fixed by futility caps on sample size, and that of course keeps the sample size (and cost) down, but then power may be so low that the trial makes no sense and is nothing more than unethical exposure.

2-stage trials work well when you are absolutely sure about the match and only unsure about the variability.

—
Pass or fail!
ElMaestro

Helmut
★★★

Vienna, Austria,
2017-02-22 13:03
(2618 d 22:07 ago)

@ Yura
Posting: # 17096
Views: 12,807

Validated frameworks; observed GMR not relevant

Hi Yura,

❝ If the T/R ratio of AUC and Cmax is outside 0.95–1.05, is that a violation of the conditions underlying the calculations in the adaptive design algorithm? :confused:

Adjusted alphas of the published frameworks are only valid for certain ranges of n₁/CV-combinations, fixed GMRs, and target powers assessed (see this presentation, slide 20). F.i. Potvin’s α_adj 0.0294 in ‘Method B’ (“Type 1”) is valid for n₁ 12–60, CV 10–100%, fixed GMR 0.95, and target power 80%. The maximum Type I Error is generally seen at small stage 1 sample sizes and low CVs. Hence, even if the n₁ and/or CV were outside the validated range on the upper end (say for ‘Method B’ >60 and/or >100%) you can be pretty sure that the patient’s risk is still controlled. However, in such a case picky assessors might ask for simulations.
I would avoid performing the first stage in 12 subjects. Due to dropouts one may end up outside the validated range. Example for ‘Method B’:

library(Power2Stage)
power.2stage(CV=0.2, n1=12, alpha=c(0.0294, 0.0294), theta0=1.25,
             targetpower=0.8, pmethod="shifted", nsims=1e6)$pBE
# [1] 0.046352
# n₁ within validated range: TIE <0.05.
power.2stage(CV=0.2, n1=10, alpha=c(0.0294, 0.0294), theta0=1.25,
             targetpower=0.8, pmethod="shifted", nsims=1e6)$pBE
# [1] 0.048389
# n₁ outside validated range: higher TIE but still <0.05.

The GMR observed in the first stage is not relevant.

—
Helmut Schütz

Silva
☆

Portugal,
2017-03-09 02:26
(2604 d 08:43 ago)

@ Helmut
Posting: # 17143
Views: 12,522

Validated frameworks; observed GMR not relevant

Hi Helmut

Trying to learn technical issues of TSD.

In the example you gave:

library(Power2Stage)
power.2stage(CV=0.2, n1=12, alpha=c(0.0294, 0.0294), theta0=1.25,
             targetpower=0.8, pmethod="shifted", nsims=1e6)$pBE
# [1] 0.046352
# n₁ within validated range: TIE <0.05.

Why the use of theta0 as 1.25 and not as the GMR value? I understand the use of a GMR of 0.95 for Potvin's Method B (as it was validated with this assumption), but I don't understand the meaning of theta0.

According to Power2Stage manual, theta 0 corresponds to the True ratio of T/R for simulating. (Defaults to the GMR argument if missing).

What is the meaning of "True ratio of T/R for simulating" and what is the difference for GMR?

d_labes
★★★

Berlin, Germany,
2017-03-09 10:21
(2604 d 00:48 ago)

@ Silva
Posting: # 17144
Views: 12,411

GMR, theta 0 and that all

Dear Silva,

❝ What is the meaning of "True ratio of T/R for simulating" and what is the difference for GMR?

To obtain a value of the type I error or the power via simulations one has to create data for many studies, say 1 million, apply the TSD framework (using GMR=0.95 for instance, the A in Helmut's schemes above) to that data, and count all studies in which BE was concluded.

To create the study data you use the "True ratio of T/R" and the "True CV" and the statistical distributions relating these parameters to the observed outcomes of a study.

A true ratio of 1.25 will create studies which are under the null hypothesis "bioinequivalence". Counting the studies which nevertheless conclude BE gives you the type I error ("null hypothesis true but rejected").
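The counting idea can be tried out in a few lines of base R. The sketch below deliberately uses a plain single-stage 2×2×2 crossover, not a two-stage framework, and hypothetical settings (n = 24, true CV 20%). It draws the log point estimate from a normal distribution and the residual variance from a scaled chi-squared distribution instead of simulating individual subjects.

```r
set.seed(123456)
nsims  <- 1e5
n      <- 24     # total sample size (hypothetical)
CV     <- 0.20   # "true" CV used to create the studies
theta0 <- 1.25   # "true" T/R ratio: at the border of the null hypothesis
alpha  <- 0.05

df  <- n - 2
mse <- log(CV^2 + 1)
se  <- sqrt(2 * mse / n)

pe <- rnorm(nsims, mean = log(theta0), sd = se)  # simulated log point estimates
s2 <- mse * rchisq(nsims, df) / df               # simulated residual variances
hw <- qt(1 - alpha, df) * sqrt(2 * s2 / n)       # half-width of the 90% CI

# A study "passes" if its 90% CI lies entirely within 80-125%
passed <- (pe - hw >= log(0.80)) & (pe + hw <= log(1.25))
tie    <- mean(passed)   # fraction of truly inequivalent studies passing
print(tie)
```

Setting `theta0` to 0.95 instead of 1.25 turns the same count into an estimate of power. Power2Stage follows the same principle, with the complete two-stage decision scheme applied to each simulated study.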

Hope this was understandable.

—
Regards,

Detlew

Silva
☆

Portugal,
2017-03-09 13:38
(2603 d 21:31 ago)

@ d_labes
Posting: # 17145
Views: 12,492

GMR, theta 0 and that all

Dear d_labes
Many thanks for your explanations. So just to clarify my mind…
Using theta0 in Power2Stage as 0.8 or 1.25, I’m informing the system that, after studies have been simulated based on expected GMR, n1, CV and target power, the test product is truly not bioequivalent, because the true T/R is 0.8 or 1.25 and therefore the respective 90% CI will always be outside the [0.80, 1.25] bioequivalence range.
The algorithm will then calculate the number of simulated studies that wrongly rejected the Null Hypothesis and divide this number by the total number of simulated studies. The ratio represents TIE.

Considering a study design under Potvin’s Method B framework (“Type 1” TSD), i.e. an expected GMR of 0.95, an n1 between 12 and 60 subjects, a CV between 10 and 100% and a target power of 0.8, no simulations are required if, at the end of the study, GMR was 0.95 and CV between 10 and 100%. Am I thinking appropriately?
But if expected GMR is for example 0.91, n1 between 12 and 60 subjects, CV between 10 and 100% and target power is 0.8, there is a violation of method B assumptions, right?
And therefore simulations are needed based on true data at the end of trial in order to calculate if TIE was below the nominal alpha of 0.05. So, assuming a final GMR of 0.91, a CV of 34%, n1 = 16, a target power of 0.8, and no futility rule, power2stage simulation conditions would be:

power.2stage(method = c("B"), alpha0 = 0.05, alpha = c(0.0294, 0.0294),
             n1 = 16, GMR = 0.91, CV = 0.34, targetpower = 0.8,
             pmethod = c("nct"), usePE = FALSE, Nmax = Inf, min.n2 = 0,
             theta0 = 0.8, theta1 = 0.8, theta2 = 1.25,
             npct = c(0.05, 0.5, 0.95), setseed = TRUE, details = TRUE)

With this simulation scenario:

1e+05 sims. Stage 1 - Time consumed (secs):
   user  system elapsed 
    0.4     0.0     0.4 
Keep calm. Sample sizes for stage 2 (98482 studies)
will be estimated. May need some time.
Time consumed (secs):
   user  system elapsed 
    1.3     0.0     1.3 
Total time consumed (secs):
   user  system elapsed 
      2       0       2 

Method B: alpha (s1/s2) = 0.0294 0.0294 
Target power in power monitoring and sample size est. = 0.8
BE margins = 0.8 ... 1.25
CV = 0.34; n(stage 1)= 16; GMR = 0.91
GMR = 0.91 and mse of stage 1 in sample size est. used
Futility criterion Nmax = Inf

1e+05 sims at theta0 = 0.8 (p(BE)='alpha').
p(BE)    = 0.04385
p(BE) s1 = 0.01512
Studies in stage 2 = 98.48%

Distribution of n(total)
- mean (range) = 100.5 (16 ... 332)
- percentiles
 5% 50% 95% 
 46  96 170

Based on these results:

Type I error with the 2 stages was 0.04385 (and therefore <0.05) for the 1e+05 simulated studies
Type I error for the first stage was 0.01512
98.48% of the simulated studies went into stage 2. The other 1.52% of the studies ended in stage 1 (either as BE or non-BE)

Am I interpreting these results correctly?
Best rgs and thks for all the patience!

ElMaestro
★★★

Denmark,
2017-03-09 13:56
(2603 d 21:14 ago)

@ Silva
Posting: # 17146
Views: 12,415

GMR, theta 0 and that all

Hi Silva,

I am not d_labes or Helmut but I have an opinion, too, and you are asking some bloody good questions there :-)

❝ The algorithm will then calculate the number of simulated studies that wrongly rejected the Null Hypothesis and divide this number by the total number of simulated studies. The ratio represents TIE.

Actually I'd prefer to say it is the maximum type 1 error. The type 1 error is the chance of concluding BE for a product that isn't BE, which is when the true ratio is outside the acceptance range. So when we work with 80.00%-125.00% a product is truly not BE when the true ratio is 72%, 77%, or 79%. But these three levels will be associated with different chances of concluding BE. Regardless of how inequivalent a product is, we aim for methods that give a maximum type 1 error of 5%. By the nature of the game, the type 1 error becomes smaller as the true ratio moves further away from the limits.
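For a feeling of the numbers, the chance of concluding BE at these true ratios can be approximated in base R. The sketch assumes a plain single-stage 2×2×2 crossover with hypothetical settings (n = 24, CV 20%, α 0.05) and the noncentral t approximation; `power_tost` is an illustrative helper, not a package function.

```r
# Approximate probability of passing TOST (noncentral t), 2x2x2 crossover.
power_tost <- function(n, CV, theta0, alpha = 0.05,
                       theta1 = 0.80, theta2 = 1.25) {
  df   <- n - 2
  se   <- sqrt(2 * log(CV^2 + 1) / n)
  tval <- qt(1 - alpha, df)
  d1   <- (log(theta0) - log(theta1)) / se
  d2   <- (log(theta0) - log(theta2)) / se
  max(0, pt(-tval, df, ncp = d2) - pt(tval, df, ncp = d1))
}

# True ratios at and below the lower limit of 0.80
ratios <- c(0.72, 0.77, 0.79, 0.80)
p_be   <- sapply(ratios, function(r) power_tost(n = 24, CV = 0.20, theta0 = r))
print(round(setNames(p_be, ratios), 4))
```

The value at 0.80 is the maximum (about α), and it shrinks as the true ratio drops to 0.79, 0.77 and 0.72.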

❝ Considering a study design under Potvin’s Method B framework (“Type 1” TSD), i.e. an expected GMR of 0.95, an n1 between 12 and 60 subjects, a CV between 10 and 100% and a target power of 0.8, no simulations are required if, at the end of the study, GMR was 0.95 and CV between 10 and 100%. Am I thinking appropriately?

For any approved method I know of it does not have anything to do with what the GMR or CV was at the end of the study as long as you showed BE.

❝ But if expected GMR is for example 0.91, n1 between 12 and 60 subjects, CV between 10 and 100% and target power is 0.8, there is a violation of method B assumptions, right?

No. See above.

❝ And therefore simulations are needed based on true data (...)

No, this isn't exactly how it works. The "true data" you refer to are your observations - they give an estimate but they do not give you the true ratio.

—
Pass or fail!
ElMaestro

d_labes
★★★

Berlin, Germany,
2017-03-09 14:55
(2603 d 20:15 ago)

@ Silva
Posting: # 17147
Views: 12,563

GMR, theta0 and that all

Dear Silva,

additionally to what our great Maestro said:

You have to know beforehand whether the settings you use in a TSD (i.e. adj. alpha, fixed GMR, target power, n1, inclusion of futility rules, or others) are susceptible to alpha inflation over a range of reasonable true CVs.
Potvin et al. have shown this by simulations for adj. alpha = 0.0294, target power 0.8 and n1 12-60 for a range of true CVs of 10-100%, at least for the Type 1 (aka Method B) decision scheme. For the Type 2 (aka Method C) decision scheme some guys believe they have seen an alpha inflation in the numbers given.

Anders the Great has derived adj. alpha values for other settings of the fixed GMR and targetpower preserving also the maximum TIE <=0.05.

Fuglsang A. Controlling type I errors for two-stage bioequivalence study designs. Clin Res Regul Aff. 2011;28(4):100–5. doi:10.3109/10601333.2011.631547
Fuglsang A. Sequential Bioequivalence Trial Designs with Increased Power and Controlled Type I Error Rates. AAPS J. 2013;15(3):659–61. doi:10.1208/s12248-013-9475-5

If you change any of these settings you have to show again that no alpha inflation is to be expected. By simulations (... although there is some rumor that some leading regulatory agencies don't like simulations).
And that is the purpose for which the package Power2Stage was invented.

The observed GMR(s) and CV in your actual study don't play a role in that game. Not known beforehand ;-)

One exception: If you observe a CV > the 'validated' range it may be wise to do simulations with that CV assumed as TRUE.

—
Regards,

Detlew

Silva
☆

Portugal,
2017-03-09 19:01
(2603 d 16:09 ago)

@ d_labes
Posting: # 17148
Views: 12,387

GMR, theta0 and that all

Dear ElMaestro and d_labes,

Many thanks for your replies and elucidations. I'll carefully read Fuglsang's paper (2011). I've already studied the other two papers from the same author (2013 and 2014), as well as Helmut's nice review paper.

Best rgds

BE-proff
●

2017-02-21 12:31
(2619 d 22:38 ago)

@ Helmut
Posting: # 17093
Views: 12,826

“Type 1” slightly higher power than “Type 2” for the same adj. α

Hi Helmut,

Thank you for the clarification!