BEproff Senior Russia, 2017-02-18 05:40 Posting: # 17078

Hi All,

I have some questions on Potvin's designs:
1) Is it correct that GMR and CV are required for sample size calculation for the 2nd stage in all Potvin's designs?
2) Why is Method C considered "better" for sponsors than Method B?

The 1st question arose because Potvin's article dated 2007 says that only CV is needed for A and B while Method C uses both GMR and CV. But on the other hand there are also opinions that all methods require GMR and CV. Unclear...
ElMaestro Hero Denmark, 2017-02-18 09:20 @ BEproff Posting: # 17079

Hi BEproff,

» 1) Is it correct that GMR and CV are required for sample size calculation for the 2nd stage in all Potvin's designs?

Every sample size calculation requires that a CV and a GMR be plugged in. You have a choice between using the observed GMR or a fixed GMR like 0.95. There is a lot of confusion about it, but the short version is: two-stage methods in all known forms only behave well if the true GMR is controlled. Therefore methods using GMR=0.95 (Potvin B and C and more) behave well when that criterion is true. Performance easily becomes abysmally bad if you are not in control of the GMR.

» 2) Why is Method C considered "better" for sponsors than Method B?

It is in the eye of the beholder. B has a bit lower observed alpha inflation than C. Asking why an authority prefers C to B, or why an airline passenger prefers beef over chicken, is not productive.

» The 1st question arose because Potvin's article dated 2007 says that only CV is needed for A and B while Method C uses both GMR and CV.

Both use GMR=0.95 regardless of the observed GMR.

Yes, all this is bloody confusing and it is easy to take the wrong decisions in this area. If you just remember the short version above and base your decisions on it, you'll be more or less fine.

“A ten-year, double-blind study from the Mayo Clinic concluded that even in late stages of dementia, the last to go is the lobe of the brain in charge of cafeteria layout.” (Serge Storms/Tim Dorsey)
Best regards, ElMaestro
Bootstrapping is a relatively new hobby of mine. I am only 30 years late to the party.
BEproff Senior Russia, 2017-02-18 12:29 @ ElMaestro Posting: # 17083

Hi ElMaestro,

What do you mean by controlled GMR - being within 0.95–1.05? Correct?
So, if stage 1 of any method shows a GMR of 1.19, which figure should be taken for stage 2: 0.95 or 1.19?
Helmut Hero Vienna, Austria, 2017-02-18 15:23 @ BEproff Posting: # 17085

Hi BEproff,

» What do you mean by controlled GMR […]

Let me answer for our ol’ Capt’n: The GMR in the estimation of interim power (stage 1) and in the sample size estimation for the second stage is fixed (and not the observed one).

» being within 0.95–1.05? Correct?

Nope. The observed one can be anything. For Potvin’s B and C the fixed GMR is 0.95 (or 1/0.95 if you prefer). Other methods use other fixed GMRs (the A in my figures below). Look them up in the publications (AFAIK, only methods with 0.95 and 0.90 are published).

» So, if stage 1 of any method shows a GMR of 1.19, which figure should be taken for stage 2: 0.95 or 1.19?

For most methods 0.95. If you already expect a “bad” GMR, you could opt for Montague’s or one of Anders’ methods which use a fixed GMR of 0.90 (or 1/0.90). Alternatively you could work with one of the methods with a futility criterion (stopping in stage 1). Only fully adaptive methods (e.g., by Karalis and Macheras) would use the observed GMR 1.19. See my paper mentioned below why this might not be a good idea…*
Cheers, Helmut Schütz
The quality of responses received is directly proportional to the quality of the question asked.
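
To put numbers on the "0.95 or 1.19" question, here is a minimal sketch in R. It assumes the PowerTOST package is installed and uses a hypothetical stage 1 CV of 25%; neither the CV nor the variable names come from the thread. It compares the total sample size for 80% power at the adjusted α 0.0294 when the fixed GMR of the framework is used and when the observed GMR of 1.19 is plugged in. In Potvin's schemes the second stage would get this total minus n1.

```r
library(PowerTOST)  # assumption: package is installed

alpha_adj <- 0.0294  # adjusted alpha of Potvin's Methods B and C
CV        <- 0.25    # hypothetical CV estimated in stage 1

# Total sample size for 80% power with the fixed GMR of the framework ...
n_fixed <- sampleN.TOST(alpha = alpha_adj, CV = CV, theta0 = 0.95,
                        targetpower = 0.80, design = "2x2",
                        print = FALSE)[["Sample size"]]

# ... and with the observed GMR of 1.19 plugged in instead
n_obs <- sampleN.TOST(alpha = alpha_adj, CV = CV, theta0 = 1.19,
                      targetpower = 0.80, design = "2x2",
                      print = FALSE)[["Sample size"]]

c(fixed_GMR = n_fixed, observed_GMR = n_obs)
```

With an observed GMR this close to the upper limit of 1.25 the second number should come out many times larger than the first, which is the practical argument for keeping the GMR fixed.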
Helmut Hero Vienna, Austria, 2017-02-18 10:51 @ BEproff Posting: # 17081

Hi BEproff,

I agree with ElMaestro.

» 2) Why is Method C considered "better" for sponsors than Method B?

Unfortunately there is an “inflation” of letters denoting methods. Therefore, I suggested* to use “Type 1” (B, E, …) and “Type 2” (C, D, C/D, F, …) instead.

[Figures: decision schemes of “Type 1” and “Type 2” TSDs]

In “Type 2” TSDs the conventional (unadjusted) α 0.05 may be used in the first stage (dependent on interim power). Hence, under certain conditions you have a decent chance to stop already in the first stage with no sample size penalty (the penalty in “Type 1” TSDs is due to the mandatory adjusted α). Potvin et al. recommended Method C over B due to its higher power.

Examples (power by the noncentral t-approximation):
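
The original examples are not preserved in this copy of the thread. As a rough stand-in, the sketch below assumes the PowerTOST package and a hypothetical first stage of 24 subjects with a CV of 20%. It compares the power to show BE in stage 1 with the unadjusted α 0.05 (possible in “Type 2” designs if interim power is high enough) and with the adjusted α 0.0294 (mandatory in “Type 1” designs).

```r
library(PowerTOST)  # assumption: package is installed

n1  <- 24    # hypothetical stage 1 sample size
CV  <- 0.20  # hypothetical CV
GMR <- 0.95  # fixed GMR of Potvin's Methods B and C

pwr <- c(
  unadjusted = power.TOST(alpha = 0.05,   CV = CV, theta0 = GMR,
                          n = n1, method = "nct"),
  adjusted   = power.TOST(alpha = 0.0294, CV = CV, theta0 = GMR,
                          n = n1, method = "nct"))
round(pwr, 4)
```

The difference between the two values is the price paid in stage 1 for the adjusted α.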
Cheers, Helmut Schütz
BEproff Senior Russia, 2017-02-18 12:31 @ Helmut Posting: # 17084

Hi Helmut,

Rather risky - such methods must be present in our guides, otherwise our designs will be rejected...
Helmut Hero Vienna, Austria, 2017-02-18 15:27 @ BEproff Posting: # 17086

Hi BEproff,

» Rather risky - such methods must be present in our guides, otherwise our designs will be rejected...

“Type 1” or “Type 2” was my proposal to introduce an unambiguous terminology. Get the paper at Sci-Hub. In contrast, the Russian guideline (copy-pasted from the EMA’s) is ambiguous (“For example, using 94.12% confidence intervals […] would be acceptable, but there are many acceptable alternatives and the choice of how much alpha to spend at the interim analysis is at the company's discretion”).

Cheers, Helmut Schütz
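
The 94.12% in the quoted guideline text corresponds to an adjusted α of 0.0294, because the confidence interval used in the BE decision has a level of 100(1 − 2α)%. A one-line check in R (the variable names are only illustrative):

```r
alpha_adj <- 0.0294                     # adjusted alpha, as in Potvin's Method B
ci_level  <- 100 * (1 - 2 * alpha_adj)  # level of the corresponding CI in percent
ci_level                                # 94.12
```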
Yura Regular Belarus, 2017-02-20 10:28 @ Helmut Posting: # 17088

Hi All,

If the T/R ratio of AUC and Cmax is beyond 0.95–1.05, is that a violation of the conditions of the calculation for the algorithm of the adaptive design?
ElMaestro Hero Denmark, 2017-02-20 10:46 @ Yura Posting: # 17089

Hi Yura,

there are not that many people working with these designs. But those who do have all tried to plug in the observed GMR from stage 1 for sample size calculation, rather than 0.95 as in Potvin B & C. The result is strikingly bad news: the chance of a large departure from 0.95 goes up as the sample size in stage 1 goes down, so you easily end up in scenarios where you need 800 subjects in stage 2 if you apply the observed GMR, and this may happen even if the true GMR is 0.95 or better. Of course you can put a cap on the maximum sample size, but then you are punished on power. Heartbreaking, really!!

As stated before: at a time when you do not know the true GMR very well (such as after the first stage) it is not a particularly good idea to base decisions on it (such as the final sample size). That is why Potvin's methods are great for formulations with known and controlled GMR, but not great for new formulations where you don't know how they match. Two-stage designs are useful for unknown CVs and not much else, at least in the present form.

Pilot trials suffer from the exact same issue. "They are better than nothing" is a sentence I have heard a few times, but it is often not the case. Depending on how you use the available info, you may well decide wrongly and be punished.

Best regards, ElMaestro
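
How often a small first stage produces a misleading point estimate can be sketched in base R. The function below is hypothetical (not from any package) and assumes a balanced 2×2 crossover with n1 subjects in total, a true GMR of 0.95 and a CV of 30%. It gives the probability that the observed GMR falls outside 0.90 to 1/0.90 purely by chance.

```r
# Probability that the stage 1 point estimate lies outside limit ... 1/limit
# although the true GMR is true_GMR (balanced 2x2 crossover, n1 subjects in total).
p_far_off <- function(n1, CV, true_GMR = 0.95, limit = 0.90) {
  s2w <- log(CV^2 + 1)       # within-subject variance on the log scale
  se  <- sqrt(2 * s2w / n1)  # standard error of the log point estimate
  pnorm(log(limit), mean = log(true_GMR), sd = se) +
    pnorm(log(1 / limit), mean = log(true_GMR), sd = se, lower.tail = FALSE)
}

n1 <- c(12, 24, 48)
p  <- p_far_off(n1, CV = 0.30)
round(setNames(p, paste0("n1=", n1)), 3)
```

The probability shrinks as n1 grows, which is the pattern described above: the smaller the first stage, the less the observed GMR can be trusted as input for the final sample size.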
BEproff Senior Russia, 2017-02-21 10:35 @ ElMaestro Posting: # 17094

Hi ElMaestro,

I suppose you are talking about futility studies as an alternative to adaptive designs. Why are futility designs more popular if the risks are similar?
ElMaestro Hero Denmark, 2017-02-21 10:47 @ BEproff Posting: # 17095

Hi BEproff,

» I suppose you are talking about futility studies as an alternative to adaptive designs.

Errr.... what????

» Why are futility designs more popular if the risks are similar?

Come again, what does this mean? What is a futility design or a futility study, and how are they alternatives?

I meant to say that plugging in the observed GMR has a bunch of drawbacks, which often are plain showstoppers. The sample size explosion can be fixed by futility caps on sample size, and that of course keeps the sample size (and cost) down, but then power may be so low that the trial makes no sense and is nothing more than unethical exposure.
Two-stage trials work well when you are absolutely sure about the match and only unsure about the variability.

Best regards, ElMaestro
Helmut Hero Vienna, Austria, 2017-02-22 11:03 @ Yura Posting: # 17096

Hi Yura,

» If the T/R ratio of AUC and Cmax is beyond 0.95–1.05, is that a violation of the conditions of the calculation for the algorithm of the adaptive design?

The adjusted alphas of the published frameworks are only valid for the ranges of n1/CV combinations, fixed GMRs, and target powers that were assessed (see this presentation, slide 20). For instance, Potvin’s adjusted α 0.0294 in ‘Method B’ (“Type 1”) is valid for n1 12–60, CV 10–100%, fixed GMR 0.95, and target power 80%. The maximum Type I Error is generally seen at small stage 1 sample sizes and low CVs. Hence, even if n1 and/or CV were outside the validated range on the upper end (say for ‘Method B’ >60 and/or >100%) you can be pretty sure that the patient’s risk is still controlled. However, in such a case picky assessors might ask for simulations.
I would avoid performing the first stage in 12 subjects. Due to dropouts one may end up outside the validated range.

Example for ‘Method B’:
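
The original listing is not preserved in this copy of the thread. The sketch below only illustrates what such a check might look like; it is not a reconstruction of the lost example. It assumes the Power2Stage package and uses `power.2stage()` with the argument names that appear later in this thread. Later releases of the package may expose the same function as `power.tsd()`, so the name may need adjusting. The values n1 = 24 and CV = 20% are hypothetical choices inside the validated range.

```r
library(Power2Stage)  # assumption: package is installed

# 'Method B' with its published adjusted alpha; n1 and CV within the validated range.
# theta0 = 1.25 simulates under the null (true ratio at the upper BE limit),
# so the fraction of simulated studies passing BE estimates the Type I Error.
res <- power.2stage(method = "B", alpha = c(0.0294, 0.0294), n1 = 24,
                    GMR = 0.95, CV = 0.20, targetpower = 0.80,
                    pmethod = "nct", theta0 = 1.25, nsims = 1e6)
res$pBE  # empirical Type I Error
```

Note that the fixed GMR of 0.95 drives the interim power and the sample size re-estimation, whereas theta0 only defines the "truth" under which the studies are simulated.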
The GMR observed in the first stage is not relevant.

Cheers, Helmut Schütz
Silva Junior Portugal, 2017-03-09 00:26 @ Helmut Posting: # 17143

Hi Helmut,

I am trying to learn the technical issues of TSDs. In the example you gave (library(Power2Stage) # n1 within validated range: TIE <0.05), why is theta0 set to 1.25 and not to the GMR value? I understand the use of a GMR of 0.95 for Potvin's Method B (as it was validated with this assumption), but I don't understand the meaning of theta0. According to the Power2Stage manual, theta0 corresponds to the "True ratio of T/R for simulating. (Defaults to the GMR argument if missing)". What is the meaning of "True ratio of T/R for simulating" and what is the difference to GMR?
d_labes Hero Berlin, Germany, 2017-03-09 08:21 @ Silva Posting: # 17144

Dear Silva,

» What is the meaning of "True ratio of T/R for simulating" and what is the difference to GMR?

To obtain a value of the type 1 error or the power via simulations one has to create data for many studies, say 1 million, apply the TSD framework (using GMR=0.95 for instance, the A in Helmut's schemes above) to that data and count all studies in which BE was concluded.
To create the study data you use the "True ratio of T/R" and the "True CV" and the statistical distributions relating these parameters to the observed outcomes of a study.
A true ratio of 1.25 creates studies under the null hypothesis of bioinequivalence; counting the studies which nevertheless conclude BE gives you the type I error (null hypothesis true but rejected).

Hope this was understandable.

Regards, Detlew
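
The counting procedure described above can be sketched in a few lines of base R. To keep it short, the sketch uses a plain fixed-sample 2×2 crossover rather than a full two-stage scheme. The function `sim_pBE()` and the settings (n = 24, CV 20%) are hypothetical. Studies are not simulated subject by subject; instead the point estimate is drawn from a normal distribution and the residual variance from a scaled chi-squared distribution, both governed by the "true" ratio and "true" CV.

```r
set.seed(123456)

# Fraction of simulated 2x2 crossover studies that pass BE when the truth is
# theta0 (true T/R ratio) and CV (true within-subject CV); n = total subjects.
sim_pBE <- function(theta0, CV, n, alpha = 0.05, nsims = 1e5,
                    theta1 = 0.80, theta2 = 1.25) {
  s2w <- log(CV^2 + 1)                           # true variance on the log scale
  df  <- n - 2
  pe  <- rnorm(nsims, mean = log(theta0), sd = sqrt(2 * s2w / n))  # point estimates
  mse <- s2w * rchisq(nsims, df) / df            # observed residual variances
  hw  <- qt(1 - alpha, df) * sqrt(2 * mse / n)   # half-width of the CI
  mean(pe - hw >= log(theta1) & pe + hw <= log(theta2))
}

tie <- sim_pBE(theta0 = 1.25, CV = 0.20, n = 24)  # truth at the BE limit: Type I Error
pwr <- sim_pBE(theta0 = 0.95, CV = 0.20, n = 24)  # truth well inside: power
c(TIE = tie, power = pwr)
```

The same function gives the Type I Error or the power depending only on which "true ratio" is fed in, which is the role theta0 plays in Power2Stage.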
Silva Junior Portugal, 2017-03-09 11:38 @ d_labes Posting: # 17145

Dear d_labes,

Many thanks for your explanations. So just to clarify my mind…

Using theta0 in Power2Stage as 0.8 or 1.25, I’m informing the system that, after the studies have been simulated based on the expected GMR, n1, CV and target power, the test product is truly not bioequivalent, because the true T/R is 0.8 or 1.25 and therefore the respective 90% CI will always be outside the [0.80, 1.25] bioequivalence range. The algorithm will then calculate the number of simulated studies that wrongly rejected the null hypothesis and divide this number by the total number of simulated studies. The ratio represents the TIE.

Considering a study design under Potvin’s Method B framework (Type 1 TSD), i.e. an expected GMR of 0.95, an n1 between 12 and 60 subjects, a CV between 10 and 100% and a target power of 0.8, no simulations are required if, at the end of the study, the GMR was 0.95 and the CV between 10 and 100%. Am I thinking appropriately?

But if the expected GMR is for example 0.91, n1 between 12 and 60 subjects, CV between 10 and 100% and the target power is 0.8, there is a violation of the Method B assumptions, right? And therefore simulations are needed based on true data at the end of the trial in order to calculate if the TIE was below the nominal alpha of 0.05.

So, assuming a final GMR of 0.91, a CV of 34%, n1 = 16, a target power of 0.8, and no futility rule, the Power2Stage simulation conditions would be:

power.2stage(method = "B", alpha0 = 0.05, alpha = c(0.0294, 0.0294), n1 = 16, GMR = 0.91, CV = 0.34, targetpower = 0.8, pmethod = "nct", usePE = FALSE, Nmax = Inf, min.n2 = 0, theta0 = 0.8, theta1 = 0.8, theta2 = 1.25, npct = c(0.05, 0.5, 0.95), setseed = TRUE, details = TRUE)

Based on these results:
Best rgs and thks for all the patience! 
ElMaestro Hero Denmark, 2017-03-09 11:56 @ Silva Posting: # 17146

Hi Silva,

I am not d_labes or Helmut but I have an opinion, too, and you are asking some bloody good questions there.

» The algorithm will then calculate the number of simulated studies that wrongly rejected the null hypothesis and divide this number by the total number of simulated studies. The ratio represents the TIE.

Actually I'd prefer to say it is the maximum type 1 error. The type 1 error is the chance of concluding BE for a product that isn't BE, which is when the true ratio is outside the acceptance range. So when we work with 80.00%–125.00%, a product is truly not BE when the true ratio is 72%, 77% or 79%. But these three levels will be associated with different levels of power. Regardless of how inequivalent a product is, we aim for methods that give a maximum type 1 error of 5%. By the nature of the game, the type 1 error becomes smaller as the true ratio moves further away from the limit.

» Considering a study design under Potvin’s Method B framework (Type 1 TSD), i.e. an expected GMR of 0.95, an n1 between 12 and 60 subjects, a CV between 10 and 100% and a target power of 0.8, no simulations are required if, at the end of the study, the GMR was 0.95 and the CV between 10 and 100%. Am I thinking appropriately?

For any approved method I know of, it has nothing to do with what the GMR or CV was at the end of the study, as long as you showed BE.

» But if the expected GMR is for example 0.91, n1 between 12 and 60 subjects, CV between 10 and 100% and the target power is 0.8, there is a violation of the Method B assumptions, right?

No. See above.

» And therefore simulations are needed based on true data (...)

No, this isn't exactly how it works. The "true data" you refer to are your observations - they give an estimate but they do not give you the true ratio.

Best regards, ElMaestro
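
The point that the type 1 error shrinks as the true ratio moves away from the limit can be illustrated with the noncentral t approximation in base R. The sketch assumes a fixed-sample 2×2 crossover with 40 subjects and a CV of 30% (hypothetical values), and `p_pass()` is a hypothetical helper, not a function of any package.

```r
# Probability of passing BE in a fixed 2x2 crossover with n subjects in total,
# by the noncentral t approximation, given the true ratio theta0 and true CV.
p_pass <- function(theta0, CV, n, alpha = 0.05, theta1 = 0.80, theta2 = 1.25) {
  se   <- sqrt(2 * log(CV^2 + 1) / n)  # standard error on the log scale
  df   <- n - 2
  tval <- qt(1 - alpha, df)
  d1   <- (log(theta0) - log(theta1)) / se  # noncentrality, lower limit
  d2   <- (log(theta0) - log(theta2)) / se  # noncentrality, upper limit
  p    <- suppressWarnings(pt(-tval, df, ncp = d2) - pt(tval, df, ncp = d1))
  max(p, 0)
}

theta0 <- c(0.72, 0.77, 0.79, 0.80)  # true ratios at or below the lower BE limit
tie <- sapply(theta0, p_pass, CV = 0.30, n = 40)
round(setNames(tie, theta0), 4)
```

The chance of wrongly passing is largest with the true ratio exactly at the limit (close to the nominal 0.05 here) and drops quickly below it, which is why the simulations at 0.80 or 1.25 give the maximum type 1 error.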
d_labes Hero Berlin, Germany, 2017-03-09 12:55 @ Silva Posting: # 17147

Dear Silva,

in addition to what our great Maestro said:
You have to know beforehand whether the settings you use in a TSD, i.e. adjusted alpha, fixed GMR, target power, n1, inclusion of futility rules or others, are susceptible to an alpha inflation over a range of reasonable true CVs.
Potvin et al. have shown that by simulations for adjusted alpha = 0.0294, target power 0.8 and n1 12–60 for a range of true CVs of 10–100%, at least for the Type 1 (aka Method B) decision scheme. For the Type 2 (aka Method C) decision scheme some guys believe they have seen an alpha inflation in the numbers given.
Anders the Great has derived adjusted alpha values for other settings of the fixed GMR and target power, also preserving the maximum TIE <=0.05.
And that is the purpose for which the package Power2Stage was invented.

The observed GMR(s) and CV in your actual study don't play a role in that game. They are not known beforehand.
One exception: if you observe a CV above the 'validated' range it may be wise to do simulations with that CV assumed as TRUE.

Regards, Detlew
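
A check of this kind over several true CVs, including one beyond the validated 10–100%, might look like the sketch below. It assumes the Power2Stage package with `power.2stage()` as called elsewhere in this thread (later releases may name it `power.tsd()`), and the n1 of 24 and the CV grid are hypothetical.

```r
library(Power2Stage)  # assumption: package is installed

CVs <- c(0.10, 0.50, 1.00, 1.20)  # the last value lies beyond the validated range

# Empirical Type I Error of 'Method B' for each assumed true CV:
# studies are simulated with the true ratio at the upper BE limit (theta0 = 1.25).
TIE <- sapply(CVs, function(cv) {
  power.2stage(method = "B", alpha = c(0.0294, 0.0294), n1 = 24,
               GMR = 0.95, CV = cv, targetpower = 0.80,
               theta0 = 1.25, nsims = 1e6)$pBE
})
data.frame(CV = CVs, TIE = TIE)
```

With one million simulations per scenario this takes a while; the number of simulations can be lowered for a first look at the price of a noisier estimate.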
Silva Junior Portugal, 2017-03-09 17:01 @ d_labes Posting: # 17148

Dear ElMaestro and d_labes,

Many thanks for your replies and elucidations. I'll read Fuglsang's paper (2011) carefully. I've already studied the other two papers from the same author (2013 and 2014), as well as Helmut's nice review paper.

Best rgds
BEproff Senior Russia, 2017-02-21 10:31 @ Helmut Posting: # 17093

Hi Helmut,

Thank you for the clarification!