Benchmark code [Two-Stage / GS Designs]
Hi Hötzi,
❝ Would you mind benchmarking your wonder-code (not with proc.time() since it keeps the overhead). If it’s, say, >10 times faster than mine (<1 ms), I’ll be interested.
I do not have your code and I don't know where to find it on GitHub or CRAN, so I can't readily benchmark it. Below is a good starting point.
The basis is as follows:
a. In a more or less typical Potvin run, say n1=24 and CV=0.5, we will need to enter stage 2 almost 100% of the time (look up the percentage in the paper). So we will need to convert a CV to a sample size roughly 1,000,000 times, whether or not the code is vectorised.
b. The present way of doing that is to convert each CV, given the assumed GMR, target power, and the alpha levels, to a sample size 1,000,000 times: first via a Zhang step, then via a "normal" up/down step.
c. The alternative is to invest a little time up front in creating smart arrays (given the assumed GMR, target power, and the alpha levels) before we convert any CVs, and afterwards do lookups into the sample size array.
Remember that my background for this is: in the digital world, if we want to dig 1,000,000 holes we buy one shovel and dig all the holes with it. We do not buy a million shovels and use a new one for every hole we wish to dig.
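The shovel idea in miniature (a hedged sketch with illustrative names, not taken from the code below): cache the critical t-values once with a single vectorised qt() call, then index the cache instead of calling qt() afresh every time.

```r
alpha <- 0.0294
## buy the shovel once: one vectorised qt() call over all plausible df
qt.cache <- qt(1 - alpha, df = 1:2000)

## dig a hole: an O(1) cache lookup instead of a fresh qt() call
q.slow <- qt(1 - alpha, df = 46)   # recomputed on every use
q.fast <- qt.cache[46]             # read from the cache
stopifnot(isTRUE(all.equal(q.slow, q.fast)))
```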
I have written a function Bench.H1 for the alternative way.
Please load your code into Bench.H2 and do a head-on benchmark of Bench.H1 against Bench.H2. I could perhaps do it myself if I only knew where to locate your code.
Footnote: the code is simplified; it looks up CV=0.5 every time. It could be made more natural by rchisq'ing the CV. Feel free.
If your code is completely vectorised, then I imagine you would create a vector of one million values of 0.5 and let the hounds loose on that vector. Same difference for me.
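For the fully vectorised case, a hedged sketch of what the lookup could look like: R's built-in findInterval() does a binary search per element, so a single call handles the whole vector of CVs. critCVs here is a toy stand-in for the sorted vector of critical CVs that Bench.H1 builds below.

```r
critCVs <- c(0.10, 0.20, 0.30, 0.40, 0.55)  # toy stand-in, sorted ascending
CVs     <- rep(0.5, 10)                     # the simulated CVs, as one vector
## index of the first critical CV strictly above each simulated CV,
## matching the semantics of the GetNps() scan below
Nps <- findInterval(CVs, critCVs) + 1
stopifnot(all(Nps == 5))
```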

##The benchmark
Power.calc=function(Nps, GMR, CV, Is.St2=0)
{
  ## power of the TOST procedure for Nps subjects per sequence (2x2 crossover)
  s=sqrt(log(1+CV*CV))             ## CV -> within-subject SD on the log scale
  nom1=log(1.25/GMR)
  nom2=-log(1.25*GMR)
  den=s*sqrt(2/(2*Nps))
  df=Nps*2-2
  if (Is.St2) df=df-1              ## stage 2: one df lost for the stage term
  ## critical t-values are precomputed in vecQT1/vecQT2 (see Bench.H1)
  if (Is.St2) q=vecQT2[df] else q=vecQT1[df]
  pw=pt( (nom1/den) -q, df) - pt( (nom2/den) +q, df)
  return(pw)
}
FindCritCV.Sec=function(Pwr.T, GMR, Nps, Is.St2=1, d=.00001, iniCV)
{
  ## secant search for the CV at which power equals the target Pwr.T
  CV1=iniCV
  CV2=CV1*1.05
  y2=1
  while (abs(y2)>d)
  {
    y1=Power.calc(Nps, GMR, CV1, Is.St2)-Pwr.T
    y2=Power.calc(Nps, GMR, CV2, Is.St2)-Pwr.T
    CV3=CV2-y2*(CV2-CV1)/(y2-y1)   ## secant step
    CV1=CV2
    CV2=CV3
  }
  return(0.5*(CV1+CV2))
}
GetNps=function(CV)
{
  ## linear scan: first Nps (subjects per sequence) whose critical CV
  ## exceeds the given CV
  Nps=1
  while (vecCritCVs[Nps]<=CV) Nps=Nps+1
  return(Nps)
}
Bench.H1=function()
{
  GMR=0.95
  alpha1=0.0294
  alpha2=0.0294
  ## precompute the critical t-values once (one vectorised qt() call each)
  vecQT1<<-qt(p=1-alpha1, df=1:2000)
  vecQT2<<-qt(p=1-alpha2, df=1:2000)
  ## precompute the critical CV for every Nps, reusing the previous
  ## solution as the starting value for the next secant search
  vecCritCVs<<-rep(0,500)
  for (Nps in 6:length(vecCritCVs))
  {
    if (Nps==6) iniCV=min(0.709*GMR-0.552, 0.709/GMR-0.552)
    else iniCV=vecCritCVs[Nps-1]
    vecCritCVs[Nps]<<-FindCritCV.Sec(0.8, GMR, Nps, 1, 1e-9, iniCV)
  }
  ## now the 1,000,000 "holes": each one is just an array lookup
  for (i in 1:1000000)
  {
    SampleSize=GetNps(0.5)
  }
  return("foo")
}
Bench.H1()
Bench.H2=function()
{
  for (i in 1:1000000)
  {
    SampleSize=NA ## Hötzi: replace NA with your call to the correct function
  }
  return("bar")
}
##now go and benchmark Bench.H1 against Bench.H2
Late edit: I believe we can even do something like:
Nps= 2* length(vecCritCVs[vecCritCVs<=CV])
- which is faster still and does not explicitly involve a loop.
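A quick sanity check of the loop-free form on a toy vector (illustrative values, not the real critical CVs): the counting expression agrees with findInterval(), which does a binary search instead of touching every element and therefore scales better as vecCritCVs grows.

```r
vecCritCVs <- c(0.10, 0.20, 0.30, 0.40, 0.55)  # toy stand-in, sorted ascending
CV <- 0.5
Nps.count <- 2*length(vecCritCVs[vecCritCVs<=CV])  # the late-edit form
Nps.fast  <- 2*findInterval(CV, vecCritCVs)        # binary-search equivalent
stopifnot(Nps.count == 8, Nps.fast == 8)
```

For the counting form, sum(vecCritCVs<=CV) avoids materialising the subset vector and is slightly cheaper still.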
—
Pass or fail!
ElMaestro