Benchmark code [Two-Stage / GS Designs]

posted by ElMaestro  – Denmark, 2017-08-22 22:29 (1355 d 11:59 ago) – Posting: # 17740
Views: 24,954

(edited by ElMaestro on 2017-08-23 00:31)

Hi Hötzi,

» Would you mind benchmarking your wonder-code (not with proc.time() since it keeps the overhead). If it’s, say >10times faster than mine (<1 ms), I’ll be interested. :-D

I do not have your code, I don't know how to locate it in Github or CRAN, so I can't readily benchmark yours.

Below is a good starting point.
The basis is as follows. In a more or less typical Potvin run, say n1=24 and CV=0.5 we will need to enter stage 2 almost 100% of the time. So we will need to convert a CV to a sample size roughly 1.000.000 times (look up the percentage in the paper). It will need to happen whether or not the code is vectorised.

b. The present way of doing that is to convert the CV, given the assumed GMR, target power, and the alpha levels, to a sample size 1.000.000 times first via Zhang step, and then finally via a "normal" up/down step.
c. The alternative is to invest a little time in creating smart arrays before we convert any CV's (given the assumed GMR, target power, and the alpha levels), and then afterwards do lookups into the sameple size array.
Remember that my background for this is: In the digital world, if we want to dig 1.000.000 holes we will buy a shovel one time and dig all the holes with it. We will not buy a million shovels and use a new one for every hole we wish to dig.

I have written a function Bench.H1 for the alternative way.
Please load your code into Bench.H2 and do a head-on benchmark with Bench.H1 against bench.H2. I could perhaps do it too if I only knew where to locate your code.
Footnote: The code is simplified, it looks up CV=0.5 every time. It could be made more natural by rchisq'ing the CV. Feel free.

If your code is completely vectorised then I imagine you would create a vector of one million values of 0.5 and let the hounds loose on that vector. Same difference for me.:-)

##The benchmark
Power.calc=function(Nps, GMR, CV, Is.St2=0)
  if (Is.St2==T) df=df-1
  ##cat("1-a=", 1-alpha, "df=",df,"\n")
  if (Is.St2==0) q=vecQT1[df]
    else         q=vecQT2[df]
  pw=pt( (nom1/den) -q, df) - pt( (nom2/den) +q, df)
  ##cat("Nom1=", nom1," Nom2=", nom2, "Den=", den,"q=", q, "\n")
  ##if ( pw=0.01
  ##if (pw<=0) pw=0.01

FindCritCV.Sec=function(Pwr.T, GMR, Nps, Is.St2=1, d=.00001, iniCV)
  while (abs(y2)>d)
   y1=Power.calc(Nps, GMR, CV1, Is.St2)-Pwr.T
   y2=Power.calc(Nps, GMR, CV2, Is.St2)-Pwr.T
   CV3= CV2-y2*(CV2-CV1)/(y2-y1)

  while (vecCritCVs[Nps]<=CV) Nps=Nps+1

 for (i in 1:2000)
    vecQT1[i]<<-qt(p=1-alpha1, df=i)
    vecQT2[i]<<-qt(p=1-alpha2, df=i)

 for (Nps in 6:length(vecCritCVs))
   if (Nps==6) iniCV=min(0.709*GMR-0.552, 0.709/GMR -0.552)
               else iniCV=vecCritCVs[Nps-1]
    vecCritCVs[Nps]<<-FindCritCV.Sec(0.8, GMR, Nps, 1, 1e-9, iniCV)
 for (i in 1:1000000)


 for (i in 1:1000000)
  SampleSize= ##Hötzi insert your call to the correct function here

##now go and benchmark Bench.H1 against Bench.H2

Late edit: I believe we can even do something like:
Nps= 2* length(vecCritCVs[vecCritCVs<=CV])
- which is faster still and does not explicitly involve a loop.

Pass or fail!

Complete thread:

 Admin contact
21,445 posts in 4,482 threads, 1,510 registered users;
online 14 (0 registered, 14 guests [including 4 identified bots]).
Forum time: Sunday 10:29 CEST (Europe/Vienna)

It is better to know some of the questions
than all of the answers.    James Thurber

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz