Bioequivalence and Bioavailability Forum

BE-proff ● 2019-02-04 13:12 (1900 d 23:17 ago) Posting: # 19855 Views: 4,637	Formulas to calculate sample size [Power / Sample Size]
	Hi All, Recently I have started learning Python and I'd like to write a script that calculates sample size (the same as the sampleN.TOST function in R). Can anybody share a formula or set of formulas for the relevant calculations?

ElMaestro
★★★

Denmark,
2019-02-04 14:27
(1900 d 22:01 ago)

@ BE-proff
Posting: # 19857
Views: 3,980

Formulas to calculate sample size

Hi BE-proff,

❝ Can anybody share a formula or set of formulas for the relevant calculations? :confused:

This is a great challenge you awarded yourself. :-D

Sample size calculation is not a set of deterministic equations (yet?!?), but usually involves integrals.

For a start try and look at Potvin's paper.
Here you'll find a very decent approximation which is based on the central t distribution. This provides the "exact" solution when the expected match is 100% and it is a very decent* approximation in most other relevant cases whenever the desired power is above the magical 80%. Perhaps you can start there and then make it more complicated and exact once you get into it?

Here's how I would approach it (and I am assuming you are not relying on external libraries**):
1. Make a function for the density of central t at any level of df.
2. Make a function that can integrate it, from minus infinity to some arbitrary value x.
3. Make a function that can find the critical value of the central t for an arbitrary p-level between 0 and 1. This one may be a little tricky if you write it from scratch but you can find inspiration by googling Russell Lenth's AS 243 Al Gore Rhythm. I imagine you can also find some Python code on the www but I am not certain about it.
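The three steps above can be sketched with the standard library only. This is a minimal sketch, not a tuned implementation. The names t_density, t_cdf and t_quantile are made up for illustration, math.gamma stands in for a hand-written gamma function, and the quantile is found by plain bisection rather than by a published algorithm.

```python
import math

def t_density(x, df):
    """Step 1: density of the central t distribution (df > 0)."""
    norm = math.gamma((df + 1) / 2.0) / (math.sqrt(df * math.pi) * math.gamma(df / 2.0))
    return norm * (1.0 + x * x / df) ** (-(df + 1) / 2.0)

def t_cdf(t, df, panels=200):
    """Step 2: P(T <= t) by composite Simpson on [0, |t|] plus symmetry."""
    if t == 0.0:
        return 0.5
    h = abs(t) / (2 * panels)              # 2*panels subintervals of width h
    total = t_density(0.0, df) + t_density(abs(t), df)
    for i in range(1, 2 * panels):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3.0                 # integral of the density from 0 to |t|
    return 0.5 + area if t > 0 else 0.5 - area

def t_quantile(p, df, tol=1e-10):
    """Step 3: critical value by bisection on the CDF (moderate p, 0 < p < 1)."""
    lo, hi = -1.0, 1.0
    while t_cdf(lo, df) > p:               # widen the bracket downwards
        lo *= 2.0
    while t_cdf(hi, df) < p:               # widen the bracket upwards
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(t_cdf(0.4, 5))         # compare with R: pt(0.4, df=5)
print(t_quantile(0.95, 11))  # compare with R: qt(0.95, df=11)
```

The fixed number of Simpson panels is fine for the moderate t values that occur in power calculations. For extreme p-levels the bracket grows large and the panel count would have to grow with it.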

With those three at hand you can reproduce power.TOST and sampleN.TOST values using the central t approximation by insertion into the equation from Potvin. You may be able to find a way to import and use gsl or similar libraries in Python, then it is simply a matter of calling some functions. Can't help with the implementation of it as I don't do Python.
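To make the insertion step concrete, here is a rough sketch of power and sample size based on the shifted central t approximation, which as far as I understand is the form used in Potvin's paper (PowerTOST offers it as method="shifted"). Everything design-related is an assumption of the sketch: a 2x2 crossover with n total subjects in balanced sequences, df = n - 2, a standard error of sqrt(2*MSE/n) with MSE = ln(CV^2 + 1), limits 0.80 to 1.25, alpha 0.05 and an expected ratio of 0.95. The function names are hypothetical and the results should be checked against power.TOST before being relied on.

```python
import math

def t_density(x, df):
    # Central t density via log-gamma, so large df does not overflow.
    log_norm = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2.0 * math.log1p(x * x / df))

def t_cdf(t, df, panels=200):
    # P(T <= t): composite Simpson on [0, |t|], then symmetry around zero.
    if t == 0.0:
        return 0.5
    h = abs(t) / (2 * panels)
    total = t_density(0.0, df) + t_density(abs(t), df)
    for i in range(1, 2 * panels):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3.0
    return 0.5 + area if t > 0 else 0.5 - area

def t_quantile(p, df, tol=1e-9):
    # Upper critical value by bisection; intended for 0.5 < p < 1 only.
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def power_tost_shifted(cv, n, theta0=0.95, theta1=0.80, theta2=1.25, alpha=0.05):
    # Approximate TOST power (assumed: 2x2 crossover, n = total subjects).
    df = n - 2
    mse = math.log(cv * cv + 1.0)          # within-subject variance, log scale
    sem = math.sqrt(2.0 * mse / n)         # SE of the T-R difference
    tcrit = t_quantile(1.0 - alpha, df)
    d_lower = (math.log(theta0) - math.log(theta1)) / sem
    d_upper = (math.log(theta0) - math.log(theta2)) / sem
    power = t_cdf(-tcrit - d_upper, df) - t_cdf(tcrit - d_lower, df)
    return max(power, 0.0)                 # the approximation can go negative

def sample_size_tost(cv, target=0.80, max_n=1000, **kwargs):
    # Smallest even n reaching the target power; gives up at max_n.
    for n in range(4, max_n + 1, 2):
        if power_tost_shifted(cv, n, **kwargs) >= target:
            return n
    raise ValueError("target power not reached up to max_n")

n = sample_size_tost(0.30)
print(n, power_tost_shifted(0.30, n))
```

The search simply steps through even n from 4 upwards, which is slow but hard to get wrong. Starting from a normal-approximation guess would be the obvious speed-up.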

Next, if you are hardcore you can look into noncentral t and Owen's Q. If you can write the framework above then you have all the skill at hand to also do this last step.
This will give you the opportunity to get anything power.TOST does.

Happy coding. Interested to hear of your further achievements, so please keep me posted.

* I am sure the BE police will spank me for saying so but I am willing to defend my words. :-D

** I am also willing to at least read posts asking why someone would want to reinvent the wheel. :-D

—
Pass or fail!
ElMaestro

BE-proff ● 2019-02-04 14:43 (1900 d 21:46 ago) @ ElMaestro Posting: # 19858 Views: 4,003	Formulas to calculate sample size
	Hi ElMaestro, Thank you for the clarification. It looks really challenging... So I can't promise you a quick response; please wait for some time.

ElMaestro
★★★

Denmark,
2019-02-04 18:17
(1900 d 18:11 ago)

@ BE-proff
Posting: # 19864
Views: 4,303

Hey, python is not so difficult :p

Hi BE-proff,

I just downloaded python and played around with it.
Syntactically it is not so hard, I think.

Below are some functions that will get you started. They execute just fine as a script on my computer (Win10). The functions use Simpson integrals and various constants that you can play around with to achieve the combination of accuracy and speed that suits you. Note that I only made steps 1-3, so I will leave it to you to put these into Potvin's equation.
They are in no way optimized so there's plenty of work to do still :-D


import math

def GammaB(z):
    ## gamma function by Simpson integration of x^(z-1)*exp(-x) from 0 upwards
    ## note: pow(0.0, z-1) raises an error for z<1, so df=1 will not work here
    x=0.0
    dx=0.04
    integral=0.0
    di=10   ## dummy start value so the loop is entered
    while ((di>0.001) or (x<8*z)):
        y =pow(x, z-1)*math.exp(-x)
        y1=pow(x+dx, z-1)*math.exp(-(x+dx))
        y2=pow(x+dx+dx, z-1)*math.exp(-(x+dx+dx))
        di=(dx/3)*(y+4*y1+y2)
        integral=integral+di
        x=x+dx+dx
    return(integral)

def DensityB(x, df):
    ## density of the central t distribution at x
    a=GammaB((df+1)/2.0)
    b=math.sqrt(df*math.pi)*GammaB(df/2.0)
    c=math.pow(1.0+x*x/df, -((df+1)/2.0))
    return(a*c/b)

def probtcum(df, t):
    ## note: for t>0 only, you can easily fix it for negative t
    ## starts at 0.5 (the area left of zero) and adds 100 Simpson slices up to t
    x=0.0
    integral=0.5
    dx=0.5*t/100
    i=0
    while (i<100):
        y=DensityB(x, df)
        y1=DensityB(x+dx, df)
        y2=DensityB(x+dx+dx, df)
        di=(dx/3)*(y+4*y1+y2)
        integral=integral+di
        x=x+dx+dx
        i+=1
    return(integral)

def critvalt(df, p):
    ## fix it yourself for p lower than 0.5
    x=0.0
    integral=0.5
    dx=0.004
    while (integral<p):
        y=DensityB(x, df)
        y1=DensityB(x+dx, df)
        y2=DensityB(x+dx+dx, df)
        di=(dx/3)*(y+4*y1+y2)
        integral=integral+di
        x=x+dx+dx
    ## aha!! now the solution is between x and x-2dx
    ## so we can just interpolate linearly
    a=di/(dx+dx)
    b=integral-a*x
    soln=(p-b)/a
    return(soln)

## in R, pt(df=5, 0.4) is 0.6471634
p=probtcum(5, 0.4)
print("probt cumul at df=5 for x=0.4", p, "should be", 0.6471634)

## in R, qt(df=11, 0.95) is 1.795885
q=critvalt(11, 0.95)
print("critt at df=11 and p=0.95=", q, "should be", 1.795885)

On my machine I get:


>>>
RESTART: [blahblah]
probt cumul at df=5 for x=0.4 0.6471629404086429 should be 0.6471634
critt at df=11 and p=0.95= 1.7958969293454883 should be 1.795885
>>>
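The two "fix it yourself" notes in the code (negative t, and p lower than 0.5) both follow from the symmetry of the central t around zero. A small sketch of the idea follows. The wrapper names are made up, and the closed-form functions for df=1 are only stand-ins so that the block runs on its own; in practice you would pass in probtcum and critvalt instead.

```python
import math

def cdf_any_t(cdf_pos, df, t):
    """Extend a CDF routine that only handles t >= 0 to any real t."""
    if t >= 0:
        return cdf_pos(df, t)
    return 1.0 - cdf_pos(df, -t)           # P(T <= -t) = 1 - P(T <= t)

def quantile_any_p(quantile_upper, df, p):
    """Extend a quantile routine that only handles p > 0.5 to 0 < p < 1."""
    if p == 0.5:
        return 0.0                         # the median; avoids an empty search loop
    if p > 0.5:
        return quantile_upper(df, p)
    return -quantile_upper(df, 1.0 - p)    # q(p) = -q(1 - p)

# Stand-ins with the same (df, value) argument order as probtcum/critvalt.
# For df=1 the t distribution has a closed form, which keeps the demo short.
def cauchy_cdf_pos(df, t):
    return 0.5 + math.atan(t) / math.pi

def cauchy_quantile_upper(df, p):
    return math.tan(math.pi * (p - 0.5))

print(cdf_any_t(cauchy_cdf_pos, 1, -1.0))               # close to 0.25
print(quantile_any_p(cauchy_quantile_upper, 1, 0.25))   # close to -1
```

The explicit p == 0.5 branch matters with a search-type routine like critvalt, because at p=0.5 the while loop never runs and the interpolation has nothing to work with.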


ElMaestro
★★★

Denmark,
2019-02-04 23:31
(1900 d 12:58 ago)

@ BE-proff
Posting: # 19865
Views: 3,848

experiments with a function optimization

Hey,

learning python is great fun :ok:

The code below executes almost instantaneously on my machine.
It involves a simple rewrite of the density function using the very handy shortcut noted on Wikipedia.


import math

def DensityC(x, df):
    ## density of the central t for integer df, using products instead of gamma functions
    if ((df % 2) ==0):
        ## even df: (df-1)(df-3)...5*3 / (2*sqrt(df)*(df-2)(df-4)...4*2)
        q=df-1
        Den=2*math.sqrt(df)
        Num=1
        while (q>=3):
            Num = Num*q
            Den = Den*(q-1)
            q=q-2
        x=(Num/Den)*math.pow(1.0+x*x/df, -((df+1)/2.0))
    if ((df % 2) ==1):
        ## odd df: (df-1)(df-3)...4*2 / (pi*sqrt(df)*(df-2)(df-4)...5*3)
        q=df-1
        Den=math.pi*math.sqrt(df)
        Num=1
        while (q>=1):
            Num = Num*q
            Den = Den*(q-1)
            q=q-2
        x=(Num/Den)*math.pow(1.0+x*x/df, -((df+1)/2.0))
    return(x)

def probtcum(df, t):
    ## note: for t>0 only, you can easily fix it for negative t
    x=0.0
    integral=0.5
    dx=0.5*t/100
    i=0
    while (i<100):
        y=DensityC(x, df)
        y1=DensityC(x+dx, df)
        y2=DensityC(x+dx+dx, df)
        di=(dx/3)*(y+4*y1+y2)
        integral=integral+di
        x=x+dx+dx
        i+=1
    return(integral)

def critvalt(df, p):
    ## fix it yourself for p lower than 0.5
    x=0.0
    integral=0.5
    dx=0.0004
    while (integral<p):
        y=DensityC(x, df)
        y1=DensityC(x+dx, df)
        y2=DensityC(x+dx+dx, df)
        di=(dx/3)*(y+4*y1+y2)
        integral=integral+di
        x=x+dx+dx
    ## aha!! now the solution is between x and x-2dx
    ## so we can just interpolate linearly
    a=di/(dx+dx)
    b=integral-a*x
    soln=(p-b)/a
    return(soln)

## in R, pt(df=5, 0.4) is 0.6471634
p=probtcum(5, 0.4)
print("probt cumul at df=5 for x=0.4: ", p, "should be", 0.6471634)

## in R, qt(df=11, 0.95) is 1.795885
q=critvalt(11, 0.95)
print("critt at df=11 and p=0.95:     ", q, "should be", 1.795885)

Now don't feed it non-integer df's without extending the density function appropriately.
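One possible way to extend the density to non-integer df is to go back to the gamma-function form and evaluate it on the log scale with math.lgamma from the standard library. This is only a sketch of that option; the name density_general is made up, and it gives up the speed of the integer shortcut in exchange for accepting any df > 0.

```python
import math

def density_general(x, df):
    """Central t density for any real df > 0, evaluated via log-gamma."""
    log_norm = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2.0 * math.log1p(x * x / df))

# Closed forms for two integer cases, used as a sanity check at x = 0.4.
closed_df1 = 1.0 / (math.pi * (1.0 + 0.4 ** 2))                       # df = 1
closed_df2 = (1.0 + 0.4 ** 2 / 2.0) ** -1.5 / (2.0 * math.sqrt(2.0))  # df = 2

print(density_general(0.4, 1), closed_df1)
print(density_general(0.4, 2), closed_df2)
print(density_general(0.4, 5.5))    # non-integer df
print(density_general(0.4, 5000))   # large df, no overflow on the log scale
```

Working with logs also sidesteps the overflow that a long running product (or a plain gamma function) runs into once df reaches a few hundred.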


BE-proff ● 2019-02-05 08:19 (1900 d 04:10 ago) @ ElMaestro Posting: # 19866 Views: 3,897	experiments with a function optimization
	Hi ElMaestro, Star is shocked. Since my math skills are not very strong, I will hardly make out the idea. But nevertheless I have managed to implement the CVfromCI function in pyto
