Ben ★ 2011-11-14 20:16 Posting: # 7672
Dear all,

in clinical development one often performs trials in order to estimate possible effects instead of testing whether there is a significant effect or not. One is therefore interested in the confidence interval itself and requires it to be no wider than a pre-specified width 2w. The question which sample size should be used boils down to solving w ≥ quantile·sqrt(Var(S)) iteratively for n, where Var(S) depends on n. As the focus is on the confidence interval, it is recommended to assure that with a certain probability 1−γ (the so-called coverage probability) the half-width of the CI is actually not greater than w; see Kupper and Hafner, "How Appropriate Are Popular Sample Size Formulas?" (The American Statistician, 1989, vol. 43, pp. 101–105). They consider the one-sample and the two-sample case. nQuery also has a way to calculate the desired sample size, but again only for a two-group design (module MTG) and a 2×2 design (module MOC; nQuery actually calls it a paired design). So I tried to implement the procedure, extending it a little bit to allow for more designs (as in the function CVfromCI from the R package PowerTOST).

require(PowerTOST)

I played around a little bit and some issues/questions came up.
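As a minimal sketch of the basic iteration (without a coverage guarantee): find the smallest n whose expected CI half-width t(df; 1−α/2)·σ·√(bk/n) does not exceed w. The design constant bk = 2 and df = n − 2 are assumptions following PowerTOST's conventions for a 2×2 crossover, not necessarily the original implementation.

```r
# Smallest n with t(df; 1 - alpha/2) * sigma * sqrt(bk/n) <= w,
# i.e. expected CI half-width at most w, without a coverage guarantee.
# bk = 2 and df = n - 2 are assumed (a 2x2 crossover in PowerTOST's style).
n_for_width <- function(w, sigma, alpha = 0.05, bk = 2) {
  half_width <- function(n) qt(1 - alpha / 2, n - 2) * sigma * sqrt(bk / n)
  n <- 4                                # smallest n with df = 2
  while (half_width(n) > w) n <- n + 2  # keep n a multiple of 2
  n
}
```

The search is over even n only, as a 2×2 crossover with balanced sequences requires.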
Best, Ben
d_labes ★★★ Berlin, Germany, 2011-11-16 10:40 @ Ben Posting: # 7675
Dear Ben,

I must confess that I haven't dealt with sample size planning to a given precision of the confidence interval up to now. Within this forum this type of sample size planning is a little bit off topic: sample size planning for BE studies is usually done via the power of the underlying two one-sided t-tests (TOST) for the bioequivalence decision. That's the reason I wrote the R package PowerTOST. Cool that you have discovered some hidden helper functions and tried to reuse them!

So in the end I can't say anything specific about your problems without delving deeper into the mathematical apparatus of sample size planning aimed at a given precision. For that I don't have enough spare time at the moment. Only some remarks and tips:
— Regards, Detlew
Ben ★ 2011-11-18 17:57 @ d_labes Posting: # 7682
Thanks for your reply! Sorry, I didn't know that this is off topic; I assumed it falls into the category "bioavailability". Yes, of course I am questioning my program, but on the other hand I tried to check it as well as possible. The article by Kupper and Hafner contains tables (unfortunately for the one-sample and the two-sample case only) which show the final sample size given an initial sample size. I checked my program against this table and it works out fine. I also checked my results via FARTSSIE (thanks for the link). That spreadsheet only considers the case without coverage probability; its values coincide with the "initial sample size" from my program (the first while loop), at least for the test cases I used. nQuery also has a module for calculating such a sample size (i.e. without coverage probability), but there nQuery uses z-quantiles (I again get the same results by changing the t-quantiles to z-quantiles). Some more rough results (incorporating the coverage probability)
❝ I'm not quite sure if the design constants are specific for the TOST procedure and make sense for your problem. Check the underlying formulas for each design you will implement with respect to these constants. Hope this helps.

Me neither, but the same argument applies to CVfromCI, doesn't it? CVfromCI isn't TOST-specific, is it? I have to learn more about the design constants, and I will check out the references you gave me. Thanks again for all the comments.

Best regards, Ben
Ben ★ 2011-11-21 20:13 @ Ben Posting: # 7694
Dear all,

I started from scratch. The only thing that's important is the requirement

P(S·√(bk/n)·t(df; 1−α/2) ≤ ω) ≥ 1−γ,

where df is the appropriate degrees of freedom of the considered design, S the estimate of the standard deviation and bk the design constant. We want the smallest integer n such that the above inequality holds. In the article they derive inequalities that are easier to handle than the probability statement. Since nowadays evaluating P is no problem (fast computers), we can go directly with the above relation. (The reason why I got different results before was that (i) one derived inequality was conservative and (ii) the initial sample size was based on the z-quantile and not on the t-quantile.) Using t²(df; 1−α/2) = F(1, df; 1−α), the above inequality can be equivalently stated as

P(df·S²/σ² ≤ (ω/σ)²·df·(n/bk)/F(1, df; 1−α)) ≥ 1−γ.

Then df·S²/σ² is chi-squared distributed with df degrees of freedom (is that always true for every considered design?) and we can set up the following code to get the desired sample size:
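The probability on the left can be evaluated directly with pchisq(); a minimal base-R sketch of the search might look as follows. The design constants bk, the degrees of freedom and the step sizes are assumptions in PowerTOST's style, not necessarily the original choices.

```r
# Smallest n with P(CI half-width <= w) >= 1 - gamma.
# Design constants below are assumptions following PowerTOST's conventions:
#   "2x2":      bk = 2, df = n - 2, n a multiple of 2
#   "parallel": bk = 4, df = n - 2, n a multiple of 2
#   "paired":   bk = 2, df = n - 1
ss_ci <- function(w, sigma, design = "2x2", alpha = 0.1, gamma = 0.05) {
  des <- switch(design,
                "2x2"      = list(bk = 2, df = function(n) n - 2, step = 2),
                "parallel" = list(bk = 4, df = function(n) n - 2, step = 2),
                "paired"   = list(bk = 2, df = function(n) n - 1, step = 1),
                stop("unknown design"))
  # P(chi^2_df <= (w/sigma)^2 * df * (n/bk) / F(1, df; 1-alpha))
  coverage <- function(n) {
    df <- des$df(n)
    pchisq((w / sigma)^2 * df * (n / des$bk) / qf(1 - alpha, 1, df), df)
  }
  n <- 2 * des$step                       # smallest n giving df >= 1
  while (coverage(n) < 1 - gamma) n <- n + des$step
  n
}
```

Replacing step = 2 by step = 1 for the 2×2 design reproduces the ±1 differences discussed later in the thread.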
require(PowerTOST)

Again, I will look into the design constant and its correct usage more closely, but at the moment I don't see a difference to CVfromCI(). For the parallel and the 2×2 crossover design the results are the same as those from nQuery (sometimes it differs, but only by ±1, probably due to rounding/higher precision).

Comments are highly appreciated.

Best regards, Ben
d_labes ★★★ Berlin, Germany, 2011-11-22 16:37 @ Ben Posting: # 7699
Dear Ben,

❝ ... at the moment I don't see a difference to CVfromCI.

Correct!

❝ For the parallel and the 2x2 crossover design the results are the same as those from nQuery (sometimes it differs, but only +-1, probably due to rounding/higher precision).

Congratulations! I strongly argue that the differences are more probably due to the different number of degrees of freedom, namely df=n-1 in the so-called 'paired' design. You (having access to nQuery) can test it with my code suggestions below. BTW: Jerry Hintze's PASS also has a module "Confidence Intervals for Paired Means with Tolerance Probability" with df=n-1. My checks with your code against the tables in the Julious book show total agreement for the parallel group design as well as for the 2x2 crossover.

Here some minor code suggestions:

ss_ci <- function(w, sigma, design="2x2", alpha=0.1, gamma=0.05, debug=FALSE)

— Regards, Detlew
Ben ★ 2011-11-23 22:12 @ d_labes Posting: # 7704
Thank you very much for your effort and help!

❝ ... I strongly argue that the differences are more probably due to the different number of degrees of freedom, namely df=n-1 in the so-called 'paired' design. You (having access to nQuery) can test it with my code suggestions below.

You're absolutely right! I didn't check very thoroughly, but so far it seems that exactly these df are being used (this indeed makes sense, as the CI is for the difference of "paired means").

❝ Here some minor code suggestions: ...

You put in some nice features (starting value of n and the step size), awesome. One suggestion/question from my side: should it be n <- step*ceiling(n/step) instead of n <- step*trunc(n/step)? Otherwise we get an error using e.g. w=0.3, sigma=0.2, design="2x2". Also, one could think of including n=NULL or something like that as an input parameter and then returning the precision if a sample size is given, that is,
if (is.null(n)) {

The values of w coincide with those of nQuery (using "paired").

Best regards
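Solving the coverage condition for ω gives ω = σ·√(qchisq(1−γ, df)·F(1, df; 1−α)·bk/(df·n)). A hypothetical helper along the lines of Ben's suggestion (assuming the 2×2 conventions bk = 2, df = n − 2) might compute it as:

```r
# Half-width w attainable with n subjects at coverage 1 - gamma, obtained by
# inverting P(chi^2_df <= (w/sigma)^2 * df * (n/bk) / F(1, df; 1-alpha)) = 1 - gamma.
# bk = 2 and df = n - 2 are assumed (2x2 crossover in PowerTOST's style).
ci_width <- function(n, sigma, alpha = 0.1, gamma = 0.05, bk = 2, df = n - 2) {
  sigma * sqrt(qchisq(1 - gamma, df) * qf(1 - alpha, 1, df) * bk / (df * n))
}
```

Plugging the returned w back into the coverage probability recovers exactly 1 − γ, which makes the inversion easy to check.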
Ben ★ 2011-11-25 18:28 @ Ben Posting: # 7726
❝ My checks with your code against the tables in the Julious book show total agreement for the parallel group design as well as for the 2x2 crossover.

I also checked against these tables, but I do not always get the same result as in the table. For example, using
w=0.5*0.2, sigma=0.2, design="2x2", alpha=0.05, gamma=0.05 (i.e. delta=0.5) I get 46, whereas Julious has 45. I do get the same results if the design is "paired", but Julious is obviously talking about a crossover (df = n-2)?!

Sorry, that's just because we increase n by step: 45 is the result if we increase by 1.

Edit: Added from a follow-up post; you may edit your post within 24 hours. [Helmut]
d_labes ★★★ Berlin, Germany, 2011-12-15 10:28 @ Ben Posting: # 7782
Dear Ben,

my above warning about using undocumented, internal functions now comes into effect. While releasing a new version of PowerTOST I got stuck with a new technical requirement for R packages from R version 2.14.0 on, the so-called NAMESPACE (whatever this really is doesn't fit in my small brain). The consequence of my limited understanding of the matter is that hidden functions of package PowerTOST are no longer that simple to call from outside. If I understood correctly, you can try the syntax

PowerTOST:::.design.no(design)

With this syntax it is not even necessary to load the package via library() or require(). Cough.

— Regards, Detlew
Ben ★ 2011-12-16 20:06 @ d_labes Posting: # 7785
Thanks for the hint, I'll keep this in mind!