Bioequivalence and Bioavailability Forum • Exact power and sample size, Part I

d_labes
★★★

Berlin, Germany,
2009-11-23 15:32
(5687 d 08:11 ago)

Posting: # 4374
Views: 14,705

Exact power and sample size, Part I [🇷 for BE/BA]

Dear All!

In the era of 'Appelssstrudel' and Owen's Q-function as implemented in R (see here) R-users don't have the necessity to use approximations but should calculate always exact power and sample size!

Since other designs (parallel group, cross-over with more than 2 formulations, replicate designs) used occasionally in BE studies also rely on the two-one-sided t-tests procedure (TOST) here comes my R-super-code:


# -- The functions of normal-, t-distributions and integrate() --------------

require(stats)

# ---------------------------------------------------------------------------

# Owen's Q-function 

OwensQ <- function (nu, t, delta, a, b)

{

   # Observation: for 'really' large df (nu>2000) and large delta/b the  

   # density function is zero over nearly all its range. Q is than  

   # returned sometimes as =0! See documentation (?integrate) for that.

   # example: OwensQ(3000,1.64,-10,0,300) = 1

   #          OwensQ(3000,1.64,-10,0,350) = 5.614476e-12 !

   # Idea: adapt upper and/or lower to account for that

   low <- a; up<- b

   if (nu>=1000){

     # try to shorten the range over that is integrated

     x <- seq(a, b, by=(b-a)/400)

     dens <- .Q.integrand(x, nu, t, delta)

     r <- data.frame(x=x, dens=dens)

     r <- r[r$dens>0,]
     if (length(r)>0) {# if any >0

       up <- max(r$x)+(b-a)/600  # only the upper x+step is used

       rm(r,x,dens)

     }

   }

   Qintegral <- integrate(.Q.integrand, lower = low, upper = up, nu=nu, t=t, 

                delta = delta, subdivisions = 10000, 

                rel.tol = 1.e-10, abs.tol=1.e-10, 

                stop.on.error = TRUE, keep.xy = FALSE, aux = NULL)

   Q <- Qintegral[[1]]



   return(Q)  

}

# --------------------------------------------------------------------------

# Integrand of the definit integral in Owen's Q. Used in the call of integrate()

# Not useful alone, I think ? Leading . hides this function    

.Q.integrand <- function(x, nu, t, delta)

{ #version without for - loop

   lnQconst <- -((nu/2.0)-1.0)*log(2.0) - lgamma(nu/2.)

   dens <- pnorm( t*x/sqrt(nu) - delta, mean = 0, sd = 1, log = FALSE) * 

                 exp( (nu-1)*log(x) - 0.5*x^2 + lnQconst )



   return(dens)

}

# --------------------------------------------------------------------------

# 'raw' power function to be used in sampleN.TOST avoiding overhead

# bk is the so called design constant (see later) 

.power.TOST <- function(alpha=0.05, theta1, theta2, diffm, se, n, df, bk=2)

{

    tval   <- qt(1 - alpha, df, lower.tail = TRUE, log.p = FALSE)

    delta1 <- (diffm-theta1)/(se*sqrt(bk/n))

    delta2 <- (diffm-theta2)/(se*sqrt(bk/n))

    R      <- (delta1-delta2)*sqrt(df)/(2.*tval)

    

    # to avoid numerical errors in OwensQ implementation

    if (df>10000) { 

      # Joulious formula (57) or (67), normal approximation 

      p1 <- pnorm( (abs(delta1)-tval), lower.tail = TRUE)

      p2 <- pnorm( (abs(delta2)-tval), lower.tail = TRUE)

      return (p1 + p2 -1)

    }

    

    p1 <- OwensQ(df,  tval, delta1, 0, R)

    p2 <- OwensQ(df, -tval, delta2, 0, R)



    return( p2-p1 )

}

Continuation follows.

Use it on your own risk! The author does not take any responsibility for damage of your computer, your career or of the world :-D

—
Regards,

Detlew

d_labes
★★★

Berlin, Germany,
2009-11-23 15:38
(5687 d 08:04 ago)

@ d_labes
Posting: # 4375
Views: 12,537

Exact power and sample size, Part II

Post reply

Dear All!

Here comes part II of the story:


# a bunch of functions to get the design characteristics

known.designs<-function()

{

# Note: the df for the replicate designs are those without carry-over

# n is the total number in case of cross-over design, 

# n is number per group in case of parallel group design

# df2 = degrees of freedom for robust analysis (aka Senn's basic estimator)

# bk is the so-called design constant

  des <- ("no  design    df      df2    steps  bk

           0  parallel 2*(n-1) 2*(n-1)  1      2

           1  2x2      n-2     n-2      2      2

           2  3x3      2*n-3   n-3      3      2

           3  4x4      3*n-5   n-4      4      2

           4  2x2x3    2*n-3   n-2      2      1.5

           5  2x2x4    3*n-4   n-2      2      1

           6  2x4x4    3*n-4   n-4      4      1")

# eventually it would be better to have steps=6 in case of 3x3 (6 seq. design)

# the df2 have to be checked! Not used so far. TODO !

    

  des2 <- textConnection(des)

  designs <- read.table(des2, header=TRUE, sep="", strip.white=TRUE, as.is=TRUE)           

  close(des2)   # without this close() warnings are generated

  

  # names for nicer output of design

  designs$name[designs$no==0] <- "2 parallel groups"

  designs$name[designs$no %in% c(1,2,3)] <- 

      paste(designs$design[designs$no %in% c(1,2,3)],"crossover")

  designs$name[designs$no %in% c(4,5,6)] <- 

      paste(designs$design[designs$no %in% c(4,5,6)],"replicate crossover")



  return(designs)



}

#--- return no of design ---

# design: a character string describing the design

.design.no<-function(design)

{

  #take the first word if more then one

  desi <- unlist(strsplit(tolower(design)," "))[1]


  des <- known.designs()

  i   <- match(desi, des$design)

  if (!is.na(i)) return(des$no[i]) else return(NA)

}

#--- return design constant ---

.design.bk<-function(design.no)

{

  des <- known.designs()

  return(des$bk[des$no==design.no])

}

#--- return design degrees of freedom ---

# as expression

.design.df<-function(design.no, nvar="n")

{

  des <- known.designs()

  df  <- (des$df[des$no==design.no])

  #replace with calling variable name

  df  <- gsub('n',nvar,df)

  e   <- parse(text=df, srcfile=NULL)

  return(as.expression(e))

}

#--- return step for sample size search ---

.design.step<-function(design.no)

{

  des <- known.designs()

  return (des$steps[des$no==design.no])

}

#--- return step for sample size search ---

.design.name<-function(design.no)

{

  des <- known.designs()

  return (des$name[des$no==design.no])

}

Continuation follows.

Again: Use it on your own risk! The author does not take any responsibility for
damage of your computer, your career or of the world :-D

—
Regards,

Detlew

d_labes
★★★

Berlin, Germany,
2009-11-23 15:50
(5687 d 07:52 ago)

@ d_labes
Posting: # 4376
Views: 12,160

Exact power and sample size, Part III

Post reply

Dear All!

Here comes part III of the story. Puzzling all the helpers together we obtain:


# ----- helper functions for sampleN.TOST ---------------------------

# --------------------------------------------------------------------

# Sample size for a desired power, large sample approx.

# bk = design constant, see known.designs()

.sampleN0 <- function(alpha=0.05, targetpower=0.8, theta1, theta2, diffm, se, 

                     steps=2, bk=2)

{

    z1 <- qnorm(1-alpha)

    if (abs(diffm)>0.0001) z2 <- qnorm(targetpower) 

      else z2 <- qnorm(1-(1-targetpower)/2)

    n01<-(bk/2)*((z1+z2)*(se*sqrt(2)/(diffm-theta1)))^2;

    n02<-(bk/2)*((z1+z2)*(se*sqrt(2)/(diffm-theta2)))^2;

    n0 <- max(n01,n02)

    n0 <- ceiling(n0)

    #make an even multiple of step (=2 in case of 2x2 cross-over)

    n0 <- steps*trunc(n0/steps)

    if (n0<4) n0 <- 4   # minimum sample size

    

    return(n0)

}



# --------------------------------------------------------------------------

# Power of two-one-sided-t-tests using OwensQ or approx. using non-central t

# (this is a wrapper to .power.TOST(...) and .approx.power.TOST(...))

# In case of logscale=TRUE give ldiff, ltheata1 and ltheta2 as ratios

# f.i. ldiff=0.95, ltheta1=0.8, ltheta2=1.25

# In case of logscale=FALSE give ldiff, ltheata1 and ltheta2 as 

# difference to 1, f.i. ldiff=0.05 (5% difference) ltheata1=-0.2, 

# ltheta2=0.2 (20% equiv. margins)

# CV is always the coefficient of variation but as ratio, not % 



power.TOST <- function(alpha=0.05, logscale=TRUE, ltheta1=0.8, ltheta2, 

                       ldiff=0.95, CV, n, design="2x2", exact=TRUE)

{

    # design characteristics

    d.no <- .design.no(design)

    if (is.na(d.no)) stop("Err: unknown design!")

    

    dfe <- .design.df(d.no) # degrees of freedom as expression

    bk  <- .design.bk(d.no) # design const.

    

    if (logscale) {

        if (missing(ltheta2)) ltheta2 <- 1/ltheta1

        theta1 <- log(ltheta1)

        theta2 <- log(ltheta2)

        diffm  <- log(ldiff)

        se     <- sqrt(log(1.+CV^2))

    } else {

        if (missing(ltheta2)) ltheta2 <- -ltheta1

        theta1 <- ltheta1

        theta2 <- ltheta2

        diffm  <- ldiff

        se     <- CV

    }

    

    df <- eval(dfe)

    if ( !exact )

      pow <- .approx.power.TOST(alpha, theta1, theta2, diffm, se, n, df, bk)

    else

      pow <- .power.TOST(alpha, theta1, theta2, diffm, se, n, df, bk)

    

    return( pow )



}

Again: Use it on your own risk! The author does not take any responsibility for damage of your computer, your career or of the world :-D

—
Regards,

Detlew

d_labes
★★★

Berlin, Germany,
2009-11-23 15:57
(5687 d 07:45 ago)

@ d_labes
Posting: # 4378
Views: 12,280

Exact power and sample size, Part IV

Post reply

Dear All!

Ok, ok, I will quit with this "Eierlegende Wollmilchsau".


# --------------------------------------------------------------------------

# Sample size for a desired power: see known.designs()

# for covered experimental designs

sampleN.TOST <- function(alpha=0.05, targetpower=0.8, logscale=TRUE, 

                         ltheta1=0.8, ltheta2, ldiff=0.95, CV, 

                         design="2x2", exact=TRUE, print=TRUE, details=FALSE)

{

  #number of the design

  d.no <- .design.no(design)

  if (is.na(d.no)) stop("Err: design not known")

  # design charcteristics

  d.name <- .design.name(d.no)

  # get the df for the design as an unevaluated expression

  dfe    <- .design.df(d.no,"Ns")

  # get stepsize for search

  steps  <- .design.step(d.no)

  # get design constant

  bk     <- .design.bk(d.no)



  if (missing(CV)) stop("Err: CV must be given!")



  # print the configuration:

  if (print) {

    cat("\n++++++++ Equivalence test - TOST ++++++++\n")

    cat("         Sample size estimatio\n\n")

    cat("-----------------------------------------\n")

    cat("Study design: ",d.name,"\n")

    if (details) { 

      cat("Design characteristics:\n")

      cat("df = ",gsub("Ns","n",gsub(" ","",as.character(dfe))),

          ", design const = ",bk,", step = ",steps,"\n\n", sep="")

    }     

  }

  # handle the log transformation

  if (logscale) {

    if (missing(ltheta2)) ltheta2=1/ltheta1

    if ( (ldiff<=ltheta1) | (ldiff>=ltheta2) ) {

      cat("Err: ratio not between margins!\n")

      return(NaN)

    }

    theta1 <- log(ltheta1)

    theta2 <- log(ltheta2)

    diffm  <- log(ldiff)

    se     <- sqrt(log(1.+CV^2))

    if (print) cat("log-transformed data (multiplicative model)\n\n")

  } else {

    if (missing(ltheta2)) ltheta2=-ltheta1

    if ( (ldiff<=ltheta1) | (ldiff>=ltheta2) ) {

      cat("Err: diff not between margins!\n")

      return(NaN)

    }

    theta1 <- ltheta1

    theta2 <- ltheta2

    diffm  <- ldiff

    se     <- CV

    if (print) cat("untransformed data (additive model)\n\n")

  }





  if (print) {

    cat("alpha = ",alpha,", target power = ", targetpower,"\n", sep="")

    cat("BE margins        =",ltheta1,"...", ltheta2,"\n")

    if (logscale)

      cat("Null (true) ratio = ",ldiff,",  CV = ",CV,"\n", sep="")

    else

      cat("Null (true) diff. = ",ldiff,",  CV = ",CV,"\n", sep="")

  }



  #start value from large sample approx. (hidden func.)

  Ns  <- .sampleN0(alpha, targetpower, theta1, theta2, diffm, se, steps, bk)

  df  <- eval(dfe)

  if (exact) pow <- .power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)

    else  pow <- .approx.power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)

  if (details) {

    cat("\nSample size search\n")

    if (d.no == 0) cat("(n is sample size per group)\n") 

    cat(" n     power\n")

    cat( Ns," ", formatC(pow, digits=6, format="f"),"\n")

  }

  # --- loop until power >= target power

  iter <- 0

  # iter>50 is emergency brake

  # this is eventually not necessary, depends on quality of sampleN0

  # in experimentation I have seen max of six steps

  if (pow<targetpower) {

    while (pow<targetpower) {

      Ns   <- Ns+steps

      iter <- iter+1

      df   <- eval(dfe)

      if (exact)

        pow  <- .power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)

      else

        pow  <- .approx.power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)

      if (details) cat( Ns," ", formatC(pow, digits=6, format="f"),"\n")

      if (iter>50) break 

    }

  } else {

    while (pow>targetpower) {

      if (Ns<=4) break     # min. sample size

      Ns   <- Ns-steps     # step down if start power is to high

      iter <- iter+1

      df   <- eval(dfe)

      if (exact)

        pow  <- .power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)

      else

        pow  <- .approx.power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)

      if (details) cat( Ns," ", formatC(pow, digits=6),"\n")

      if (iter>50) break  

    }

    if ( pow<targetpower ) {

      Ns   <- Ns+steps     #step up once to have n with pow>=target

      df   <- eval(dfe)

      if (exact)

        pow  <- .power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)

      else

        pow  <- .approx.power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)

    }

  }

  if (print && !details) {

    cat("\nSample size\n")

    if (d.no == 0) cat("(n is sample size per group)\n") 

    cat(" n     power\n")

    cat( Ns," ", formatC(pow, digits=6, format="f"),"\n")

  }

  if (details && print) {

    if (exact) cat("\nExact power calculation with\nOwen's Q functions.\n")

      else cat("\nApproximate power calculation with\nnon-central t-distribution.\n")

  }

  # always print if approx.

  if (print && !exact) 

    cat("\nApproximate power calculation with\nnon-central t-distribution.\n")

  if (print) cat("\n")



  res <- data.frame(design=design, alpha=alpha, CV=CV, theta0=ldiff, 

                    theta1=ltheta1, theta2=ltheta2, n=Ns, power=pow, 

                    targetpower=targetpower)

  names(res) <-c("Design","alpha","CV","theta0","theta1","theta2",

                 "Sample size", "Achieved power", "Target power")



  if (print) 

    return(invisible(res)) 

  else 

    return(res)



}

The function .approx.power.TOST() not shown here is the code using non-central t-approximation for the power.

The whole code is released under the copy left (aka GPL).
Eventually there is someone out there to tying together all this snippets into a package, for christmas :cool:

—
Regards,

Detlew

Helmut
★★★

Vienna, Austria,
2009-11-23 16:13
(5687 d 07:30 ago)

@ d_labes
Posting: # 4379
Views: 12,687

Backslashes in R-code

Post reply

Dear D. Labes,

wow, what a monster! In the meantime I wouldn't call you an [image]

novice any more.
One note: The forum's scripts remove the backslash '\\' (both in the preview and at posting) due to security reasons [PHP-function stripslashes()]. You used them in a couple of lines (mainly in the cat()-statements to force a line-feed in the output). Can you please edit your post and insert double backslashes at the right places - only the first one will be removed, and users may copy-and-paste the code afterwards. THX.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

d_labes ★★★ Berlin, Germany, 2009-11-23 16:19 (5687 d 07:23 ago) @ Helmut Posting: # 4380 Views: 12,182	Backslashes eater Post reply
	Dear Helmut, THX for the hint. Meanwhile I had noticed this. But for each click on Preview one backslash is eaten — Regards, Detlew

Helmut
★★★

Vienna, Austria,
2009-11-23 16:28
(5687 d 07:14 ago)

@ d_labes
Posting: # 4381
Views: 12,276

Backslashes eater

Post reply

Dear D. Labes!

❝ […] But for each click on Preview one backslash is eaten :crying:

Yes, I know, sorry. It’s unavoidable, unless a code-wrapper like GeShi is used. Although it’s implemented in the current version 2.1.1 of ‘my little forum’, I opted to stay within the 1.7.6-branch due to performance reasons. We are working on a solution in version 1.7.7.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes