d_labes
★★★

Berlin, Germany,
2009-11-23 15:32
(5625 d 22:40 ago)

Posting: # 4374
Views: 14,151
 

 Exact power and sample size, Part I [🇷 for BE/BA]

Dear All!

In the era of 'Appelssstrudel' and Owen's Q-function as implemented in R (see here) R-users don't have the necessity to use approximations but should calculate always exact power and sample size!

Since other designs (parallel group, cross-over with more than 2 formulations, replicate designs) used occasionally in BE studies also rely on the two-one-sided t-tests procedure (TOST) here comes my R-super-code:

# -- The functions of normal-, t-distributions and integrate() --------------
require(stats)
# ---------------------------------------------------------------------------
# Owen's Q-function

OwensQ <- function (nu, t, delta, a, b)
{
   # Observation: for 'really' large df (nu>2000) and large delta/b the 
   # density function is zero over nearly all its range. Q is than 
   # returned sometimes as =0! See documentation (?integrate) for that.
   # example: OwensQ(3000,1.64,-10,0,300) = 1
   #          OwensQ(3000,1.64,-10,0,350) = 5.614476e-12 !
   # Idea: adapt upper and/or lower to account for that

   low <- a; up<- b
   if (nu>=1000){
     # try to shorten the range over that is integrated
     x <- seq(a, b, by=(b-a)/400)
     dens <- .Q.integrand(x, nu, t, delta)
     r <- data.frame(x=x, dens=dens)
     r <- r[r$dens>0,]      if (length(r)>0) {# if any >0
       up <- max(r$x)+(b-a)/600  # only the upper x+step is used
       rm(r,x,dens)
     }
   }
   Qintegral <- integrate(.Q.integrand, lower = low, upper = up, nu=nu, t=t,
                delta = delta, subdivisions = 10000,
                rel.tol = 1.e-10, abs.tol=1.e-10,
                stop.on.error = TRUE, keep.xy = FALSE, aux = NULL)
   Q <- Qintegral[[1]]

   return(Q) 
}
# --------------------------------------------------------------------------
# Integrand of the definit integral in Owen's Q. Used in the call of integrate()
# Not useful alone, I think ? Leading . hides this function
   
.Q.integrand <- function(x, nu, t, delta)
{ #version without for - loop
   lnQconst <- -((nu/2.0)-1.0)*log(2.0) - lgamma(nu/2.)
   dens <- pnorm( t*x/sqrt(nu) - delta, mean = 0, sd = 1, log = FALSE) *
                 exp( (nu-1)*log(x) - 0.5*x^2 + lnQconst )

   return(dens)
}
# --------------------------------------------------------------------------
# 'raw' power function to be used in sampleN.TOST avoiding overhead
# bk is the so called design constant (see later)

.power.TOST <- function(alpha=0.05, theta1, theta2, diffm, se, n, df, bk=2)
{
    tval   <- qt(1 - alpha, df, lower.tail = TRUE, log.p = FALSE)
    delta1 <- (diffm-theta1)/(se*sqrt(bk/n))
    delta2 <- (diffm-theta2)/(se*sqrt(bk/n))
    R      <- (delta1-delta2)*sqrt(df)/(2.*tval)
   
    # to avoid numerical errors in OwensQ implementation
    if (df>10000) {
      # Joulious formula (57) or (67), normal approximation
      p1 <- pnorm( (abs(delta1)-tval), lower.tail = TRUE)
      p2 <- pnorm( (abs(delta2)-tval), lower.tail = TRUE)
      return (p1 + p2 -1)
    }
   
    p1 <- OwensQ(df,  tval, delta1, 0, R)
    p2 <- OwensQ(df, -tval, delta2, 0, R)

    return( p2-p1 )
}

Continuation follows.

Use it on your own risk! The author does not take any responsibility for damage of your computer, your career or of the world :-D .

Regards,

Detlew
d_labes
★★★

Berlin, Germany,
2009-11-23 15:38
(5625 d 22:33 ago)

@ d_labes
Posting: # 4375
Views: 12,075
 

 Exact power and sample size, Part II

Dear All!

Here comes part II of the story:

# a bunch of functions to get the design characteristics
known.designs<-function()
{
# Note: the df for the replicate designs are those without carry-over
# n is the total number in case of cross-over design,
# n is number per group in case of parallel group design
# df2 = degrees of freedom for robust analysis (aka Senn's basic estimator)
# bk is the so-called design constant

  des <- ("no  design    df      df2    steps  bk
           0  parallel 2*(n-1) 2*(n-1)  1      2
           1  2x2      n-2     n-2      2      2
           2  3x3      2*n-3   n-3      3      2
           3  4x4      3*n-5   n-4      4      2
           4  2x2x3    2*n-3   n-2      2      1.5
           5  2x2x4    3*n-4   n-2      2      1
           6  2x4x4    3*n-4   n-4      4      1")
# eventually it would be better to have steps=6 in case of 3x3 (6 seq. design)
# the df2 have to be checked! Not used so far. TODO !

   
  des2 <- textConnection(des)
  designs <- read.table(des2, header=TRUE, sep="", strip.white=TRUE, as.is=TRUE)           
  close(des2)   # without this close() warnings are generated
 
  # names for nicer output of design
  designs$name[designs$no==0] <- "2 parallel groups"
  designs$name[designs$no %in% c(1,2,3)] <-
      paste(designs$design[designs$no %in% c(1,2,3)],"crossover")
  designs$name[designs$no %in% c(4,5,6)] <-
      paste(designs$design[designs$no %in% c(4,5,6)],"replicate crossover")

  return(designs)

}
#--- return no of design ---
# design: a character string describing the design

.design.no<-function(design)
{
  #take the first word if more then one
  desi <- unlist(strsplit(tolower(design)," "))[1]
  des <- known.designs()
  i   <- match(desi, des$design)
  if (!is.na(i)) return(des$no[i]) else return(NA)
}
#--- return design constant ---
.design.bk<-function(design.no)
{
  des <- known.designs()
  return(des$bk[des$no==design.no])
}
#--- return design degrees of freedom ---
# as expression

.design.df<-function(design.no, nvar="n")
{
  des <- known.designs()
  df  <- (des$df[des$no==design.no])
  #replace with calling variable name
  df  <- gsub('n',nvar,df)
  e   <- parse(text=df, srcfile=NULL)
  return(as.expression(e))
}
#--- return step for sample size search ---
.design.step<-function(design.no)
{
  des <- known.designs()
  return (des$steps[des$no==design.no])
}
#--- return step for sample size search ---
.design.name<-function(design.no)
{
  des <- known.designs()
  return (des$name[des$no==design.no])
}

Continuation follows.

Again: Use it on your own risk! The author does not take any responsibility for
damage of your computer, your career or of the world :-D .

Regards,

Detlew
d_labes
★★★

Berlin, Germany,
2009-11-23 15:50
(5625 d 22:21 ago)

@ d_labes
Posting: # 4376
Views: 11,706
 

 Exact power and sample size, Part III

Dear All!

Here comes part III of the story. Puzzling all the helpers together we obtain:

# ----- helper functions for sampleN.TOST ---------------------------
# --------------------------------------------------------------------
# Sample size for a desired power, large sample approx.
# bk = design constant, see known.designs()

.sampleN0 <- function(alpha=0.05, targetpower=0.8, theta1, theta2, diffm, se,
                     steps=2, bk=2)
{
    z1 <- qnorm(1-alpha)
    if (abs(diffm)>0.0001) z2 <- qnorm(targetpower)
      else z2 <- qnorm(1-(1-targetpower)/2)
    n01<-(bk/2)*((z1+z2)*(se*sqrt(2)/(diffm-theta1)))^2;
    n02<-(bk/2)*((z1+z2)*(se*sqrt(2)/(diffm-theta2)))^2;
    n0 <- max(n01,n02)
    n0 <- ceiling(n0)
    #make an even multiple of step (=2 in case of 2x2 cross-over)
    n0 <- steps*trunc(n0/steps)
    if (n0<4) n0 <- 4   # minimum sample size
   
    return(n0)
}

# --------------------------------------------------------------------------
# Power of two-one-sided-t-tests using OwensQ or approx. using non-central t
# (this is a wrapper to .power.TOST(...) and .approx.power.TOST(...))
# In case of logscale=TRUE give ldiff, ltheata1 and ltheta2 as ratios
# f.i. ldiff=0.95, ltheta1=0.8, ltheta2=1.25
# In case of logscale=FALSE give ldiff, ltheata1 and ltheta2 as
# difference to 1, f.i. ldiff=0.05 (5% difference) ltheata1=-0.2,
# ltheta2=0.2 (20% equiv. margins)
# CV is always the coefficient of variation but as ratio, not %


power.TOST <- function(alpha=0.05, logscale=TRUE, ltheta1=0.8, ltheta2,
                       ldiff=0.95, CV, n, design="2x2", exact=TRUE)
{
    # design characteristics
    d.no <- .design.no(design)
    if (is.na(d.no)) stop("Err: unknown design!")
   
    dfe <- .design.df(d.no) # degrees of freedom as expression
    bk  <- .design.bk(d.no) # design const.
   
    if (logscale) {
        if (missing(ltheta2)) ltheta2 <- 1/ltheta1
        theta1 <- log(ltheta1)
        theta2 <- log(ltheta2)
        diffm  <- log(ldiff)
        se     <- sqrt(log(1.+CV^2))
    } else {
        if (missing(ltheta2)) ltheta2 <- -ltheta1
        theta1 <- ltheta1
        theta2 <- ltheta2
        diffm  <- ldiff
        se     <- CV
    }
   
    df <- eval(dfe)
    if ( !exact )
      pow <- .approx.power.TOST(alpha, theta1, theta2, diffm, se, n, df, bk)
    else
      pow <- .power.TOST(alpha, theta1, theta2, diffm, se, n, df, bk)
   
    return( pow )

}

Again: Use it on your own risk! The author does not take any responsibility for damage of your computer, your career or of the world :-D .

Regards,

Detlew
d_labes
★★★

Berlin, Germany,
2009-11-23 15:57
(5625 d 22:14 ago)

@ d_labes
Posting: # 4378
Views: 11,810
 

 Exact power and sample size, Part IV

Dear All!

Ok, ok, I will quit with this "Eierlegende Wollmilchsau".
# --------------------------------------------------------------------------
# Sample size for a desired power: see known.designs()
# for covered experimental designs
sampleN.TOST <- function(alpha=0.05, targetpower=0.8, logscale=TRUE,
                         ltheta1=0.8, ltheta2, ldiff=0.95, CV,
                         design="2x2", exact=TRUE, print=TRUE, details=FALSE)
{
  #number of the design
  d.no <- .design.no(design)
  if (is.na(d.no)) stop("Err: design not known")
  # design charcteristics
  d.name <- .design.name(d.no)
  # get the df for the design as an unevaluated expression
  dfe    <- .design.df(d.no,"Ns")
  # get stepsize for search
  steps  <- .design.step(d.no)
  # get design constant
  bk     <- .design.bk(d.no)

  if (missing(CV)) stop("Err: CV must be given!")

  # print the configuration:
  if (print) {
    cat("\n++++++++ Equivalence test - TOST ++++++++\n")
    cat("         Sample size estimatio\n\n")
    cat("-----------------------------------------\n")
    cat("Study design: ",d.name,"\n")
    if (details) {
      cat("Design characteristics:\n")
      cat("df = ",gsub("Ns","n",gsub(" ","",as.character(dfe))),
          ", design const = ",bk,", step = ",steps,"\n\n", sep="")
    }     
  }
  # handle the log transformation
  if (logscale) {
    if (missing(ltheta2)) ltheta2=1/ltheta1
    if ( (ldiff<=ltheta1) | (ldiff>=ltheta2) ) {
      cat("Err: ratio not between margins!\n")
      return(NaN)
    }
    theta1 <- log(ltheta1)
    theta2 <- log(ltheta2)
    diffm  <- log(ldiff)
    se     <- sqrt(log(1.+CV^2))
    if (print) cat("log-transformed data (multiplicative model)\n\n")
  } else {
    if (missing(ltheta2)) ltheta2=-ltheta1
    if ( (ldiff<=ltheta1) | (ldiff>=ltheta2) ) {
      cat("Err: diff not between margins!\n")
      return(NaN)
    }
    theta1 <- ltheta1
    theta2 <- ltheta2
    diffm  <- ldiff
    se     <- CV
    if (print) cat("untransformed data (additive model)\n\n")
  }


  if (print) {
    cat("alpha = ",alpha,", target power = ", targetpower,"\n", sep="")
    cat("BE margins        =",ltheta1,"...", ltheta2,"\n")
    if (logscale)
      cat("Null (true) ratio = ",ldiff,",  CV = ",CV,"\n", sep="")
    else
      cat("Null (true) diff. = ",ldiff,",  CV = ",CV,"\n", sep="")
  }

  #start value from large sample approx. (hidden func.)
  Ns  <- .sampleN0(alpha, targetpower, theta1, theta2, diffm, se, steps, bk)
  df  <- eval(dfe)
  if (exact) pow <- .power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)
    else  pow <- .approx.power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)
  if (details) {
    cat("\nSample size search\n")
    if (d.no == 0) cat("(n is sample size per group)\n")
    cat(" n     power\n")
    cat( Ns," ", formatC(pow, digits=6, format="f"),"\n")
  }
  # --- loop until power >= target power
  iter <- 0
  # iter>50 is emergency brake
  # this is eventually not necessary, depends on quality of sampleN0
  # in experimentation I have seen max of six steps

  if (pow<targetpower) {
    while (pow<targetpower) {
      Ns   <- Ns+steps
      iter <- iter+1
      df   <- eval(dfe)
      if (exact)
        pow  <- .power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)
      else
        pow  <- .approx.power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)
      if (details) cat( Ns," ", formatC(pow, digits=6, format="f"),"\n")
      if (iter>50) break
    }
  } else {
    while (pow>targetpower) {
      if (Ns<=4) break     # min. sample size
      Ns   <- Ns-steps     # step down if start power is to high
      iter <- iter+1
      df   <- eval(dfe)
      if (exact)
        pow  <- .power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)
      else
        pow  <- .approx.power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)
      if (details) cat( Ns," ", formatC(pow, digits=6),"\n")
      if (iter>50) break 
    }
    if ( pow<targetpower ) {
      Ns   <- Ns+steps     #step up once to have n with pow>=target
      df   <- eval(dfe)
      if (exact)
        pow  <- .power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)
      else
        pow  <- .approx.power.TOST(alpha, theta1, theta2, diffm, se, n=Ns, df, bk)
    }
  }
  if (print && !details) {
    cat("\nSample size\n")
    if (d.no == 0) cat("(n is sample size per group)\n")
    cat(" n     power\n")
    cat( Ns," ", formatC(pow, digits=6, format="f"),"\n")
  }
  if (details && print) {
    if (exact) cat("\nExact power calculation with\nOwen's Q functions.\n")
      else cat("\nApproximate power calculation with\nnon-central t-distribution.\n")
  }
  # always print if approx.
  if (print && !exact)
    cat("\nApproximate power calculation with\nnon-central t-distribution.\n")
  if (print) cat("\n")

  res <- data.frame(design=design, alpha=alpha, CV=CV, theta0=ldiff,
                    theta1=ltheta1, theta2=ltheta2, n=Ns, power=pow,
                    targetpower=targetpower)
  names(res) <-c("Design","alpha","CV","theta0","theta1","theta2",
                 "Sample size", "Achieved power", "Target power")

  if (print)
    return(invisible(res))
  else
    return(res)

}

The function .approx.power.TOST() not shown here is the code using non-central t-approximation for the power.

The whole code is released under the copy left (aka GPL).
Eventually there is someone out there to tying together all this snippets into a package, for christmas :cool:.

Regards,

Detlew
Helmut
★★★
avatar
Homepage
Vienna, Austria,
2009-11-23 16:13
(5625 d 21:59 ago)

@ d_labes
Posting: # 4379
Views: 12,225
 

 Backslashes in R-code

Dear D. Labes,

wow, what a monster! In the meantime I wouldn't call you an [image] novice any more.
One note: The forum's scripts remove the backslash '\\' (both in the preview and at posting) due to security reasons [PHP-function stripslashes()]. You used them in a couple of lines (mainly in the cat()-statements to force a line-feed in the output). Can you please edit your post and insert double backslashes at the right places - only the first one will be removed, and users may copy-and-paste the code afterwards. THX.

Dif-tor heh smusma 🖖🏼 Довге життя Україна! [image]
Helmut Schütz
[image]

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
d_labes
★★★

Berlin, Germany,
2009-11-23 16:19
(5625 d 21:52 ago)

@ Helmut
Posting: # 4380
Views: 11,725
 

 Backslashes eater

Dear Helmut,

THX for the hint. Meanwhile I had noticed this. But for each click on Preview one backslash is eaten :crying:

Regards,

Detlew
Helmut
★★★
avatar
Homepage
Vienna, Austria,
2009-11-23 16:28
(5625 d 21:43 ago)

@ d_labes
Posting: # 4381
Views: 11,819
 

 Backslashes eater

Dear D. Labes!

❝ […] But for each click on Preview one backslash is eaten :crying:


Yes, I know, sorry. It’s unavoidable, unless a code-wrapper like GeShi is used. Although it’s implemented in the current version 2.1.1 of ‘my little forum’, I opted to stay within the 1.7.6-branch due to performance reasons. We are working on a solution in version 1.7.7.

Dif-tor heh smusma 🖖🏼 Довге життя Україна! [image]
Helmut Schütz
[image]

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
UA Flag
Activity
 Admin contact
23,424 posts in 4,927 threads, 1,699 registered users;
29 visitors (0 registered, 29 guests [including 8 identified bots]).
Forum time: 15:12 CEST (Europe/Vienna)

Do not put your faith in what statistics say until you have carefully
considered what they do not say.    William W. Watt

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz
HTML5