Bioequivalence and Bioavailability Forum

Helmut
Vienna, Austria
2007-04-17 17:16

Posting: # 674

 Flaws in evaluation of parallel designs [R for BE/BA]

Hi everybody!

In this post I wrote:
» You never know when rounding will hit you - and don't dare asking the software vendor for the algorithm...

OK, it’s getting even worse!

According to the FDA’s guidance Statistical Approaches to Establishing Bioequivalence (2001), http://www.fda.gov/cder/guidance/3616fnl.pdf, Section VI. Statistical Analysis, B. Data Analysis, 1. Average Bioequivalence, d. Parallel Designs:

For parallel designs, the confidence interval for the difference of means in the log scale can be computed using the total between-subject variance. As in the analysis for replicated designs (section VI. B.1.b), equal variances should not be assumed.

(my emphasis)
In other words, naïve pooling of variances – as given in this post – is not appropriate (at least with the FDA).
The degrees of freedom should be corrected by, e.g., the Welch–Satterthwaite approximation, and the confidence interval calculated accordingly.
Therefore df = 21.431 (M$-Excel rounds down to the nearest integer, 21) instead of df = 22.
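The Welch–Satterthwaite df depends only on the two sample variances and group sizes. A minimal sketch of the standard formula, in Python rather than R purely for illustration:

```python
def welch_satterthwaite_df(s2_1, n1, s2_2, n2):
    """Welch-Satterthwaite approximate degrees of freedom for two
    independent groups with sample variances s2_1, s2_2 and sizes n1, n2."""
    se2_1 = s2_1 / n1  # squared standard error, group 1
    se2_2 = s2_2 / n2  # squared standard error, group 2
    num = (se2_1 + se2_2) ** 2
    den = se2_1 ** 2 / (n1 - 1) + se2_2 ** 2 / (n2 - 1)
    return num / den

# sanity check: with equal variances and equal group sizes the
# approximation recovers the pooled df, n1 + n2 - 2
print(welch_satterthwaite_df(1.0, 12, 1.0, 12))  # → 22.0
```

With the actual (unequal) variances of the example data set this evaluates to the fractional df quoted above, which commercial software may round or ignore.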

Below is a comparison of results (90% CI) for period 1 of the example data set, using different software packages (general-purpose statistics, and ‘specialized’ packages for PK/BE).
┌────────────────────────┬─────────────────┬────────────────┐
│   Program / Method     │    equal var.   │  unequal var.  │
├────────────────────────┼─────────────────┼────────────────┤
│ ‘manual’ (Excel 2000)  │ 63.51 - 110.19% │ 63.4 - 110.25% │
│ R 2.4.1 (2006)         │ 63.51 - 110.19% │ 63.4 - 110.22% │
│ NCSS 2001 (2001)       │ 63.51 - 110.19% │ 63.4 - 110.22% │
│ STATISTICA 5.1H (1997) │ 63.51 - 110.19% │ 63.4 - 110.22% │
│ WinNonlin 5.2 (2007)   │ 63.51 - 110.20% │ not available! │
│ Kinetica 4.4.1 (2007)  │ 63.51 - 110.19% │ not available! │
│ EquivTest/PK (2006)    │ 63.51 - 110.19% │ not available! │
└────────────────────────┴─────────────────┴────────────────┘

For unequal variances in the manual calculation I used the Satterthwaite approximation; R uses the Welch approximation, NCSS Aspin–Welch, and STATISTICA Milliken–Johnson.
WinNonlin, Kinetica, and EquivTest calculate the CI based on the assumption of equal variances (naïve pooling) only, which:
  • may not hold, and
  • is against FDA’s guidance.
The differences between the ‘classical’ t-test and the Welch–Satterthwaite-modified one are minimal here, but only because:
  • almost equal variances are observed in the data set, and
  • the data set is balanced.
The former may hold true in many studies (although we should not test for it, according to the FDA), but the latter is rarely the case in the ‘real world’. Naïve pooling is relatively robust against unequal variances, but rather sensitive to imbalanced data.
If the assumption(s) are violated, the ‘classical’ t-test becomes liberal (i.e., the CI is too tight; the patient’s risk is higher than the nominal 5%). Whereas for equal group sizes this inflation of the α-risk may be small, the more imbalanced a study gets, the more liberal the test becomes.

As an example I multiplied the AUC values of subjects 4–6 (test) by three and removed subjects 22–24 (test). Now we have an imbalanced data set (Ntest = 9, Nreference = 12) with unequal variances (test: 0.564, reference: 0.129; F-ratio test p = 0.0272, modified Levene test p = 0.107).

  equal variances: 81.21% - 190.41%
unequal variances: 76.36% - 202.51%
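Plugging the group variances quoted above (test: 0.564, n = 9; reference: 0.129, n = 12) into the Welch–Satterthwaite formula shows why the unequal-variance CI is so much wider: the approximate df drops well below the pooled df of 9 + 12 − 2 = 19. A quick check, in Python for illustration:

```python
s2_t, n_t = 0.564, 9    # test group variance and size (from the example above)
s2_r, n_r = 0.129, 12   # reference group variance and size

se2_t, se2_r = s2_t / n_t, s2_r / n_r  # squared standard errors
df = (se2_t + se2_r) ** 2 / (se2_t ** 2 / (n_t - 1) + se2_r ** 2 / (n_r - 1))
print(round(df, 2))  # → 10.75, versus 19 for naive pooling
```

Fewer degrees of freedom mean a larger t-quantile, hence the wider (more conservative) confidence interval.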


Nothing about the methods used is documented in the ‘specialized’ programs’ manuals or online help.
Making things even worse, WinNonlin has an option field ‘Degrees of Freedom’, where ‘Satterthwaite’ is checked by default. However, for parallel groups exactly the same results are obtained as with the other option, ‘Residual’!

Wang H, Chow S-C. A practical approach for comparing means of two groups without equal variance assumption. Stat Med. 2002;21:3137–51.


Edit: Link corrected for FDA’s new site. [Helmut]

All the best,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. ☼
Helmut
Vienna, Austria
2007-04-18 17:05

@ Helmut
Posting: # 675

 parallel designs (Welch-Satterthwaite in R)

 
Hi everybody!

I prepared some code in R (v2.4.1) to be used with an example data set (actually data from a 2×2 cross-over, but only data of period 1 are used).
Save the data file to the bin folder below the main R application (you may use any other location, but then you have to tell R where to look for it; in R’s GUI: File > Change directory...).
  display <- function() {
    cat("Test/Reference with 95% Confidence Limits (90% CI)", fill = TRUE)
    round(tbldiff, 4)
  }
# read in data
  resp <- read.csv("24par.txt", header = TRUE)
# calculate natural logs
  resp$logAUC <- log(resp$AUC)
# make 'Reference' the first factor level (note: relevel() must be assigned!)
  resp$treatment <- relevel(resp$treatment, ref = "Reference")
# simple plot, period 1 data only
  def.par <- par(no.readonly = TRUE) # save defaults
  layout(matrix(c(1, 2), byrow = FALSE, ncol = 2))
  plot(AUC ~ treatment, data = resp, subset = (period == 1), log = "y")
  plot(logAUC ~ treatment, data = resp, subset = (period == 1))
  par(def.par)                       # reset to defaults
# two-sample t-test, equal variances; NOT recommended (anticonservative)!
  result <- t.test(logAUC ~ treatment,
      data = resp,
      subset = (period == 1),
      var.equal = TRUE,
      conf.level = 0.90)
# original output in the log-domain
  result
# extract results from the list and present them in the untransformed domain
# (the difference is Reference - Test, hence the sign change in exp(-...))
  tbldiff <- matrix(
      c(as.numeric(exp(diff(result$estimate))),
        sort(as.numeric(exp(-result$conf.int)))),
      byrow = TRUE, nrow = 1)
  dimnames(tbldiff) <- list("Ratio", c("Point Estimate", "Lower CL", "Upper CL"))
  display()

# two-sample t-test (Welch-Satterthwaite), unequal variances
  result <- t.test(logAUC ~ treatment,
      data = resp,
      subset = (period == 1),
      var.equal = FALSE,  # note: this is the default in R!
      conf.level = 0.90)
  result
  tbldiff <- matrix(
      c(as.numeric(exp(diff(result$estimate))),
        sort(as.numeric(exp(-result$conf.int)))),
      byrow = TRUE, nrow = 1)
  dimnames(tbldiff) <- list("Ratio", c("Point Estimate", "Lower CL", "Upper CL"))
  display()


Results should be:
        Two Sample t-test

data:  logAUC by treatment
t = 1.1123, df = 22, p-value = 0.278
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
 -0.09702467  0.45390930
sample estimates:
mean in group Reference      mean in group Test
               3.562273                3.383831

Test/Reference with 95% Confidence Limits (90% CI)
      Point Estimate Lower CL Upper CL
Ratio         0.8366   0.6351   1.1019


        Welch Two Sample t-test

data:  logAUC by treatment
t = 1.1123, df = 21.431, p-value = 0.2783
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
 -0.0973465  0.4542311
sample estimates:
mean in group Reference      mean in group Test
               3.562273                3.383831

Test/Reference with 95% Confidence Limits (90% CI)
      Point Estimate Lower CL Upper CL
Ratio         0.8366   0.6349   1.1022
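The back-transformation in the code is plain exponentiation of the log-scale results, with a sign flip because R reports the difference as Reference − Test here. Checking the Welch output above by hand, in Python for illustration:

```python
import math

# log-scale results copied from the Welch Two Sample t-test output above
mean_ref, mean_test = 3.562273, 3.383831   # group means of logAUC
ci_lo, ci_hi = -0.0973465, 0.4542311       # 90% CI for Reference - Test

point = math.exp(mean_test - mean_ref)     # T/R point estimate
# negate the CI for Reference - Test to get the CI for Test - Reference
lower, upper = sorted((math.exp(-ci_lo), math.exp(-ci_hi)))
print(round(point, 4), round(lower, 4), round(upper, 4))
# → 0.8366 0.6349 1.1022, matching the table above
```

The `sort()` in the R code serves the same purpose as `sorted()` here: negating the interval swaps its endpoints.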


Have fun, and don’t trust in commercial software!

All the best,
Helmut Schütz
The BIOEQUIVALENCE / BIOAVAILABILITY FORUM is hosted by
BEBAC Ing. Helmut Schütz