Read the paper, use the code(s)! [Regulatives / Guidelines]

posted by Helmut – Vienna, Austria, 2022-06-26 14:12 – Posting: # 23090

Hi Imph,

❝ How can we analyze the treatment effect if we cannot use the analysis of variance?


Why didn’t you bother reading the paper given there and trying one of the codes given there?
Again: Why are you interested in the – irrelevant – treatment effect (a p-value)? Statistical significance ≠ clinical relevance. BE is solely based on the latter.

All formulas are for log-transformed data. Testing for a statistically significant difference $$H_0:\mu_\text{T}-\mu_\text{R}=0\;vs\;H_1:\mu_\text{T}-\mu_\text{R}\neq0\tag{1}$$ was replaced more than thirty years ago by the Two One-Sided Tests (TOST) procedure [1] $$\eqalign{
H_{01}:\mu_\text{T}-\mu_\text{R}\leq\theta_1&vs\;H_{11}:\mu_\text{T}-\mu_\text{R}>\theta_1\\
&\text{and}\\
H_{02}:\mu_\text{T}-\mu_\text{R}\geq\theta_2&vs\;H_{12}:\mu_\text{T}-\mu_\text{R}<\theta_2}\tag{2}$$ or the operationally identical – and preferred in guidelines – confidence interval inclusion approach $$H_0:\mu_\text{T}-\mu_\text{R}\notin\left(\theta_1,\,\theta_2\right)\;vs\;H_1:\theta_1<\mu_\text{T}-\mu_\text{R}<\theta_2\tag{3}$$ The limits \(\small{\left\{\theta_1,\,\theta_2\right\}}\) are based on the clinically irrelevant difference \(\small{\Delta}\). In other words, nobody is interested in the p-value of \(\small{(1)}\). See also this article.
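That (2) and (3) lead to the same decision can be checked numerically. The following is only a minimal sketch: the data are simulated (not from any reference data set), the function name `tost.vs.ci` is made up for illustration, and the limits \(\small{\theta_1=\log(0.80),\,\theta_2=\log(1.25)}\) are assumed to be the conventional BE limits.

```r
# Hypothetical helper: compare the TOST decision (2) with CI inclusion (3)
# x, y are log-transformed responses of Test and Reference
tost.vs.ci <- function(x, y, theta1 = log(0.80), theta2 = log(1.25),
                       alpha = 0.05) {
  # Two one-sided Welch tests, each at level alpha
  p1 <- t.test(x, y, mu = theta1, alternative = "greater")$p.value
  p2 <- t.test(x, y, mu = theta2, alternative = "less")$p.value
  # Two-sided 100(1 - 2 alpha)% confidence interval of the difference
  CI <- as.numeric(t.test(x, y, conf.level = 1 - 2 * alpha)$conf.int)
  list(p1 = p1, p2 = p2, CI = CI,
       pass.TOST = max(p1, p2) < alpha,
       pass.CI   = CI[1] > theta1 && CI[2] < theta2)
}

set.seed(123456)                                  # simulated, assumed data
log.T <- rnorm(24, mean = log(0.95), sd = 0.25)
log.R <- rnorm(26, mean = log(1.00), sd = 0.25)
res   <- tost.vs.ci(log.T, log.R)
cat(sprintf("p(H01) = %.4g, p(H02) = %.4g\n", res$p1, res$p2),
    sprintf("90%% CI: %.2f%% – %.2f%%\n",
            100 * exp(res$CI[1]), 100 * exp(res$CI[2])),
    "TOST passes:", res$pass.TOST, "| CI within limits:", res$pass.CI, "\n")
```

Both decisions agree by construction: rejecting \(\small{H_{01}}\) at level \(\small{\alpha}\) is the same statement as the lower confidence limit lying above \(\small{\theta_1}\), and likewise for \(\small{H_{02}}\) and the upper limit.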

An R script to evaluate the reference data set #6 [2] (unequal group sizes and unequal variances):

data   <- read.csv("https://bebac.at/downloads/p6.csv", sep = ",", dec = ".")
alpha  <- 0.05
level  <- 1 - 2 * alpha
eq     <- FALSE # use Welch’s test; TRUE for the t-test
T      <- data[data$treatment == "T", ]
R      <- data[data$treatment == "R", ]
# By default a two-sided confidence interval
t      <- t.test(log(T$response), log(R$response), var.equal = eq, conf.level = level)
# The rest of the script is only cosmetics
mean.T <- mean(log(T$response))
var.T  <- var(log(T$response))
mean.R <- mean(log(R$response))
var.R  <- var(log(R$response))
PE     <- exp(as.numeric(t$estimate[1] - t$estimate[2]))
CI     <- exp(as.numeric(t$conf.int))
# Plain if/else: ifelse() at top level would also print its result
txt    <- if (eq) "\n  (assuming equal variances)" else
            "\n  (adjusting for unequal group sizes and/or variances)"
cat(paste0("Descriptive statistics\nTest group\n  Observations  : ", nrow(T),
    sprintf("\n  Mean          : %+.7f (log-scale)", mean.T),
    sprintf("\n  Variance      :  %.7f (log-scale)", var.T),
    sprintf("\n  Geometric mean:  %.5f", exp(mean.T)),
    sprintf("\n  Geometric CV  : %.2f%%", 100 * sqrt(exp(var.T) - 1)),
    "\nReference group\n  Observations  : ", nrow(R),
    sprintf("\n  Mean          : %+.7f (log-scale)", mean.R),
    sprintf("\n  Variance      :  %.7f (log-scale)", var.R),
    sprintf("\n  Geometric mean:  %.5f", exp(mean.R)),
    sprintf("\n  Geometric CV  : %.2f%%\n\n ", 100 * sqrt(exp(var.R) - 1))),
    trimws(t$method), "based on log-transformed data", txt,
    "\n  alternative hypothesis: true difference in means is not equal to 0",
    "\n  standard error of the difference =", signif(t$stderr, 8),
    sprintf("%s %5g, df = %g, p = %5g", "\n  t =",
    t$statistic, t$parameter, t$p.value),
    sprintf("\n%s %6.2f%%", "\nPoint estimate         :", 100 * PE),
    sprintf("%s%g%% %s %6.2f%% %s %6.2f%%%s", "\n", 100 * level,
    "confidence interval:", 100 * CI[1], "–", 100 * CI[2], "\n"))

Gives:

Descriptive statistics
Test group
  Observations  : 24
  Mean          : -0.0477730 (log-scale)
  Variance      :  0.0611544 (log-scale)
  Geometric mean:  0.95335
  Geometric CV  : 25.11%
Reference group
  Observations  : 26
  Mean          : -0.0785241 (log-scale)
  Variance      :  0.0578189 (log-scale)
  Geometric mean:  0.92448
  Geometric CV  : 24.40%

  Welch Two Sample t-test based on log-transformed data
  (adjusting for unequal group sizes and/or variances)
  alternative hypothesis: true difference in means is not equal to 0
  standard error of the difference = 0.06907899
  t = 0.445158, df = 47.429, p = 0.658231

Point estimate         : 103.12%
90% confidence interval:  91.84% – 115.79%


Setting eq <- TRUE in the script gives, at the end of the output (only for comparison):

  Two Sample t-test based on log-transformed data
  (assuming equal variances)
  alternative hypothesis: true difference in means is not equal to 0
  standard error of the difference = 0.06899995
  t = 0.445668, df = 48, p = 0.657841

Point estimate         : 103.12%
90% confidence interval:  91.85% – 115.78%


Results are similar because the group sizes and the variances differ only slightly. However, the conventional t-test can be liberal (anticonservative, inflated patient’s risk): whenever the smaller group has the larger variance, as here, its confidence interval is narrower than the one of Welch’s test.
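For reference, the two standard errors and degrees of freedom reported above follow from the usual textbook formulas (standard notation, not spelled out in this post), where \(\small{s^2}\) are the variances on the log-scale:

```latex
\begin{aligned}
SE_\text{Welch}&=\sqrt{\frac{s_\text{T}^2}{n_\text{T}}+\frac{s_\text{R}^2}{n_\text{R}}},&
\nu_\text{Welch}&=\frac{\left(\dfrac{s_\text{T}^2}{n_\text{T}}+\dfrac{s_\text{R}^2}{n_\text{R}}\right)^2}
{\dfrac{\left(s_\text{T}^2/n_\text{T}\right)^2}{n_\text{T}-1}+\dfrac{\left(s_\text{R}^2/n_\text{R}\right)^2}{n_\text{R}-1}}\\
SE_\text{pooled}&=\sqrt{s_\text{p}^2\left(\frac{1}{n_\text{T}}+\frac{1}{n_\text{R}}\right)},&
\nu_\text{pooled}&=n_\text{T}+n_\text{R}-2,\quad
s_\text{p}^2=\frac{(n_\text{T}-1)\,s_\text{T}^2+(n_\text{R}-1)\,s_\text{R}^2}{n_\text{T}+n_\text{R}-2}
\end{aligned}
```

With \(\small{n_\text{T}=24,\,s_\text{T}^2=0.0611544}\) and \(\small{n_\text{R}=26,\,s_\text{R}^2=0.0578189}\) from the output above, these give the reported 0.06907899 with 47.429 df (Welch) and 0.06899995 with 48 df (t-test).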

If you have extremely different sample sizes and variances, the outcome may differ substantially.
The reference data set #7 [3] is actually a complete failure with a 90% CI of 97.38–138.5% (Welch’s test: 201.164 degrees of freedom, standard error of the difference 0.106596) but almost passes BE with 106.86–126.2% (t-test: 1198 df, SE 0.0505922).
Why is the CI obtained by the t-test narrower than by Welch’s test though the point estimates are equal? We have more degrees of freedom (≈ 6×) and the standard error of the difference is smaller (≈ ½).
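The figures quoted for data set #7 can be approximated from the summary statistics in [3] alone. This is a sketch under assumptions: the function name `ci.from.summary` is made up for illustration, and the log-scale variance is recovered from the geometric CV as \(\small{s^2=\log(CV^2+1)}\), the inverse of the formula used in the script above. Since the summary statistics are rounded, the last digits may differ slightly from the values given in the post.

```r
# Hypothetical helper: Welch and pooled 100(1 - 2 alpha)% CIs from
# group sizes, geometric means, and geometric CVs (given as ratios)
ci.from.summary <- function(n.T, gm.T, cv.T, n.R, gm.R, cv.R, alpha = 0.05) {
  v.T  <- log(cv.T^2 + 1)              # log-scale variance of Test
  v.R  <- log(cv.R^2 + 1)              # log-scale variance of Reference
  d    <- log(gm.T) - log(gm.R)        # difference of log-scale means
  # Welch: separate variances, Satterthwaite's degrees of freedom
  se.w <- sqrt(v.T / n.T + v.R / n.R)
  df.w <- (v.T / n.T + v.R / n.R)^2 /
          ((v.T / n.T)^2 / (n.T - 1) + (v.R / n.R)^2 / (n.R - 1))
  # Conventional t-test: pooled variance
  df.p <- n.T + n.R - 2
  v.p  <- ((n.T - 1) * v.T + (n.R - 1) * v.R) / df.p
  se.p <- sqrt(v.p * (1 / n.T + 1 / n.R))
  # Back-transformed confidence limits in percent
  ci   <- function(se, df) 100 * exp(d + c(-1, +1) * qt(1 - alpha, df) * se)
  ci.w <- ci(se.w, df.w)
  ci.p <- ci(se.p, df.p)
  data.frame(method = c("Welch", "t-test"),
             df     = c(df.w, df.p),
             SE     = c(se.w, se.p),
             PE     = 100 * exp(d),
             lower  = c(ci.w[1], ci.p[1]),
             upper  = c(ci.w[2], ci.p[2]))
}

# Summary statistics of data set #7 as given in [3]
res <- ci.from.summary(n.T = 1000, gm.T = 1.20583, cv.T = 0.2515,
                       n.R =  200, gm.R = 1.03825, cv.R = 2.9302)
print(res, digits = 6, row.names = FALSE)
```

The pooled variance is dominated by the 1,000 subjects of the low-variability group, which is why the t-test's standard error shrinks to roughly half of Welch's.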


  1. Schuirmann DJ. A comparison of the Two One-Sided Tests Procedure and the Power Approach for Assessing the Equivalence of Average Bioavailability. J Pharmacokin Biopharm. 1987; 15(6): 657–80. doi:10.1007/BF01068419.
  2. Bolton S, Bon C. Statistical consideration: alternate designs and approaches for bioequivalence assessments. In: Kanfer I, Shargel L, editors. Generic Drug Product Development. Bioequivalence Issues. New York: Informa Healthcare; 2008. p. 123–141.
  3. \(\small{T:n=\text{1,000},\,\widehat{x}_{\text{geom}}=1.20583,\,\widehat{CV}_{\text{geom}}=\phantom{2}25.15\%}\)
    \(\small{R:n=\phantom{1,}200,\,\widehat{x}_{\text{geom}}=1.03825,\,\widehat{CV}_{\text{geom}}=293.02\%}\)

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz
The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz