## Read the paper, use the code(s)! [Regulatives / Guidelines]

Hi Imph,

» How can we analyze the treatment effect if we cannot use the analysis of variance?

Why didn’t you bother reading the paper given there and trying one of the codes given there?
Again: Why are you interested in the – irrelevant – treatment effect (a p-value)? Statistical significance ≠ clinical relevance. BE is based on the latter.

All formulas are for log-transformed data. Testing for a statistically significant difference
$$H_0:\mu_\text{T}-\mu_\text{R}=0\;\text{vs}\;H_1:\mu_\text{T}-\mu_\text{R}\neq0\tag{1}$$
was abandoned more than thirty years ago in favor of the Two One-Sided Tests (TOST) [1]
$$\eqalign{
H_{01}:\mu_\text{T}-\mu_\text{R}\leq\theta_1&\;\text{vs}\;H_{11}:\mu_\text{T}-\mu_\text{R}>\theta_1\\
&\text{and}\\
H_{02}:\mu_\text{T}-\mu_\text{R}\geq\theta_2&\;\text{vs}\;H_{12}:\mu_\text{T}-\mu_\text{R}<\theta_2}\tag{2}$$
or the operationally identical – and preferred in guidelines – confidence interval inclusion approach
$$H_0:\mu_\text{T}-\mu_\text{R}\notin\left(\theta_1,\,\theta_2\right)\;\text{vs}\;H_1:\theta_1<\mu_\text{T}-\mu_\text{R}<\theta_2\tag{3}$$
The limits $$\small{\left\{\theta_1,\,\theta_2\right\}}$$ are based on the clinically not relevant difference $$\small{\Delta}$$. In other words, nobody is interested in a p-value. See also this article.
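To see that (2) and (3) lead to the same decision, here is a minimal sketch. It plugs in the log-scale summary statistics that the script further down reports for data set #6 (Welch’s test). The limits of 80–125% are my assumption for the illustration (the conventional choice); they are not stated in this post.

```r
# Summary statistics on the log scale (Welch's test, data set #6)
diff  <- -0.0477730 - (-0.0785241) # mean(T) - mean(R)
se    <- 0.06907899                # standard error of the difference
df    <- 47.429                    # Satterthwaite's degrees of freedom
alpha <- 0.05
theta <- log(c(0.80, 1.25))        # ASSUMED limits (80-125%)

# Two One-Sided Tests, eq. (2)
p1   <- pt((diff - theta[1]) / se, df, lower.tail = FALSE) # H01: diff <= theta1
p2   <- pt((diff - theta[2]) / se, df, lower.tail = TRUE)  # H02: diff >= theta2
tost <- max(p1, p2) < alpha

# Confidence interval inclusion, eq. (3)
ci   <- diff + c(-1, +1) * qt(1 - alpha, df) * se
incl <- ci[1] > theta[1] & ci[2] < theta[2]

cat(sprintf("p(H01) = %.4g, p(H02) = %.4g, TOST passes: %s\n", p1, p2, tost),
    sprintf("%g%% CI = %.2f%% to %.2f%%, CI inclusion passes: %s\n",
            100 * (1 - 2 * alpha), 100 * exp(ci[1]), 100 * exp(ci[2]), incl))
```

Both one-sided p-values are below α exactly when the 100(1 – 2α)% confidence interval lies entirely within the limits, which is why no separate p-value has to be reported.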

An R script to evaluate the reference data set #6 [2] (unequal group sizes and unequal variances):

```r
data   <- read.csv("https://bebac.at/downloads/p6.csv", sep = ",", dec = ".")
alpha  <- 0.05
level  <- 1 - 2 * alpha
eq     <- FALSE # use Welch’s test; TRUE for the t-test
T      <- data[data$treatment == "T", ]
R      <- data[data$treatment == "R", ]
# By default a two-sided confidence interval
t      <- t.test(log(T$response), log(R$response),
                 var.equal = eq, conf.level = level)
# The rest of the script is only cosmetics
mean.T <- mean(log(T$response))
var.T  <- var(log(T$response))
mean.R <- mean(log(R$response))
var.R  <- var(log(R$response))
# t$estimate holds the means of x (T) and y (R) on the log scale
PE     <- exp(as.numeric(t$estimate[1] - t$estimate[2]))
CI     <- exp(as.numeric(t$conf.int))
txt    <- if (eq) {
            "\n  (assuming equal variances)"
          } else {
            "\n  (adjusting for unequal group sizes and/or variances)"
          }
cat(paste0("Descriptive statistics\nTest group\n  Observations  : ", nrow(T),
           sprintf("\n  Mean          : %+.7f (log-scale)", mean.T),
           sprintf("\n  Variance      :  %.7f (log-scale)", var.T),
           sprintf("\n  Geometric mean:  %.5f", exp(mean.T)),
           sprintf("\n  Geometric CV  : %.2f%%", 100 * sqrt(exp(var.T) - 1)),
           "\nReference group\n  Observations  : ", nrow(R),
           sprintf("\n  Mean          : %+.7f (log-scale)", mean.R),
           sprintf("\n  Variance      :  %.7f (log-scale)", var.R),
           sprintf("\n  Geometric mean:  %.5f", exp(mean.R)),
           sprintf("\n  Geometric CV  : %.2f%%\n\n ", 100 * sqrt(exp(var.R) - 1))),
    trimws(t$method), "based on log-transformed data", txt,
    "\n  alternative hypothesis: true difference in means is not equal to 0",
    "\n  standard error of the difference =", signif(t$stderr, 8),
    sprintf("%s %5g, df = %g, p = %5g", "\n  t =",
            t$statistic, t$parameter, t$p.value),
    sprintf("\n%s %6.2f%%", "\nPoint estimate         :", 100 * PE),
    sprintf("%s%g%% %s %6.2f%% %s %6.2f%%%s", "\n", 100 * level,
            "confidence interval:", 100 * CI[1], "–", 100 * CI[2], "\n"))
```

Gives:

```
Descriptive statistics
Test group
  Observations  : 24
  Mean          : -0.0477730 (log-scale)
  Variance      :  0.0611544 (log-scale)
  Geometric mean:  0.95335
  Geometric CV  : 25.11%
Reference group
  Observations  : 26
  Mean          : -0.0785241 (log-scale)
  Variance      :  0.0578189 (log-scale)
  Geometric mean:  0.92448
  Geometric CV  : 24.40%

  Welch Two Sample t-test based on log-transformed data
  (adjusting for unequal group sizes and/or variances)
  alternative hypothesis: true difference in means is not equal to 0
  standard error of the difference = 0.06907899
  t = 0.445158, df = 47.429, p = 0.658231

Point estimate         : 103.12%
90% confidence interval:  91.84% – 115.79%
```

By setting eq <- TRUE in the script you will get at the end (only for comparison):

```
  Two Sample t-test based on log-transformed data
  (assuming equal variances)
  alternative hypothesis: true difference in means is not equal to 0
  standard error of the difference = 0.06899995
  t = 0.445668, df = 48, p = 0.657841

Point estimate         : 103.12%
90% confidence interval:  91.85% – 115.78%
```

Results are similar because there is no large difference in group sizes and the variances are similar as well. However, the conventional t-test can be liberal (anticonservative, inflated patient’s risk): when the smaller group has the larger variance, as is the case here, its confidence interval is narrower than the one of Welch’s test.
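Where the two standard errors and the degrees of freedom come from can be checked with the descriptive statistics alone. A minimal sketch, using only the group sizes and log-scale variances printed above; the variable names are mine.

```r
# Descriptive statistics on the log scale (data set #6, from the output above)
n <- c(T = 24, R = 26)
v <- c(T = 0.0611544, R = 0.0578189)

# Welch's test: group-specific variances, Satterthwaite's degrees of freedom
se.w <- sqrt(sum(v / n))
df.w <- sum(v / n)^2 / sum((v / n)^2 / (n - 1))

# t-test: pooled variance, n1 + n2 - 2 degrees of freedom
df.t <- sum(n) - 2
v.p  <- sum((n - 1) * v) / df.t
se.t <- sqrt(v.p * sum(1 / n))

print(data.frame(test = c("Welch", "t-test"),
                 SE   = signif(c(se.w, se.t), 7),
                 df   = round(c(df.w, df.t), 3)))
```

Pooling weights each variance by n – 1. Here the smaller group (T) has the larger variance, so its contribution is down-weighted and the pooled standard error ends up slightly smaller than Welch’s.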

If you have extremely different sample sizes and variances, the outcome may differ substantially.
The reference data set #7 [3] is actually a complete failure with a 90% CI of 97.38–138.5% (Welch’s test: 201.164 degrees of freedom, standard error of the difference 0.106596) but almost passes BE with 106.86–126.2% (t-test: 1198 df, SE 0.0505922).
Why is the CI obtained by the t-test narrower than by Welch’s test though the point estimates are equal? We have more degrees of freedom (≈ 6×) and the standard error of the difference is smaller (≈ ½).
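How much each of the two contributes can be seen by recomputing both intervals from the numbers quoted above. A minimal sketch: the point estimate is taken as the ratio of the geometric means in footnote 3; everything else comes from the text.

```r
# Data set #7: values quoted in the text and in footnote 3
PE   <- 1.20583 / 1.03825              # ratio of geometric means (T / R)
se   <- c(Welch = 0.106596, t = 0.0505922)
df   <- c(Welch = 201.164,  t = 1198)
tq   <- qt(0.95, df)                   # t-quantile for a 90% CI
half <- tq * se                        # half-width on the log scale
CI   <- 100 * exp(cbind(lower = log(PE) - half, upper = log(PE) + half))
print(round(cbind(t.quantile = tq, SE = se, CI), 4))
# Ratios t-test / Welch
cat(sprintf("quantile ratio %.4f, SE ratio %.4f\n",
            tq[2] / tq[1], se[2] / se[1]))
```

With about 200 degrees of freedom or more the t-quantile hardly changes, so in this data set the narrower interval is driven almost entirely by the smaller standard error.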

1. Schuirmann DJ. A comparison of the Two One-Sided Tests Procedure and the Power Approach for Assessing the Equivalence of Average Bioavailability. J Pharmacokin Biopharm. 1987; 15(6): 657–80. doi:10.1007/BF01068419.
2. Bolton S, Bon C. Statistical consideration: alternate designs and approaches for bioequivalence assessments. In: Kanfer I, Shargel L, editors. Generic Drug Product Development. Bioequivalence Issues. New York: Informa Healthcare; 2008. p. 123–141.
3. $$\small{T:n=\text{1,000},\,\widehat{x}_{\text{geom}}=1.20583,\,\widehat{CV}_{\text{geom}}=\phantom{2}25.15\%}$$
$$\small{R:n=\phantom{1,}200,\,\widehat{x}_{\text{geom}}=1.03825,\,\widehat{CV}_{\text{geom}}=293.02\%}$$

Dif-tor heh smusma 🖖
Helmut Schütz