Geometric mean? [Regulatives / Guidelines]
❝ While intuitively, you might think that the geometric mean of a set containing 0 should be 0 (since any product involving 0 is 0), the formal definition and intended purpose of the geometric mean don't align well with this interpretation.
Right. The complete definition [1–3] is $$\overline{x}_\text{geom}=\sqrt[n]{\prod_{i=1}^{n}x_i}=\sqrt[n]{x_1\cdot x_2\cdots x_n}\quad\color{Red}{\forall x_i\in\mathbb{R}^{+}}\tag{1}$$ In other words, the geometric mean must only be calculated for positive real numbers (\(\small{x_i>0}\)). A correct implementation in software should throw an error if the data contain zero(s).
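A minimal sketch of what "throw an error" could look like in R. The name `gm_strict` is hypothetical (it is not part of any package). It works on the log scale, which for positive values equals the n-th root of the product in Eq. (1) up to rounding and avoids overflow of the product for long vectors.

```r
# Hypothetical helper (not from any package): geometric mean per Eq. (1),
# refusing any data that are not strictly positive.
gm_strict <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  if (length(x) == 0L) stop("no data")
  if (anyNA(x)) stop("missing values in 'x'")
  if (any(x <= 0)) {
    stop(sum(x <= 0), " non-positive value(s) in 'x': geometric mean undefined")
  }
  exp(mean(log(x)))  # equals prod(x)^(1 / length(x)) for x > 0, up to rounding
}

gm_strict(c(1, 2, 4))             # cube root of 8
try(gm_strict(c(1, 1, 1, 1, 0)))  # stops with an error instead of returning 0
```

Stopping with an error (rather than silently returning 0 or dropping values) forces the analyst to decide what to do with the offending data.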
❝ Imagine 2 datasets:
❝ c(1,1,1,1,0)
❝ c(0,0,0,0,1)
❝ Then both will have the same geometric mean value. It loses information about the other values and doesn't accurately reflect the "typical value" or "average rate" that the geometric mean aims to represent.
GEOMEAN() in Excel ‘knows’ that: it returns #NUM! if any value is ≤ 0. There is no function to calculate the geometric mean in base R, but there is one in each of two packages: geometric.mean() of psych is happy with zeros (WTF?), whereas geoMean() of EnvStats is not. Try my homebrew:
library(psych)    # contains geometric.mean()
library(EnvStats) # contains geoMean()
# strict: NA if any value is non-positive
gm <- function(x, print = TRUE) {
  x   <- x[!is.na(x)]
  pos <- sign(x) == 1
  msg <- NULL
  if (sum(!pos) >= 1) {
    if (sum(!pos) == 1) {
      msg <- "[1 non-positive value]"
    } else {
      msg <- paste0("[", sum(!pos), " non-positive values]")
    }
    res <- NA
  } else {
    res <- prod(x)^(1 / length(x))
  }
  if (print) {
    cat("gm() :", res, msg, "\n")
  } else {
    return(res)
  }
}
# excludes negative values only; zeros are kept
gm1 <- function(x, print = TRUE) {
  x   <- x[!is.na(x)]
  neg <- sign(x) == -1
  msg <- NULL
  if (sum(neg) >= 1) {
    msg <- paste0("[", sum(neg), " negative(s) excluded]")
    x   <- x[!neg]
  }
  res <- prod(x)^(1 / length(x))
  if (print) {
    cat("gm1() :", res, msg, "\n")
  } else {
    return(res)
  }
}
# excludes all non-positive values (zeros and negatives)
gm2 <- function(x, print = TRUE) {
  x   <- x[!is.na(x)]
  pos <- sign(x) == 1
  msg <- NULL
  if (sum(!pos) >= 1) {
    if (sum(!pos) == 1) {
      msg <- "[1 non-positive excluded]"
    } else {
      msg <- paste0("[", sum(!pos), " non-positives excluded]")
    }
    x <- x[pos]
  }
  res <- prod(x)^(1 / length(x))
  if (print) {
    cat("gm2() :", res, msg, "\n")
  } else {
    return(res)
  }
}
# wrapper for psych::geometric.mean()
gm3 <- function(x, print = TRUE) {
  res <- geometric.mean(x)
  if (print) {
    cat("geometric.mean():", res, "\n")
  } else {
    return(res)
  }
}
# wrapper for EnvStats::geoMean()
gm4 <- function(x, print = TRUE) {
  res <- geoMean(x)
  if (print) {
    cat("geoMean() :", res, "\n")
  } else {
    return(res)
  }
}
# arithmetic mean for comparison
am <- function(x, print = TRUE) {
  res <- mean(x, na.rm = TRUE) # the argument is na.rm; 'rm' is silently ignored
  if (print) {
    cat("mean() :", res, "\n")
  } else {
    return(res)
  }
}
set1 <- c(1, 1, 1, 1, 0)
set2 <- c(0, 0, 0, 0, 1)
gm(set1); gm1(set1); gm3(set1); gm2(set1); gm4(set1); am(set1)
gm() : NA [1 non-positive value]
gm1() : 0
geometric.mean(): 0
gm2() : 1 [1 non-positive excluded]
geoMean() : NA
Warning message:
In geoMean(x) : Non-positive values in 'x'
mean() : 0.8
gm(set2); gm1(set2); gm3(set2); gm2(set2); gm4(set2); am(set2)
gm() : NA [4 non-positive values]
gm1() : 0
geometric.mean(): 0
gm2() : 1 [4 non-positives excluded]
geoMean() : NA
Warning message:
In geoMean(x) : Non-positive values in 'x'
mean() : 0.2
gm() and the function geoMean() of EnvStats get it right.
❝ Regarding geometric mean: kind of tricky taking into account the information above. The median is better, but why don't you want to follow the method mentioned in BE GL at the first place (arithmetic mean)? Yes, it is bad, but at least not worse than others
I also agree with what you wrote about the loss of information. That is why I prefer the median, which keeps it: \(\widetilde{x}\small{\left\{1,1,1,1,0\right\}=1}\) and \(\widetilde{x}\small{\left\{0,0,0,0,1\right\}=0}\).
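The two medians can be checked directly in base R. This only reproduces the values given above for the two example data sets.

```r
set1 <- c(1, 1, 1, 1, 0)
set2 <- c(0, 0, 0, 0, 1)
median(set1)  # 1
median(set2)  # 0
```

Unlike a product-based mean, the median distinguishes the two sets and needs no special handling of zeros.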
Note that the ICH M13A states:
Multiple baseline endogenous concentrations should be measured from each subject in the time period before administration of the study drug. The time-averaged baseline [… is] subtracted from post-dose concentrations for those subjects in an appropriate manner consistent with the PK properties of the drug. For the time-averaged method, either the mean or median value may be used.
(my emphasis)
PS: I have always used the median in my studies and was never, ever, questioned about it by any agency.
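A minimal sketch of the time-averaged baseline correction described in the ICH M13A quote, with either the median or the mean of the pre-dose samples. Everything here is an assumption for illustration: the function name `baseline_correct`, the data (made up for one subject), and the rule of setting negative corrected concentrations to zero, which is not stated in the quote.

```r
# Hypothetical data of one subject (arbitrary units), not from any study
pre  <- c(2.1, 0, 1.8)             # pre-dose samples; one reported as 0
post <- c(1.5, 6.4, 9.8, 5.2, 2.0) # post-dose concentrations

# Hypothetical helper: subtract the time-averaged baseline from post-dose values
baseline_correct <- function(post, pre, method = c("median", "mean")) {
  method <- match.arg(method)
  pre    <- pre[!is.na(pre)]
  b      <- if (method == "median") median(pre) else mean(pre)
  # Assumption (not in the quote): negative corrected values are set to zero
  pmax(post - b, 0)
}

baseline_correct(post, pre, "median") # baseline = 1.8
baseline_correct(post, pre, "mean")   # baseline = 1.3
```

With a zero among the pre-dose samples, a geometric mean of the baseline would be undefined by Eq. (1), whereas both the median and the arithmetic mean remain usable.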
1. Netz H. Formeln der Mathematik. München: Hanser; 6th ed. 1986. p. 18.
2. Nolan D, Speed T. Stat Labs. Mathematical Statistics Through Applications. New York: Springer; 2001. p. 68.
3. Sachs S, Hedderich J. Angewandte Statistik. Methodensammlung mit R. Berlin: Springer; 12th ed. 2006. p. 76.
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz
Posted by Helmut, 2024-11-11 14:53, in the thread "Baseline correction vs non-baseline correction" (started by jag009, 2015-09-15 20:29).