Bioequivalence and Bioavailability Forum • Geometric mean?

Geometric mean? [Regulatives / Guidelines]

posted by Helmut – Vienna, Austria, 2024-11-11 15:53 – Posting: # 24272

Hi Mittyri & BEQool,

❝ While intuitively, you might think that the geometric mean of a set containing 0 should be 0 (since any product involving 0 is 0), the formal definition and intended purpose of the geometric mean don't align well with this interpretation.

Right. The complete definition [1–3] is $$\overline{x}_\text{geom}=\sqrt[n]{\prod_{i=1}^{n}x_i}=\sqrt[n]{x_1\cdot x_2\cdots x_n}\phantom{m}\color{Red}{\forall x_i\in\mathbb{R}^{+}}\tag{1}$$ In other words, the geometric mean must only be calculated for positive real numbers ($\small{x_i>0}$). A correct implementation in software should throw an error if the data contain zeros or negative values.
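A minimal sketch of such a strict implementation in R. The name `geomean_strict` is hypothetical (it is not part of any package), and dropping missing values by default is an assumption. Working on the log scale gives the same result as $\small{(1)}$ but avoids overflow of the product for long vectors.

```r
# Hypothetical helper (not from any package): geometric mean according to (1).
# It refuses non-positive values instead of returning 0 or dropping them.
geomean_strict <- function(x, na.rm = TRUE) {
  if (!is.numeric(x)) stop("'x' must be numeric")
  if (na.rm) x <- x[!is.na(x)]
  if (length(x) == 0L) stop("no data")
  if (any(is.na(x))) return(NA_real_)
  if (any(x <= 0)) {
    stop(sum(x <= 0), " non-positive value(s): geometric mean is not defined")
  }
  # exp(mean(log(x))) equals prod(x)^(1/n) but cannot overflow
  exp(mean(log(x)))
}

geomean_strict(c(1, 2, 4))             # 2
try(geomean_strict(c(1, 1, 1, 1, 0)))  # error: 1 non-positive value(s)
```

Whether to stop with an error or to return `NA` with a warning is a design choice; an error makes it impossible to overlook the problem in a script.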

❝ Imagine 2 datasets:

❝ c(1,1,1,1,0)

❝ c(0,0,0,0,1)

❝ Then both will have the same geometric mean value. It loses information about the other values and doesn't accurately reflect the "typical value" or "average rate" that the geometric mean aims to represent.

I agree (giving 0 for both sets) but only with the naïve calculation allowing zeros. However, according to $\small{(1)}$ it must not be calculated for either set. Even GEOMEAN() in Excel ‘knows’ that.


There is no function to calculate the geometric mean in base R, but there are functions in two packages: psych is happy with zeros (WTF?), whereas EnvStats is not. Try my homebrew:

    library(psych)    # contains geometric.mean()
    library(EnvStats) # contains geoMean()

    # strict: NA if there is any non-positive value
    gm <- function(x, print = TRUE) {
      x   <- x[!is.na(x)]
      pos <- sign(x) == 1
      msg <- NULL
      if (sum(!pos) >= 1) {
        if (sum(!pos) == 1) {
          msg <- "[1 non-positive value]"
        } else {
          msg <- paste0("[", sum(!pos), " non-positive values]")
        }
        res <- NA
      } else {
        res <- prod(x)^(1 / length(x))
      }
      if (print) {
        cat("gm()            :", res, msg, "\n")
      } else {
        return(res)
      }
    }

    # excludes negative values only (zeros are kept)
    gm1 <- function(x, print = TRUE) {
      x   <- x[!is.na(x)]
      neg <- sign(x) == -1
      msg <- NULL
      if (sum(neg) >= 1) {
        msg <- paste0("[", sum(neg), " negative(s) excluded]")
        x   <- x[!neg]
      }
      res <- prod(x)^(1 / length(x))
      if (print) {
        cat("gm1()           :", res, msg, "\n")
      } else {
        return(res)
      }
    }

    # excludes all non-positive values (zeros and negatives)
    gm2 <- function(x, print = TRUE) {
      x   <- x[!is.na(x)]
      pos <- sign(x) == 1
      msg <- NULL
      if (sum(!pos) >= 1) {
        if (sum(!pos) == 1) {
          msg <- "[1 non-positive excluded]"
        } else {
          msg <- paste0("[", sum(!pos), " non-positives excluded]")
        }
        x <- x[pos]
      }
      res <- prod(x)^(1 / length(x))
      if (print) {
        cat("gm2()           :", res, msg, "\n")
      } else {
        return(res)
      }
    }

    # wrapper for psych
    gm3 <- function(x, print = TRUE) {
      res <- geometric.mean(x)
      if (print) {
        cat("geometric.mean():", res, "\n")
      } else {
        return(res)
      }
    }

    # wrapper for EnvStats
    gm4 <- function(x, print = TRUE) {
      res <- geoMean(x)
      if (print) {
        cat("geoMean()       :", res, "\n")
      } else {
        return(res)
      }
    }

    # arithmetic mean for comparison
    am <- function(x, print = TRUE) {
      res <- mean(x, na.rm = TRUE) # was rm = TRUE, which mean() silently ignores
      if (print) {
        cat("mean()          :", res, "\n")
      } else {
        return(res)
      }
    }

    set1 <- c(1, 1, 1, 1, 0)
    set2 <- c(0, 0, 0, 0, 1)

    gm(set1); gm1(set1); gm3(set1); gm2(set1); gm4(set1); am(set1)
    # gm()            : NA [1 non-positive value]
    # gm1()           : 0
    # geometric.mean(): 0
    # gm2()           : 1 [1 non-positive excluded]
    # geoMean()       : NA
    # Warning message:
    # In geoMean(x) : Non-positive values in 'x'
    # mean()          : 0.8

    gm(set2); gm1(set2); gm3(set2); gm2(set2); gm4(set2); am(set2)
    # gm()            : NA [4 non-positive values]
    # gm1()           : 0
    # geometric.mean(): 0
    # gm2()           : 1 [4 non-positives excluded]
    # geoMean()       : NA
    # Warning message:
    # In geoMean(x) : Non-positive values in 'x'
    # mean()          : 0.2

Only function gm() and the function geoMean() of EnvStats get it right.

❝ Regarding geometric mean: kind of tricky taking into account the information above. The median is better, but why don't you want to follow the method mentioned in BE GL at the first place (arithmetic mean)? Yes, it is bad, but at least not worse than others :-D

Well, for log-normally distributed data the geometric mean is the maximum likelihood estimator of the median. I don’t care what’s stated in the GL. I prefer the median over the arithmetic mean given the expected distribution of concentrations. Not normal… :-D
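This can be checked by simulation. The sketch below assumes log-normally distributed data; `mu`, `sigma`, the sample size, and the seed are arbitrary illustrative choices, not values from any study. On the log scale the MLE of the location is the mean of $\small{\log(x)}$; back-transformed, that is the geometric mean, which estimates the population median $\small{\exp(\mu)}$, whereas the arithmetic mean estimates $\small{\exp(\mu+\sigma^2/2)}$.

```r
set.seed(123456)            # arbitrary seed, only for reproducibility
mu    <- log(10)            # assumed location on the log scale
sigma <- 0.5                # assumed scale on the log scale
x     <- rlnorm(1e5, meanlog = mu, sdlog = sigma)

gm_hat <- exp(mean(log(x))) # geometric mean = back-transformed MLE of mu
res    <- c(true.median   = exp(mu),
            geom.mean     = gm_hat,
            sample.median = median(x),
            arith.mean    = mean(x),
            true.mean     = exp(mu + sigma^2 / 2))
print(round(res, 3))
```

With a sample of this size the geometric mean and the sample median should both lie close to $\small{\exp(\mu)}$, while the arithmetic mean is pulled upwards by the right tail.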

I also agree with what you wrote about the loss of information. That is why I prefer the median, which keeps it: $\widetilde{x}\small{\left\{1,1,1,1,0\right\}=1}$ and $\widetilde{x}\small{\left\{0,0,0,0,1\right\}=0}$.
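For the two sets from the quote this is quickly checked in base R. The naïve product-based calculation (which violates $\small{(1)}$ and is shown only for comparison) cannot distinguish the sets, whereas the median can. The helper `naive` is a hypothetical name used only for this illustration.

```r
set1 <- c(1, 1, 1, 1, 0)
set2 <- c(0, 0, 0, 0, 1)

# naive calculation allowing zeros; not a valid geometric mean according to (1)
naive <- function(x) prod(x)^(1 / length(x))

res <- data.frame(set    = c("set1", "set2"),
                  naive  = c(naive(set1), naive(set2)),
                  median = c(median(set1), median(set2)),
                  mean   = c(mean(set1), mean(set2)))
print(res, row.names = FALSE)
```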

Note that the ICH M13A states:

Multiple baseline endogenous concentrations should be measured from each subject in the time period before administration of the study drug. The time-averaged baseline [… is] subtracted from post-dose concentrations for those subjects in an appropriate manner consistent with the PK properties of the drug. For the time-averaged method, either the mean or median value may be used.

(my emphasis)
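A sketch of the time-averaged method for a single subject, using the median of the pre-dose concentrations as the baseline. All concentrations are made-up illustrative values and the helper `baseline_correct` is hypothetical. Setting negative corrected concentrations to zero is an assumption of this sketch and should be checked against the applicable guideline.

```r
# Hypothetical helper: subtract the time-averaged baseline from post-dose data
baseline_correct <- function(pre, post, FUN = median) {
  base <- FUN(pre, na.rm = TRUE)      # time-averaged baseline (median or mean)
  corr <- post - base
  corr[!is.na(corr) & corr < 0] <- 0  # assumption: negatives are set to zero
  list(baseline = base, corrected = corr)
}

pre  <- c(4.8, 5.3, 5.0, 7.9)               # made-up pre-dose concentrations
post <- c(6.1, 12.4, 18.9, 11.2, 5.6, 4.7)  # made-up post-dose concentrations

baseline_correct(pre, post)              # median baseline: 5.15
baseline_correct(pre, post, FUN = mean)  # arithmetic mean baseline: 5.75
```

In this made-up example the single high pre-dose value (7.9) shifts the arithmetic mean more than the median, so more of the late post-dose concentrations end up at zero after correction with the mean.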

PS: I always used the median in my studies. Was never – ever – questioned by any agency.

1. Netz H. Formeln der Mathematik. München: Hanser; 6th ed. 1986. p. 18.
2. Nolan D, Speed T. Stat Labs. Mathematical Statistics Through Applications. New York: Springer; 2001. p. 68.
3. Sachs S, Hedderich J. Angewandte Statistik. Methodensammlung mit R. Berlin: Springer; 12th ed. 2006. p. 76.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
