To round or not to round… [Software]

posted by Helmut – Vienna, Austria, 2019-07-20 14:53 – Posting: # 20412

Hi ElMaestro,

❝ ❝ Haha, I know this game and therefore, always round before the comparison. ;-)

❝ or never bring yourself into a situation where you need to compare floats.

Did I say how many significant digits I round to? :-D
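A minimal sketch of rounding before the comparison. The 8 significant digits and the textbook pair 0.1 + 0.2 versus 0.3 are assumptions chosen for illustration only, not necessarily what anybody uses in practice:

```r
# Assumption: 8 significant digits, picked only for this example
s <- 8
a <- 0.1 + 0.2   # stored slightly above 0.3
b <- 0.3         # stored slightly below 0.3

a == b                          # FALSE: the binary representations differ
signif(a, s) == signif(b, s)    # TRUE: both round to the same double
```

Once both sides are rounded to the same number of significant digits, the leftover bits of the binary representation no longer decide the outcome.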

❝ I've taken the worst way out and written functions that check if x is similar to y plus/minus a wee fraction like 1e-8 or something.

I don’t know which kind of measurements you are comparing. Sumfink ultra-precise (according to the International Bureau of Weights and Measures: t ±5×10⁻¹⁶, l ±2.1×10⁻¹¹)?
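For reference, the plus/minus check described in the quote could look roughly like this in R. The helper name `near` and its default tolerance are hypothetical. `all.equal()` is the base-R counterpart, which for values like these works on relative rather than absolute differences:

```r
# Hypothetical helper: name and default tolerance are assumptions
near <- function(x, y, tol = 1e-8) abs(x - y) <= tol

a <- 0.1 + 0.2
b <- 0.3

near(a, b)                                  # TRUE
isTRUE(all.equal(a, b, tolerance = 1e-8))   # TRUE (base R)
near(3.141, pi)                             # FALSE: off by about 6e-4
```

`all.equal()` returns a character string describing the difference instead of FALSE, hence the `isTRUE()` wrapper.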

Zizou (I guess) and I were talking about concentrations. According to the guidelines, the limits for accuracy and precision (AP) are 20% at the LLOQ and 15% above. Hence, everything reported substantially beyond that is not relevant (shall I call it noise?). What “substantially” means depends on the field. Some people go crazy with 6σ, physicists are fine with just 3σ.

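x       <- pi              # the "true" value
p       <- 15              # precision (%), the limit above the LLOQ
s       <- 8               # significant digits kept for the output
mult    <- c(3, 6)         # 3 sigma and 6 sigma
sigma   <- x * p / 100     # one sigma in absolute terms
sigma.m <- sigma * mult    # the multiples of sigma
lo      <- x - sigma.m     # lower limits
hi      <- x + sigma.m     # upper limits
df      <- data.frame(x = rep(x, 2), prec = p,
                      sigma = sigma, mult = mult,
                      sigma.m = sigma.m,
                      lo = lo,  hi = hi)
df      <- signif(df, s)   # round all columns to s significant digits
names(df)[2] <- "prec. (%)"
print(df, row.names = FALSE)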

       x prec. (%)     sigma mult  sigma.m        lo       hi
3.141593        15 0.4712389    3 1.413717 1.7278760 4.555309
3.141593        15 0.4712389    6 2.827433 0.3141593 5.969026

Do you get the idea? The true value might be somewhere between our self-imposed limits. Therefore, I don’t give a shit whether I get 3.141 or 3.141 592 653 589 793 in an electronic file. However, I insist on a CRC-checksum² to verify the data-transfer. If I deal with bloody Excel, I round to one decimal beyond what is given in the analytical report³, being aware that this is far beyond the analytical AP. If the data-transfer of analytical results to stats was done electronically in “full numeric precision” (haha), I want to see validation for it.⁴
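A rough sketch of what rounding “one decimal beyond the report” and checking against the report could look like. All values, the variable names, and the two decimals of the report are hypothetical. The agreement check compares formatted strings, which sidesteps comparing floats altogether:

```r
# Hypothetical data: concentrations as given in the analytical report
# (2 decimals) and as received electronically
reported   <- c(1.23, 45.68, 102.35)
electronic <- c(1.2345678, 45.6765432, 102.3512345)

dec  <- 2                            # decimals in the analytical report
kept <- round(electronic, dec + 1)   # one decimal beyond the report

# Format both to the report's decimals and compare the strings
fmt   <- paste0("%.", dec, "f")
agree <- sprintf(fmt, electronic) == sprintf(fmt, reported)
print(data.frame(reported, electronic, kept, agree))
```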

❝ It is ugly and clumsy, it works, and it feels every time like I am suffering defeat.

Well, you are the C-man here. What about printf("%.yg\n", x); where y is the desired number of significant digits? With R’s signif():

options("digits" = 15)
x    <- 12345678.999999
y    <-        0.12345678999999
prec <- 8
fmt1 <- "%33.17f"
fmt2 <- paste0("%.", prec, "g")
cat(x, "\n",
    y, "\n",
    sprintf(fmt1, x), "\u2190 fake news\n",
    sprintf(fmt1, y), "\u2190 fake news\n",
    sprintf(fmt1, signif(x, prec)), "\u2190", prec, "significant digits\n",
    sprintf(fmt1, signif(y, prec)), "\u2190", prec, "significant digits\n",
    sprintf(fmt2, x), "\u2190", "directly with", paste0("'", fmt2, "'\n"),
    sprintf(fmt2, y), "\u2190", "directly with", paste0("'", fmt2, "'\n"))

12345678.999999
 0.12345678999999
        12345678.99999899975955486 ← fake news
               0.12345678999999000 ← fake news
        12345679.00000000000000000 ← 8 significant digits
               0.12345679000000000 ← 8 significant digits
12345679 ← directly with '%.8g'
0.12345679 ← directly with '%.8g'

If you are interested whether rubbish in ≈ rubbish out, ask for a checksum and verify it. Probably better than diving into the murky waters of (likely irrelevant) rounding.
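A minimal sketch of such a verification, using only `tools::md5sum()`, which ships with R. The temporary files and the copy step are stand-ins for a real data transfer, and the data are made up. For SHA-256 an add-on package such as digest would be needed (an assumption about your setup, not shown here):

```r
# Hypothetical files; file.copy() stands in for the data transfer
src <- tempfile(fileext = ".csv")
dst <- tempfile(fileext = ".csv")
write.csv(data.frame(subject = 1:3, conc = c(1.23, 45.68, 102.35)),
          src, row.names = FALSE)
file.copy(src, dst)

sent     <- unname(tools::md5sum(src))   # hash stated by the sender
received <- unname(tools::md5sum(dst))   # hash computed by the recipient
identical(sent, received)                # TRUE: files are bit-identical

# Simulate a corrupted transfer: any change alters the hash
cat("oops\n", file = dst, append = TRUE)
identical(sent, unname(tools::md5sum(dst)))   # FALSE
```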

  1. I would be fine with just 3.1 for π. The RE is –1.32%, again much below what is achievable in bioanalytics.
  2. SHA-256 or higher preferred (collisions reported for SHA-1, i.e., different inputs give the same hash). MD5 is better than nothing.
  3. That’s the only GxP-compliant (dated/signed, released by QUA) document, right? Did you ever see result-tables with 15 significant digits?
  4. Here it gets tricky. These data are different to what is given in the analytical report. Now what? At least I expect a statement about this discrepancy in the protocols (analytical and/or statistical). Regularly I see something like “calculations were performed in full numeric precision”. How could one ever hope to verify that having only the analytical report with rounded results?
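The relative error in footnote 1 is easy to check. The helper name `RE` is only a label for this sketch:

```r
# Relative error (%) of an approximation relative to pi
RE <- function(approx) 100 * (approx - pi) / pi

round(RE(3.1), 2)     # -1.32
round(RE(3.141), 4)   # -0.0189
```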

Dif-tor heh smusma 🖖🏼 Long live Ukraine!
Helmut Schütz

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz