ElMaestro ★★★ Denmark, 2019-07-18 15:43 (1966 d 11:20 ago) Posting: # 20388 Views: 24,585 |
|
Hi all, does anyone have recent examples of spreadsheet functions truly misbehaving with real-life realistic data? I am not talking about for example the cases of standard deviations or variances that went funky when the application was fed extreme data ranges, or similar waterboarding of software, which was debated 10 years ago and which led regulators to conclude they were the only ones who could be trusted to make proper use of such pernicious creations. Realistic. Data. Only. Please. I will leave it to the reader to quantitatively define what "realistic" and "recent" actually mean and I reserve the right to challenge the definitions, obviously Of course I ask because I never thought spreadsheets were or are as bad as their reputation. The users, on the other hand, are every bit as bad as their reputation. This certainly includes myself. Most likely I am the worst. But that will be the topic of another thread — Pass or fail! ElMaestro |
Ohlbe ★★★ France, 2019-07-18 16:32 (1966 d 10:32 ago) @ ElMaestro Posting: # 20389 Views: 22,648 |
|
Dear ElMaestro, ❝ does anyone have recent examples of spreadsheet functions truly misbehaving with real-life realistic data? Nope. I have examples of users using the wrong functions, or unlocked spreadsheets getting modified at some point, but that's about all. The problem is most often somewhere between the chair and the keyboard: wetware rather than software. Well, there are some "features" of Excel you need to be aware of when using it for stats, of course. Already mentioned on the forum. ❝ [...] which led regulators to conclude they were the only ones who could be trusted to make proper use of such pernicious creations. Maybe their coding skills are not any better than mine*, and they consider that they will always trust whatever they do with a spreadsheet more than anything they might obtain with R ? * sharing with the forum some experience already discussed by email with ElMaestro a couple of years ago: I installed R on my home computer. With the help of some "R for R-seholes" tutorials, it took me something like 4 hours to manage to open a simple set of data and make a simple plot out of it . I spent a few more hours trying to have the colour of the plot change based on simple conditions (red if under a threshold value, green above it). I was even waking up at night with ideas on new ways to test. I never succeeded. Never used R again. I would never trust any of my results anyway. — Regards Ohlbe |
Helmut ★★★ Vienna, Austria, 2019-07-18 18:12 (1966 d 08:52 ago) @ Ohlbe Posting: # 20390 Views: 22,810 |
|
Dear Ohlbe ❝ ❝ [...] which led regulators to conclude they were the only ones who could be trusted to make proper use of such pernicious creations. ❝ ❝ Maybe their coding skills are not any better than mine*, and they consider that they will always trust whatever they do with a spreadsheet more than anything they might obtain with R ? There is a series of papers about indirect adjusted comparisons in BE published by regulators. All done in Excel. ❝ * […] I spent a few more hours trying to have the colour of the plot change based on simple conditions (red if under a threshold value, green above it). I was even waking up at night with ideas on new ways to test. I never succeeded. I believe it. is a nasty beast like SAS (© Detlew). Waking up in the middle of the night or – worse – not being able to fall asleep at all is a common side-effect. Since you will never use again, skip this code:
More fancy stuff wanted?
Semitransparent colors in Excel? I don’t think that’s possible. — Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
Ohlbe ★★★ France, 2019-07-18 19:30 (1966 d 07:34 ago) @ Helmut Posting: # 20391 Views: 22,503 |
|
Dear Helmut, ❝ Since you will never use again, skip this code Thx... I made multiple attempts with "if" and conditions, and got multiple different error messages. ElMaestro gave me some hints, resulting each time in a "oh yes of course" reaction and more error messages... I probably misinterpreted the hints. — Regards Ohlbe |
Helmut ★★★ Vienna, Austria, 2019-07-18 22:26 (1966 d 04:37 ago) @ Ohlbe Posting: # 20396 Views: 22,571 |
|
Dear Ohlbe, ❝ I made multiple attempts with "if" and conditions, and got multiple different error messages. ElMaestro gave me some hints, resulting each time in a "oh yes of course" reaction and more error messages... I probably misinterpreted the hints. Using if() , its relatives, and loop-constructs (for(), while(), repeat() ) in many cases are prone to errors – as you experienced – and can be slooow. If you have a vector of data (say y ) there a various ways to access its values. Example with 25 random integers in the range 1…40:
which() is an overkill, though one can include many conditions which [sic] makes the code more easy to understand. If you have two vectors (say x , y ) like in my last post, you can select values of x depending on which(y = condition) .Let’s check:
What about speed?
fun2() shows its strengths.If one has more conditions any external construct will suck. We have 100 random integers (1…50) and want to get the even ones between 20 and 30 in increasing order.
if() { do this } else { do that } elseif { oops } .— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
Ohlbe ★★★ France, 2019-07-19 13:32 (1965 d 13:32 ago) @ Helmut Posting: # 20397 Views: 22,368 |
|
Dear Helmut, ❝ Using ❝ Good luck coding that with a loop and a nested I think that's really what pissed me off. I thought something like if... else... would be very logical and easy. Maybe because these are real words from a language I can understand. But it looks like R and I are following different kinds of logics. When you learn a new language and start speaking it with somebody, that person will usually show some goodwill, ignore grammatical mistakes and try and understand what you mean. R made no efforts of any kind, even though what I was trying to achieve was pretty obvious ❝ If you have a vector of data (say Yeah, ElMaestro started using this kind of vocabulary too when we were exchanging email. That got me even more confused. All I learnt at school about vectors is the geometric side of the concept - what Wikipedia apparently calls Euclidian vector. I had some problems understanding what an arrow between two points had to do with what I was trying to achieve. Sorry ElMaestro, we're going quite far from your original question... — Regards Ohlbe |
ElMaestro ★★★ Denmark, 2019-07-19 14:32 (1965 d 12:32 ago) @ Ohlbe Posting: # 20399 Views: 22,889 |
|
Hi Ohlbe, ❝ Sorry ElMaestro, we're going quite far from your original question... That's OK. The occasional drift on this forum always pleases me. As for the logic, it is absolutely !7==!3 what you are saying, programming involves that kind of logic which is entirely bound to ones and zeros and that can be hard to grasp when the reference is our everyday language.If it is any comfort, I can confess I usually can't read Hötzi's code right away. I usually use a lot of time deciphering it, or giving up. This is because syntactically it is condensed. I tend to use much longer lines and bigger functions to achieve the same goals. I can't maintain code otherwise, but this is just me. Vector: Something that carries something, according to Birdsong's Internally Hosted Vocabulary (the platform is walnut-sized). The object of this carriage may be information, like "two steps in this direction and then five steps in the perpendicular direction". Or "Apple and thereafter Syphilis". The origin of the word is from the Illiad by Homer, song 13, in which Thetis says: "Vector bears it in triumph upon his own shoulders.", possibly expressed as such because he was in love?! Or at least this is my belief since it was said in relation to some amour. — Pass or fail! ElMaestro |
Ohlbe ★★★ France, 2019-07-19 15:38 (1965 d 11:25 ago) @ ElMaestro Posting: # 20402 Views: 22,370 |
|
Hi ElMaestro, ❝ The origin of the word is from the Illiad by Homer, song 13, in which Thetis says: "Vector bears it in triumph upon his own shoulders.", possibly expressed as such because he was in love?! Or at least this is my belief since it was said in relation to some amour. But if I understand the text well, this is only applicable to Trojan virus coding... — Regards Ohlbe |
Helmut ★★★ Vienna, Austria, 2019-07-19 15:00 (1965 d 12:04 ago) @ Ohlbe Posting: # 20401 Views: 22,352 |
|
Dear Ohlbe, ❝ I thought something like if... else... would be very logical and easy. Maybe because these are real words from a language I can understand. Perfectly understandable. Of course, these constructs exist in R and we use them all the time. The point is only that direct access of values (with a condition) is simply more efficient. If you don’t have to deal with large data sets, both are fine. Since I deal a lot with simulations (1 mio BE studies with a sometimes high number of subjects / periods / sequences, …) I learned the hard way to avoid the ‘simple’ constructs if ever possible (given, sometimes I know that it could be done but was too stupid). Think about my previous example. For 1 mio calls the direct access is 15times faster than the loop and if() within. It’s not unusual to have nested calls. For two it would by ~230times slower and for three already ~3,500times. Would test your patience.❝ But it looks like R and I are following different kinds of logics. Not necessarily so. See above. ❝ When you learn a new language and start speaking it with somebody, that person will usually show some goodwill, ignore grammatical mistakes and try and understand what you mean. R made no efforts of any kind, even though what I was trying to achieve was pretty obvious I agree. See the subject line. The man-pages of ’s functions were written by statisticians, regularly forgetting non-expert users. Sometimes the response you get is bizarre. Want to know how for() works?
❝ ❝ If you have a vector of data (say ❝ ❝ Yeah, ElMaestro started using this kind of vocabulary too when we were exchanging email. […] All I learnt at school about vectors is the geometric side of the concept - what Wikipedia apparently calls Euclidian vector. Correct. ❝ I had some problems understanding what an arrow between two points had to do with what I was trying to achieve. Absolutely nothing, of course. In a one-dimensional array is meant. The indexing (i.e., where it starts) differs between languages.
What makes so powerful is yet another data type, namely the list. A list can contain any of the other types and a mixture of them (even other lists…).
Elements can be accessed by their name or index.
The double square brackets are mandatory to access the root element of lists. Hence, mylist[4][2, 2] does not work:Error in mylist[4][2, 2] : incorrect number of dimensions — Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
Shuanghe ★★ Spain, 2019-07-18 20:59 (1966 d 06:05 ago) @ Helmut Posting: # 20392 Views: 22,589 |
|
Dear all ❝ There is a series of papers about indirect adjusted comparisons in BE published by regulators. All done in Excel. As far as I can tell, Excel and R give the same result. I was interested in the method so several weeks back I played with R using data from EPAR (none of the articles gives sample size of the BE study so without going to the EPAR for this additional information you can not reproduce/validate their results at all) and the difference in some comparison is 0.01% due to different rounding method used in Excel and R. However, as Ohlbe said, there are some errors in the articles such as in the latest article (link), table 2 gives wrong data (I compared the original data from EPAR and it's clearly a mistake in article) but the result of the comparison using that data is correct, so this might just a typo. However, some comparisons were reversed, e.g., Table 3 describs generic 2 vs generic 4, and generic 3 vs generic 4, but in fact it were generic 4 vs 3 and 2, ... Apparently, someone typed the data in the wrong cell in excel so confidence interval were all reversed. ❝ Semitransparent colors in Excel? I don’t think that’s possible. Haha, this is where you err... Actually you can do that in Excel (not that you should). At least I can in my 2013. It's ugly, but semi-transparent it is. — All the best, Shuanghe |
Helmut ★★★ Vienna, Austria, 2019-07-18 22:04 (1966 d 05:00 ago) @ Shuanghe Posting: # 20395 Views: 22,547 |
|
Hi Shuanghe, ❝ I was interested in the method so several weeks back I played with R using data from EPAR (none of the articles gives sample size of the BE study so without going to the EPAR for this additional information you can not reproduce/validate their results at all) … Yep. I wrote a package for this stuff. I think they were using Excel cause (a) everybody has it (b) of its flat learning curve and (c) – most important – Luther Gwaza* gave his “template” to the others. ❝ … and the difference in some comparison is 0.01% due to different rounding method used in Excel and R. If you want to reproduce Excel’s wacky “commercial rounding”:
❝ ❝ Semitransparent colors in Excel? I don’t think that’s possible. ❝ ❝ Haha, this is where you err... Actually you can do that in Excel (not that you should). At least I can in my 2013. It's ugly, but semi-transparent it is. Well, my version is Excel 2000. BTW, navigate up to my OP and hit to clear the browser’s cache / reload. Is it possible in Excel to have the color dependent on the value (the lower, the darker)?
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
Shuanghe ★★ Spain, 2019-07-19 14:41 (1965 d 12:23 ago) @ Helmut Posting: # 20400 Views: 22,383 |
|
Hi Helmut, ❝ Yep. I wrote a package for this stuff. Well, you're always several steps ahead of me. I just wrote some lousy function that serves the educational purpose (for myself). My output is for 4 methods only (Z-distribution, homo- and hetero-scedastic and pragmatic) since I'm still struggling with Chow and Shao/Liu's methods. By the way, I didn't see the package in your Github page. Any plan to put it there so we can have a peek? ❝ If you want to reproduce Excel’s wacky “commercial rounding”: ❝ ... I doubt I'll very use it but nice to know that it can be done. ❝ Well, my version is Excel 2000. BTW, navigate up to my OP and hit to clear the browser’s cache / reload. Is it possible in Excel to have the color dependent on the value (the lower, the darker)? No way. Now you just want to piss off those Excel lovers ❝
Thanks. I have many articles but I didn't know about the thesis. I'll take a look at it later. ElMaestro, apology for the off-topic posting. — All the best, Shuanghe |
Helmut ★★★ Vienna, Austria, 2019-07-19 16:29 (1965 d 10:34 ago) @ Shuanghe Posting: # 20403 Views: 22,674 |
|
Hi Shuanghe, ❝ ❝ Yep. I wrote a package for this stuff. ❝ Well, you're always several steps ahead of me. Nope. I started on July 11th. ❝ My output is for 4 methods only (Z-distribution, homo- and hetero-scedastic and pragmatic) … Mine only for homo- and heteroscedastic variances (the latter is the package’s default). I think that relying on large sample approximations doesn’t make sense at all. The CI will always be narrower than with the other approaches but why would one go this way? Approximations of degrees of freedom for unequal variances were published in 1946/47 (Satterthwaite, Welch), the t-distribution in 1908 (Gosset a.k.a. Student). Going back more than two than two centuries (Laplace, Poisson, Gauß)? I don’t get it. BTW, I don’t like the pragmatic method. ❝ … since I'm still struggling with Chow and Shao/Liu's methods. You need the raw data, right? ❝ […] I didn't see the package in your Github page. Any plan to put it there so we can have a peek? Paid job – chained by a CDA. If we will publish sumfink, sure. Not my decision. ❝ ❝ If you want to reproduce Excel’s wacky “commercial rounding”: ❝ ❝ ... ❝ I doubt I'll very use it but nice to know that it can be done. I would never ever use it. However, if you want to reproduce sumfink which was rounded in Excel you can only experiment till you get the same in R. ❝ ❝ […] navigate up to my OP and hit to clear the browser’s cache / reload. Is it possible in Excel? ❝ ❝ No way. Now you just want to piss off those Excel lovers Yessir! R is a language and environment for statistical computing and graphics. — Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
nobody nothing 2019-07-19 17:53 (1965 d 09:11 ago) @ Helmut Posting: # 20404 Views: 22,296 |
|
....reaaaaalllly boring in Vienna these days, huh? — Kindest regards, nobody |
Helmut ★★★ Vienna, Austria, 2019-07-19 21:32 (1965 d 05:32 ago) @ nobody Posting: # 20406 Views: 22,326 |
|
Hi nobody, ❝ ....reaaaaalllly boring in Vienna these days, huh? OT about is a welcomed excuse. All paid jobs absorb and degrade the mind. Aristotle — Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
zizou ★ Plzeň, Czech Republic, 2019-07-20 02:18 (1965 d 00:46 ago) @ Ohlbe Posting: # 20408 Views: 22,221 |
|
Hello everybody and nobody. ❝ Nope. I have examples of users using the wrong functions, or unlocked spreadsheets getting modified at some point, but that's about all. The problem is most often somewhere between the chair and the keyboard: wetware rather than software. ❝ ❝ Well, there are some "features" of Excel you need to be aware of when using it for stats, of course. Already mentioned on the forum. Recently, because of "floating point error" I failed to use spreadsheet for comparisons of differences of two similar values against certain value (my source data were almost all with two decimal places, but some of them were calculated with "full precision" and I wanted to easily check how many of differences are lower than 0.05, unfortunaly there was a case with the error). See following examples:
Nevertheless it is not problem of spreadsheet itself, but only precision issue. In R:
Everyone knows, only smart machines don't. Best regards, zizou |
Helmut ★★★ Vienna, Austria, 2019-07-20 04:12 (1964 d 22:52 ago) @ zizou Posting: # 20409 Views: 22,351 |
|
Hi zizou, ❝ Recently, because of "floating point error" I failed to use spreadsheet for comparisons of differences of two similar values against certain value (my source data were almost all with two decimal places, but some of them were calculated with "full precision" and I wanted to easily check how many of differences are lower than 0.05, unfortunaly there was a case with the error). Haha, I know this game and therefore, always round before the comparison. ❝ Everyone knows, only smart machines don't. Can’t be blamed. That’s a property of floating point arithmetic (see also IEEE 754). Unless you have powers of 2 (dec→bin: 1→1, 2→10, 4→100, 8→1000, …) you will always have precision issues in the last significant digits. Everything beyond the 15th (dec-) significant figure is just noise – even with 64bit software on a 64bit OS. Try this in Excel: Cells A1–A9: 0, 1, … , 8 Cells B1–B9: =2^A1–A9 Cells C2–B9: =B2-B1, B3-B1, … Format to 16 decimal places. What do you get? Little bit deeper into :
GNU Octave 5.1.1.0 (all calculations internally in 64bit double):
Not even software for symbolic mathematics does better. Maxima 5.35.12.1 / wxMaxima 14.12.1:
Only for neds: Rmpfr. Doesn’t help.
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
ElMaestro ★★★ Denmark, 2019-07-20 11:08 (1964 d 15:56 ago) @ Helmut Posting: # 20411 Views: 22,131 |
|
Hi Helmut, ❝ Haha, I know this game and therefore, always round before the comparison. or never bring yourself into a situation where you need to compare floats. I've taken the worst way out and written functions that check if x is simar to y plus/minus a wee faction like 1e-8 or something. It is ugly and clumsy, it works, and it feel everytime like I am suffering defeat. "John Brown's got a little Endian" — Pass or fail! ElMaestro |
Helmut ★★★ Vienna, Austria, 2019-07-20 14:53 (1964 d 12:10 ago) @ ElMaestro Posting: # 20412 Views: 22,652 |
|
Hi ElMaestro, ❝ ❝ Haha, I know this game and therefore, always round before the comparison. ❝ ❝ or never bring yourself into a situation where you need to compare floats. Did I tell to how many significant digits I round? ❝ I've taken the worst way out and written functions that check if x is simar to y plus/minus a wee faction like 1e-8 or something. I don’t know which kind of measurements you are comparing. Sumfink ultra-precise (according to the International Bureau of Weights and Measures: t ±5×10–16, l ±2.1×10–11)? Zizou (I guess) and I were talking about concentrations. AP according to the GLs 20% at the LLOQ and 15% above. Hence, everything reported substantially beyond that is not relevant (shall I call it noise?). What substantially means, depends on the field. Some people go crazy with 6σ, physicists are fine with just 3σ.
❝ It is ugly and clumsy, it works, and it feel everytime like I am suffering defeat. Well, you are the C-man here. What about printf("%.yg\n", x); where y is the desired number of significant digits? With ’s signif() :
If you are interested whether rubbish in ≈ rubbish out, ask for a checksum and verify it. Probably better than diving into the murky waters of (likely irrelevant) rounding.
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
ElMaestro ★★★ Denmark, 2019-07-20 21:59 (1964 d 05:04 ago) @ Helmut Posting: # 20415 Views: 22,484 |
|
Hi Hötzi, ❝ I don’t know which kind of measurements you are comparing. It stinks but there really is no good solution when you have to compare.
(294.1/1000) == 0.2941 One can read ISO standards and manuals about internal float representation (I tried andfell asleep), but at the end of the day if you need to compare you need to compare and hope for the best. Some numbers just cannot be represented accurately in binary. If you accept an xyz fraction of error on the comparison then the solution is straightforward, like Match = function (x,y, relative.tol) Ugly as hell , and will not work well if you get into extreme binary representations. — Pass or fail! ElMaestro |
mittyri ★★ Russia, 2019-07-21 00:43 (1964 d 02:21 ago) @ ElMaestro Posting: # 20416 Views: 22,442 |
|
Hi ElMaestro, ❝ ❝ ❝ Ugly as hell , and will not work well if you get into extreme binary representations. take a look at if (Abs(x - y) <= absTol * Max(1.0f, Abs(x), Abs(y))) the description is here One may consider ULP, pros and cons described here — Kind regards, Mittyri |
Ohlbe ★★★ France, 2020-12-10 19:43 (1455 d 06:21 ago) @ ElMaestro Posting: # 22125 Views: 17,282 |
|
Dear ElMaestro, ❝ does anyone have recent examples of spreadsheet functions truly misbehaving with real-life realistic data? A colleague recently pointed to a strange behaviour in Excel (I have 2013, he has 2016, both give the same result here. Type =0.5-0.4-0.1: you'll get the expected result, 0. Now type =(0.5-0.4-0.1): you'll get -2,77556E-17. What ? Ain't that supposed to be the same calculation? Trying =0=0.5-0.4-0.1: the answer I got was FALSE. WTF, if the answer you get is 0? It's not a matter of number of decimals displayed: I tried to add more to the first result, or to switch to scientific notation, but still got 0. — Regards Ohlbe |
Helmut ★★★ Vienna, Austria, 2020-12-10 20:12 (1455 d 05:52 ago) @ Ohlbe Posting: # 22126 Views: 17,275 |
|
Dear Ohlbe, ❝ A colleague recently pointed to a strange behaviour in Excel (I have 2013, he has 2016, both give the same result here. And I have 2000 (don’t laugh). ❝ Type =0.5-0.4-0.1: you'll get the expected result, 0. Yep. ❝ Now type =(0.5-0.4-0.1): you'll get -2,77556E-17. What ? Ain't that supposed to be the same calculation? In principle, yes. Unless we could inspect the source code by M$, difficult to tell what happens with the brackets. ❝ Trying =0=0.5-0.4-0.1: the answer I got was FALSE. WTF, if the answer you get is 0? It's not a matter of number of decimals displayed: I tried to add more to the first result, or to switch to scientific notation, but still got 0. OK, write in =0.5-0.4-0.1 and in =A1=0
Try in =16-8-4-2-2 , in =(16-8-4-2-2) , and in =A1=B1 . I bet it “works” as expected, since these numbers can be converted to binary digits without error.What about ?
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
ElMaestro ★★★ Denmark, 2020-12-10 20:40 (1455 d 05:24 ago) @ Helmut Posting: # 22127 Views: 17,280 |
|
Hi Ohlbe, ❝ Type =0.5-0.4-0.1: you'll get the expected result, 0. ❝ Now type =(0.5-0.4-0.1): you'll get -2,77556E-17. What ? Ain't that supposed to be the same calculation? Yes it is (or may be) to you and to me, but that isn't how an electronic brain works. ❝ Trying =0=0.5-0.4-0.1: the answer I got was FALSE. WTF, if the answer you get is 0 ? It's not a matter of number of decimals displayed: I tried to add more to the first result, or to switch to scientific notation, but still got 0. In the world of 1's and 0's you cannot represent all decimal numbers that are easily displayed in our common number format using digits of 0 to 9 ("decimal"). Your little equations exemplifies it wonderfully; it is not per se an Excel issue. Here's R: 0==(0.5-0.4-0.1) You can totally represent .5 (a half) in binary: It is actually .1 because it is 2 raised to the power of -1. .11 in binary would be 2 raised to the power of -1 plus 2 raised to the power of -2. And so on. But things are tricky for a figure like 0.1 (one tenth) - in binary you'd probably represent it as .00011001100110011001100110011001100110011001100110011... where the 0011 blocks repeat indefinitely. In computers are data types have a maximum storage space, like 4 bytes or 8 bytes. So somewhere along the repeating 0011 blocks the computer will truncate the figure so it fits in the memory. And that is the source of your rounding issue. Computer scientists know never to compare fractional numbers uncritically. It stinks. The problem is not Excel, but that we have invested computers that use binary when at the same time we like to think of numbers represented with the usual 10 digits ("decimal"). — Pass or fail! ElMaestro |
Ohlbe ★★★ France, 2020-12-10 21:18 (1455 d 04:46 ago) @ ElMaestro Posting: # 22129 Views: 17,288 |
|
Hi ElMaestro, ❝ But things are tricky for a figure like 0.1 (one tenth) - in binary you'd probably represent it as .00011001100110011001100110011001100110011001100110011... where the 0011 blocks repeat indefinitely. ❝ In computers are data types have a maximum storage space, like 4 bytes or 8 bytes. So somewhere along the repeating 0011 blocks the computer will truncate the figure so it fits in the memory. And that is the source of your rounding issue. Mmmm. But then if the problem comes from the nasty 0.1, how comes that if I type =(0.8-0.4-0.2-0.1-0.1) I get a nice 0, without any funky E-17 ? And then if I type =0=(0.8-0.4-0.2-0.1-0.1) , or =0=0.8-0.4-0.2-0.1-0.1 I get a nice TRUE ?— Regards Ohlbe |
ElMaestro ★★★ Denmark, 2020-12-10 22:38 (1455 d 03:26 ago) @ Ohlbe Posting: # 22130 Views: 17,273 |
|
Hi Ohlbe, ❝ Mmmm. But then if the problem comes from the nasty 0.1, how comes that if I type Impossible to tell if this is an effect from Excel or from some degree of streamlining within the CPU. It is fairly trivial (I think) at the CPU level as well as in Excel to check if you are handing objects of the type (x-x) and to substitute those with a zero where they occur. — Pass or fail! ElMaestro |
Ohlbe ★★★ France, 2020-12-10 22:46 (1455 d 03:18 ago) @ ElMaestro Posting: # 22131 Views: 17,257 |
|
Hi ElMaestro, ❝ Impossible to tell if this is an effect from Excel or from some degree of streamlining within the CPU. Same behaviour in :
— Regards Ohlbe |
ElMaestro ★★★ Denmark, 2020-12-10 23:05 (1455 d 02:59 ago) @ Ohlbe Posting: # 22132 Views: 17,234 |
|
Hi Ohlbe, 0==(0.8-0.4-0.2-2*0.1) We (as in us humans) should just switch to binary and we'd never have these discussions. We can still keep the metric system but to keep things easy we may wish to define a millimeter as 1/1024 of a meter, if we wish to impress someone by our ability to do division and multiplication in our heads In a nutshell, try:
a=.11111111111111111111111111111111111111111111111111111111 Edit: two posts merged. [Ohlbe] — Pass or fail! ElMaestro |
Ohlbe ★★★ France, 2020-12-11 00:35 (1455 d 01:28 ago) @ ElMaestro Posting: # 22135 Views: 17,223 |
|
❝ In a nutshell, try: ❝ a=.11111111111111111111111111111111111111111111111111111111 ❝ b=.111111111111111111111111111111111111111111111111111111111 ❝ (a-b)==0 ElMaestro, why do you do this to me? You know I have a natural distrust for many things and I see evil everywhere. How much did my shrink pay you, so that I would continue visiting him for another 10 years? Update: just called my shrink, as I couldn't fall asleep. He reminded me that I had already seen this kind of stuff on this forum before, and had survived. He asked me whether I could think of a single situation where I would require this kind of precision in any calculation, else this could in some way affect my life, whether personal or professional. I had to admit that no. So we agreed he could go back to sleep and would send his bill in the morning. He's a good shrink. A trifle expensive, but good. — Regards Ohlbe |
ElMaestro ★★★ Denmark, 2020-12-11 08:59 (1454 d 17:05 ago) @ Ohlbe Posting: # 22139 Views: 17,175 |
|
Hi Ohlbe, ❝ He's a good shrink. A trifle expensive, but good. Call me next time. I will say "Just forget about it, my friend" and it will only cost you, shall we say, 2700 euros per consultation? Cash payment preferred, but only if the latter been adequately laundered and sanitized (in that order). — Pass or fail! ElMaestro |
Ohlbe ★★★ France, 2020-12-10 21:13 (1455 d 04:51 ago) @ Helmut Posting: # 22128 Views: 17,588 |
|
Dear Helmut, ❝ OK, write in ❝ 1. What do you expect? I expect to read 0. Regarding : I would naturally expect it to return TRUE. But on the other hand, typing =0=0.5-0.4-0.1 returns FALSE, and I would not necessarily expect a two-step calculation to yield a result that differs from a one-step calculation. By the way I checked: =0=5-4-1 returns TRUE, so it is not that Excel just compares 0 to 0.5 and ignores the subsequent signs and figures. ❝ 2. What do you get? returns 0, as expected. Switching to scientific notation gives 0,00E+00. So it doesn't look like we have some hidden decimals there that don't show because of the default display. returns TRUE. Indeed the result I would normally expect. But this means that typing the comparison and the calculation in one cell, or the calculation in one cell and the comparison in another, gives different results. ❝ Try in Yes. But it works also with =5-4-1 , in =(5-4-1) , and in =A1=B1 . So it works with 5, which is not a power of 2?— Regards Ohlbe |
Ohlbe ★★★ France, 2020-12-10 23:11 (1455 d 02:52 ago) @ Ohlbe Posting: # 22133 Views: 17,272 |
|
Hi fellows, Sorry guys, but there are some things going back behind the scene, which I really fail to understand. If I get -2,77556E-17 when I type =(0.5-0.4-0.1) , how comes that I get 0 when I type =(-0.1-0.4+0.5) ? The same figures, just in a different order?If I get FALSE when I type =0=0.5-0.4-0.1 , how comes that I get TRUE when I type =0=-0.4-0.1+0.5 ?And to be fair to Excel, the same happens in :
Are they first calculating -0.4-0.1=-0.5, which is nicely binarily coded, then -0.5+0.5=0? — Regards Ohlbe |
mittyri ★★ Russia, 2020-12-11 01:22 (1455 d 00:42 ago) @ Ohlbe Posting: # 22136 Views: 17,218 |
|
Dear Ohlbe, ❝ If I get -2,77556E-17 when I type The order matters when you go to floating point arithmetic. ❝ If I get FALSE when I type Rightyright. Why did it happen? The root cause is described here Excel does do some additional operations under the hood. ❝ And to be fair to Excel, the same happens in : ❝ ❝
❝ [1] FALSE ❝ ❝ [1] TRUE ❝ Are they first calculating -0.4-0.1=-0.5, which is nicely binarily coded, then -0.5+0.5=0? Yes, take a look at this MS article Both Excel and R are following IEEE 754, not surprising that the conditions are giving the same results. The problem is only in values representation to the user. — Kind regards, Mittyri |
Helmut ★★★ Vienna, Austria, 2020-12-11 01:36 (1455 d 00:27 ago) @ Ohlbe Posting: # 22137 Views: 17,226 |
|
Hi Ohlbe, ❝ … the same happens in : So you installed this goody at last? ❝
❝ [1] FALSE ❝ ❝ [1] TRUE ❝ Are they first calculating -0.4-0.1=-0.5, which is nicely binarily coded, then -0.5+0.5=0? Would have to dive into the lexer of the source of base-. Likely not clever enough anyway. If you want to explore the matter, type ?Syntax , ?Arithmetic , ?Comparison , and their relatives. See also the (in)famous FAQ 7.31. However, the `==` operator should not be used to compare two objects for identity. Otherwise we fall into the binary trap:
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
PharmCat ★ Russia, 2020-12-18 21:53 (1447 d 04:11 ago) @ Helmut Posting: # 22150 Views: 16,913 |
|
Hi all, This is very trivial story... let's take a = 0.1 next float is: julia> nextfloat(0.1) so julia> nextfloat(0.1) - 0.1 float numbers is discrete... you can't use 0.10000000000000001 ... you can use rational numbers: julia> 5//10 - 4//10 - 1//10 == -4//10 - 1//10 + 5//10 or some packages for working with big decimals julia> using Decimals I think in R it can be done... and never use Excel for cacl... edit: or you can do float(big(1//10)) if you just need more precision...edit2: ❝ which is just “noise” you should be very careful with eps() because this is not an indicator of accuracy you can use something like: a = 1e-70 end it will work fine, problems begans whe you do somesing like: 2.0 + 1e-70 , but you can still use 2.0 * 1e-70 without problems ...make sure not to sum numbers with difference in eps() times ... for comparation all.equal(target, current, tolerance = sqrt(.Machine$double.eps)) can be used, but for small numbers you should adjust tolerance |
Helmut ★★★ Vienna, Austria, 2020-12-21 00:27 (1445 d 01:37 ago) @ PharmCat Posting: # 22151 Views: 16,750 |
|
Hi PharmCat, ❝ I think in R it can be done... At least consistently giving not zero.
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
mittyri ★★ Russia, 2020-12-21 14:49 (1444 d 11:14 ago) @ Helmut Posting: # 22152 Views: 16,723 |
|
Hi Helmut, ❝ At least consistently giving not zero. ❝
❝ prec <- 64 ❝ x <- mpfr(0.5, prec) - mpfr(0.4, prec) - mpfr(0.1, prec) ❝ x ❝ 1 'mpfr' number of precision 64 bits ❝ [1] -2.77555756156289135106e-17 ❝ y <- mpfr(-0.4, prec) - mpfr(0.1, prec) + mpfr(0.5, prec) ❝ y ❝ 1 'mpfr' number of precision 64 bits ❝ [1] -2.77555756156289135106e-17 I think the right solution is
install.packages("remotes") Interestingly but I did not find good solution using CRAN — Kind regards, Mittyri |
Helmut ★★★ Vienna, Austria, 2021-01-14 13:53 (1420 d 12:11 ago) @ mittyri Posting: # 22180 Views: 16,308 |
|
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
SDavis ★★ UK, 2021-02-09 13:02 (1394 d 13:02 ago) @ Helmut Posting: # 22206 Views: 15,919 |
|
https://www.bbc.co.uk/news/technology-54423988 UK GOv and its repeated desire to use expensive and incompetent consultants.... — Simon Senior Scientific Trainer, Certara™ [link=https://www.youtube.com/watch?v=xX-yCO5Rzag[/link] https://www.certarauniversity.com/dashboard https://support.certara.com/forums/ |
ElMaestro ★★★ Denmark, 2021-02-09 20:55 (1394 d 05:09 ago) @ SDavis Posting: # 22207 Views: 15,860 |
|
Hihi, thanks for that post SD, I agree this is perhaps not a spreadsheet failure but rather a consultant failure. Generally, I tell my clients they can have any two out of the golden three: Cheap, fast and good. In particular, cheap and fast certainly won't be good. (side note: when they use me as a consultant it won't be good regardless, but that's just an unrelated fact of life ) — Pass or fail! ElMaestro |