Shuanghe ★★ Spain, 2019-12-17 14:01 (1864 d 22:00 ago) Posting: # 20981 Views: 5,741 |
|
Hi, I just found something interesting when comparing the Excel and R functions for the t distribution while calculating a p value. I was writing a function for a friend to do some calculations but then decided to transfer it to Excel, since my friend doesn't know much about R. When the degrees of freedom are an integer, R and Excel give practically identical results, but when df is not an integer the difference is noticeable. I don't know if that's some kind of bug or whether my installation of MS Office has a problem. Could anyone check it out? Example: with options(digits=16), pt(4.5, 10) gives 0.999428447456598 and Excel's T.DIST(4.5, 10, TRUE) gives 0.999428447456598. Identical. However, pt(4.5, 10.1) gives 0.9994425014527133 but Excel's T.DIST(4.5, 10.1, TRUE) gives 0.999428447456598. The difference is in the 5th decimal place. Compared with agreement to the 15th decimal place when df is an integer, this is a big difference! Obviously, I trust R much more than Excel. However, I'd appreciate it if anyone could verify this and maybe offer an explanation if it's not a bug. Thanks. Edit: Forgot to mention that I'm using Excel 2013 on a Windows 7 Pro machine. — All the best, Shuanghe |
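For anyone who wants to reproduce the R side of this comparison, a minimal sketch in base R follows. The variable names are only illustrative; the Excel values have to be checked separately in a spreadsheet.

```r
# Student's t CDF at t = 4.5 for integer and non-integer degrees of freedom
options(digits = 16)

p_int  <- pt(4.5, df = 10)    # integer df
p_frac <- pt(4.5, df = 10.1)  # non-integer df

print(p_int)
print(p_frac)

# An implementation that treats df as continuous gives a nonzero difference here
print(p_frac - p_int)
```

If a spreadsheet returns the same number for both calls, it is not treating the degrees of freedom as a continuous parameter.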
ElMaestro ★★★ Denmark, 2019-12-17 14:49 (1864 d 21:11 ago) @ Shuanghe Posting: # 20982 Views: 4,616 |
|
Hi Shuanghe, ❝ Obviously, i trust R much more than Excel. however, I'd appreciate if anyone can verify it and maybe offer some explanation if that's not a bug? I do not have Excel, but it sounds like an embarrassing implementation. Can you try, just for the fun of it, to ask Excel what it thinks the critical value at df=10.1 is for: a. p=0.9994425014527133 b. p=0.999428447456598 — Pass or fail! ElMaestro |
Shuanghe ★★ Spain, 2019-12-17 15:15 (1864 d 20:46 ago) @ ElMaestro Posting: # 20983 Views: 4,624 |
|
Hi ElMaestro, ❝ I do not have Excel, but it sounds like an embarrassing implementation. I guess that I shouldn't be surprised given its history... ❝ Can you try, just for the fun of it, to ask Excel what it thinks the critical value at df=10.1 is for: ❝ a. p=0.9994425014527133 ❝ b. p=0.999428447456598 Good idea. T.INV(0.999442501452713, 10.1) = 4.51612251417097; T.INV(0.999428447456598, 10.1) = 4.50000000000000. In fact, I just checked that T.DIST(4.5, df, TRUE) gives exactly the same result for df = 10.1, 10.2, ..., 10.9 (increasing df while keeping df < 11). The result only changes at df = 11! Nice work, M$. — All the best, Shuanghe |
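The pattern above is what one would expect if the degrees of freedom were truncated to an integer before the calculation. A rough sketch in base R follows; `t_dist_truncated` and `t_inv_truncated` are hypothetical helpers written for illustration and only mimic the observed behaviour, they are not Excel's actual code.

```r
# Hypothetical helpers: a t CDF and quantile function that truncate df to an
# integer first. This is an assumption about the observed behaviour only.
t_dist_truncated <- function(x, df) pt(x, df = floor(df))
t_inv_truncated  <- function(p, df) qt(p, df = floor(df))

dfs <- seq(10.1, 10.9, by = 0.1)

# With truncation, every fractional df in (10, 11) collapses to the df = 10 result
truncated <- sapply(dfs, function(d) t_dist_truncated(4.5, d))
print(all(truncated == pt(4.5, df = 10)))

# R itself returns a different probability for each df
print(length(unique(pt(4.5, df = dfs))))

# Inverse direction: R's p value for df = 10.1 maps back to 4.5 only with the true df
p <- pt(4.5, df = 10.1)
print(qt(p, df = 10.1))          # close to 4.5
print(t_inv_truncated(p, 10.1))  # larger, because df = 10 has heavier tails
```

The last value should be close to the 4.516... that T.INV returned above, which is consistent with the truncation assumption.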
mittyri ★★ Russia, 2019-12-17 15:23 (1864 d 20:38 ago) @ Shuanghe Posting: # 20984 Views: 4,604 |
|
Hi Shuanghe, try this: =T.DIST(4.5, 10.1, TRUE) - T.DIST(4.5, 10, TRUE) If you want more pain, see here
even in the latest Office 365... — Kind regards, Mittyri |
Helmut ★★★ Vienna, Austria, 2019-12-17 16:00 (1864 d 20:01 ago) @ mittyri Posting: # 20985 Views: 4,652 |
|
Hi all, known for ages. All (‼) versions of Excel round the degrees of freedom down to the nearest integer.
Quoting Martin: "Never never never never use Excel. Not even for calculation of arithmetic means." Congratulations to M$ for making a continuous function discrete. I had fun comparing papers by regulators based on Luther Gwaza’s Excel-Sheet for adjusted indirect comparisons with my R-package. Satterthwaite’s degrees of freedom are practically never* integers. Remember this goody of the EMA? "Results obtained by alternative, validated statistical programs are also acceptable except spreadsheets because outputs of spreadsheets are not suitable for secondary assessment."
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
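To illustrate why truncation matters for Satterthwaite's approximation, here is a small base-R sketch using the two-sample Welch-Satterthwaite formula. The variances and sample sizes are made-up numbers for illustration, and the two-sample case is only a stand-in; it is not the calculation used in the R-package or the Excel sheet mentioned above.

```r
# Welch-Satterthwaite degrees of freedom for two independent samples
satterthwaite_df <- function(var1, n1, var2, n2) {
  v1 <- var1 / n1
  v2 <- var2 / n2
  (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
}

# Hypothetical inputs, chosen only for illustration
df_s <- satterthwaite_df(var1 = 0.04, n1 = 12, var2 = 0.09, n2 = 14)
print(df_s)  # a non-integer value

# One-sided 5% critical values with the exact df and with df truncated
t_exact     <- qt(0.95, df = df_s)
t_truncated <- qt(0.95, df = floor(df_s))
print(c(exact = t_exact, truncated = t_truncated))
```

Truncating the degrees of freedom gives a slightly larger critical value and hence a slightly wider confidence interval than the exact calculation.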
Shuanghe ★★ Spain, 2019-12-17 16:41 (1864 d 19:20 ago) @ Helmut Posting: # 20986 Views: 4,586 |
|
Thanks to all! I knew that many years ago we had some discussion about some "features" in Excel (though I didn't recall that it was the t distribution), but that was 7 or 8 years ago, so I thought M$ had fixed the problems. How could they screw up like this? ... Anyway, I'll tell my friend to use my R function instead. — All the best, Shuanghe |
Helmut ★★★ Vienna, Austria, 2019-12-17 17:00 (1864 d 19:01 ago) @ Shuanghe Posting: # 20987 Views: 4,637 |
|
Hi Shuanghe, ❝ I knew that many years ago we had some discussion about some "features" in Excel (though I didn't recall that it was the t distribution)… It was the function TINV(alpha, df) in all versions up to 2003. In order to get the correct value, people had to use the workaround TINV(2*alpha, df). M$ could not change the implementation because that would break all existing spreadsheets with the workaround in place. Hence in 2007 the new function T.INV(alpha, df) was introduced. However, the degrees of freedom are still flawed. Of course, the same crap in CHI.INV(), F.INV(). ❝ How could they screw up like this? ... Too big to fail? "A refund for defective software might be nice, except it would bankrupt the entire software industry in the first year." Andrew S. Tanenbaum — Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
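The TINV(2*alpha, df) workaround follows from TINV being a two-tailed inverse. A short base-R sketch of the relationship follows; `tinv_two_tailed` is a hypothetical stand-in written with qt(), not Excel's implementation.

```r
alpha <- 0.05
df    <- 10

# Hypothetical stand-in for a two-tailed inverse like the legacy TINV(probability, df)
tinv_two_tailed <- function(prob, df) qt(1 - prob / 2, df)

# One-sided critical value, which is what people usually wanted
one_sided <- qt(1 - alpha, df)

print(tinv_two_tailed(alpha, df))      # two-sided critical value, too large
print(tinv_two_tailed(2 * alpha, df))  # the workaround
print(one_sided)                       # matches the workaround
```

Doubling alpha cancels the halving inside the two-tailed function, so the workaround reproduces the one-sided quantile.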