chbs
●

2009-01-19 10:35
(4471 d 19:40 ago)

Posting: # 3060
Views: 24,427

## Excel sheet usage in bioanalytical calculations [Bioanalytics]

Dear All,

We are using the excel sheet for the calculation of mean, SD and CV during the bioanalytical analysis.

If we compare the same calculation with validated WinNonlin Software, then whether it can serve as supporting data for the excel sheet usage and can we say that the excel sheet is validated because of this comparision.

Or it requires complete validation of the excel sheet?

chbs
martin
★★

Austria,
2009-01-19 11:52
(4471 d 18:23 ago)

@ chbs
Posting: # 3062
Views: 21,762

## do not use excel

dear chbs !

never never never never use excel for any statistical calculations: spreadsheet addiction

best regards

martin
Ohlbe
★★★

France,
2009-01-19 23:54
(4471 d 06:20 ago)

@ martin
Posting: # 3067
Views: 21,494

## do not use excel ?

Dear martin and chbs,

» never never never never use excel for any statistical calculations

Use of Excel has been previously discussed on the forum, for instance here.

Even though it is certainly not the correct tool to use for complex (or even intermediate) stats, I would not be shocked to see it used for descriptive stats (mean, SD, CV), where it is easier to use than a stats package for the basic user (using SAS or R to calculate the mean of 6 values would be like using a sledgehammer to kill a mosquito, and would seriously reduce the number of possible users). However this requires some precautions: validate your spreadsheet by comparing the results with those obtained with another software (or even manually), and then lock and password-protect your spreadsheet, only leaving unlocked the cells in which you will enter your data.

Upgrade to a new version of Excel (or any version change) will require a re-validation.

Regards
Ohlbe
martin
★★

Austria,
2009-01-20 14:52
(4470 d 15:23 ago)

(edited by martin on 2009-01-20 15:12)
@ Ohlbe
Posting: # 3075
Views: 22,141

## Never

Never never never never use Excel. Not even for calculation of arithmetic means:

http://www.cs.uiowa.edu/~jcryer/JSMTalk2001.pdf
http://pages.stern.nyu.edu/~jsimonof/classes/1305/pdf/excelreg.pdf
http://www.agresearch.co.nz/Science/Statistics/exceluse1.htm

best regards

martin

Edit: Capital letters reformatted to bold. Please see here. [Helmut]
Helmut
★★★

Vienna, Austria,
2009-01-20 17:03
(4470 d 13:11 ago)

@ martin
Posting: # 3076
Views: 21,529

## Mathematical wizard

Hi Martin,

I really loved one of the examples from your last reference:
  STDEV(1,2,3) = 1   STDEV(90000001,90000002,90000003) = 0
which looks even more funny if you run this set (in M$-Excel 5.0 and 2000 – the only versions I have handy):  STDEV(0.90000001,0.90000002,0.90000003) = 0 STDEV(0.9000001,0.9000002,0.9000003) = 0.0000001 STDEV(0.900001,0.900002,0.900003) = 0.000001 STDEV(0.90001,0.90002,0.90003) = 0.00001 STDEV(0.9001,0.9002,0.9003) = 0.0001 STDEV(0.901,0.902,0.903) = 0.001 STDEV(0.91,0.92,0.93) = 0.01 STDEV(9.1,9.2,9.3) = 0.1 STDEV(91,92,93) = 1 STDEV(901,902,903) = 1 STDEV(9001,9002,9003) = 1 STDEV(90001,90002,90003) = 1 STDEV(900001,900002,900003) = 1 STDEV(9000001,9000002,9000003) = 1 STDEV(90000001,90000002,90000003) = 0 STDEV(9000000001,9000000002,9000000003) = 0 STDEV(90000000001,90000000002,90000000003) = 0 STDEV(900000000001,900000000002,900000000003) = 0 STDEV(9000000000001,9000000000002,9000000000003) = 131072 STDEV(90000000000001,90000000000002,90000000000003) = 0 STDEV(900000000000001,900000000000002,900000000000003) = 0 R 2.8.1  sd(c(90000001,90000002,90000003)) = 1 STATISTICA 5.1H  St.Dev.(90000001,90000002,90000003) = 1 NCSS 2001  Standard Deviation (90000001,90000002,90000003) = 1 Maxima 5.15.0*  std1([90000001,90000002,90000003]); = 1 GNU Octave 3.0.3  std([90000001,90000002,90000003]) = 1 WinNonlin 5.2.1  STDEV(90000001;90000002;90000003) = 0 As you said, it’s advisable to use statistical software for statistical calculations (surprisingly PK software failed this simple test as well). Since this thread is in the Bioanalytics category: as far as I know Thermo’s Xcalibur software has only the options to store data in its own proprietary format or to export xls-files. But help is on the way. All (?) statistical software are able to import xls-files. • When it comes to numerical precision, it’s nice to have a clever algorithm. Maxima gives a standard deviation of 1 even for a set of {9·101,000+1, 9·101,000+2, 9·101,000+3}… If you don't believe me, run this code in Maxima: load(descriptive)$ std1(([9,9,9])*10^1000+([1,2,3]));

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
d_labes
★★★

Berlin, Germany,
2009-01-21 11:00
(4469 d 19:15 ago)

@ Helmut
Posting: # 3083
Views: 23,133

## Do not trust in computers

Hi Helmut, hi Martin,

flame warin' a little bit with you.

Never, never, never use any software!
Never, never, never trust any computer!

You can always find numbers which cannot handled by your preferred piece of software on your preferred machine (Remember the Pentium FDIV bug).
Therefore soft- and hardware are not trustworthy and useless in a general sense .

By the way, M$-Excel is an extraordinary software in this way. Regards, Detlew Helmut ★★★ Vienna, Austria, 2009-01-21 13:00 (4469 d 17:15 ago) @ d_labes Posting: # 3091 Views: 21,267 ## Do not trust in computers Dear DLabes! » flame warin' a little bit with you. Yeah, I like that! » By the way, M$-Excel is an extraordinary software in this way.

Definitely. If one has to use it for any reasons, he/she should follow the Ohlbe's suggestions.
BTW, even in M$-Excel it's possible to get the correct answer (if STDEV() is avoided):  | A | B +---------------+----------------------------- 1 | 90000001 | =(A1-A$4)^2   2 | 90000002      | =(A2-A$4)^2 3 | 90000003 | =(A3-A$4)^2   4 | =MEAN(A1:A3)  | =SUM(B1:B3)   5 | =STDEV(A1:A3) | =SQRT(1/(COUNT(A1:A3)-1)*B4)
gives
    |       A       |       B     +---------------+-----------------------------   1 | 90000001      | 1   2 | 90000002      | 0   3 | 90000003      | 1   4 | 90000002      | 2   5 | 0             | 1

There are even better speadsheets around. OpenOffice Calc returns 1 if STDEV() is used, even with the oldest version I have on one of my machines, namely v2.0 from 2005.

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Ohlbe
★★★

France,
2009-01-21 14:03
(4469 d 16:12 ago)

(edited by Ohlbe on 2009-01-21 23:17)
@ Helmut
Posting: # 3098
Views: 21,170

## Do not trust in computers

Dear HS,

Amazing. I fail to reproduce the error in SD calculation in the example you give, using the French version of Excel 2003 I have here. I get the correct value of 1. I'll check with another version of Excel (also French) at home tonight.

Edit: error reproduced at home with the French version of Excel 2000... Shit !

Does Microsoft also call this a "feature", rather than a bug ?

And considering the fact that the bug appears with some sets of data, but not others, validating a spreadsheet by comparison with another software or manual calculation would not be sufficient.

Regards
Ohlbe
d_labes
★★★

Berlin, Germany,
2009-01-21 15:42
(4469 d 14:33 ago)

(edited by d_labes on 2009-01-21 16:06)
@ Ohlbe
Posting: # 3099
Views: 21,077

## Do not trust in Excelence

Dear HS, dear Ohlbe,

seems we are going to validate M$Excel, which we will never, never, never use for any statistics? Amazing! To add my two coins: Excel 2003 SP3 (german), Windows XP professional on a DELL Pentium 4 (3.0 GHz) machine with 1 GB RAM, chip set graphics .........  | A +--------------- 1 | 90000001 2 | 90000002 3 | 90000003 4 | =MITTELWERT(A1:A3) 5 | =STABW(A1:A3) gives  | A +--------------- 1 | 90000001 2 | 90000002 3 | 90000003 4 | 90000002 5 | 1 For me the result in WinNonlin in Helmut's post is more interesting! Seems they (far side) have validated it against Helmut's Excel version. That adds much credit to the trustworthyness in WinNonlin results of iterative REML analyses of replicate cross-over designs with factor analytic decomposition of the between covariance matrix. Really! Regards, Detlew Helmut ★★★ Vienna, Austria, 2009-01-22 02:49 (4469 d 03:26 ago) @ Ohlbe Posting: # 3101 Views: 21,163 ## Do not trust in computers Dear Ohlbe, » Does Microsoft also call this a "feature", rather than a bug ? If you boil this page down, bug. Although they don’t confess that pre-2003 was buggy – they use some kind of Newspeak. They give also this formula SQRT((SUMSQ(values)-(SUM(values)^2)/COUNT(values))/(COUNT(values)-1)) as ‘at least a good approximation to the value of STDEV as found by earlier versions of Excel.’ There was a major change between v2003 and its predecessors. See this overview in Microsoft’s Knowledge Base. Dif-tor heh smusma 🖖 Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes Ohlbe ★★★ France, 2009-01-22 10:33 (4468 d 19:42 ago) @ Helmut Posting: # 3107 Views: 21,067 ## Do not trust in computers Dear Helmut, » they use some kind of Newspeak. Right, it's not a bug, just a "numeric inaccuracy" Regards Ohlbe Ohlbe ★★★ France, 2009-01-22 22:06 (4468 d 08:09 ago) @ Helmut Posting: # 3110 Views: 21,014 ## Even my calculator... Dear Helmut, My good'ol' long trusted old-fashioned calculator is even worse when using the built-in SD function... OK with 9001, 9002, 9003, but I get an SD of 2 with 90001, 90002, 90003... With longer figures I get either an error code, or a zero. Regards Ohlbe Helmut ★★★ Vienna, Austria, 2009-01-22 23:26 (4468 d 06:49 ago) @ Ohlbe Posting: # 3111 Views: 20,988 ## Even my calculator... Dear Ohlbe, great fun! Like ElMaestro said: “Rubbish in, rubbish out.” In bioanalytics we never see such small CVs, so it doesn’t matter… M$ improves. The calculator of Windows Vista SP1 gives the correct answer even if close to the numeric resolution (90000000000000000000000000000001,…).

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
ElMaestro
★★★

Denmark,
2009-01-23 12:12
(4467 d 18:02 ago)

(edited by ElMaestro on 2009-01-23 12:58)
@ Helmut
Posting: # 3115
Views: 20,929

## Even my calculator...

Haha,

Ohlbe and HS you seem to be having great fun here.
Let me add, I know of a famous, fantastic, brilliant and heavenly piece of software for 2,2,2-BE calculation in which the programmer circumvented problems of this type by scaling all input data* before the model fitting and ANOVA takes place.

There is one potential drawback of doing that, however: In a model fit you need differentials of the dF/dX type, possibly also double differentials. In order to do that the value of the delta (d above) should be chosen VERY wisely. It should in theory be infinitely small, but computers of course cannot do that, so we try e.g. 10e-4 or 10e-6 or whatever. But if the scaling causes the numbers in the input dataset to be of a magnitude close to the delta then everything could be screwed up. The pros and hardliners use Al Gore Rythms with variable delta's in their matrices and differentials. The programmer behind the software mentioned above didn't do that kind of fancy stuff, mainly because of the log operation.
Would still perhaps be interesting to see if a BE-script in R or SAS can be challenged by datasets with very large or very small numbers (perhaps even worse: When the range of values in the input dataset is narrow). Since I don't have SAS I cannot do it all, but I might try a few R challenges one of these days.

EM.

*: All data divided by the highest data entry in the entire dataset. All values then between 0 and 1.
Ohlbe
★★★

France,
2009-01-24 19:44
(4466 d 10:30 ago)

@ ElMaestro
Posting: # 3120
Views: 20,900

## Even my calculator...

» I know of a famous, fantastic, brilliant and heavenly piece of software for 2,2,2-BE calculation...

It reminds me of another fantastic, brilliant and heavenly piece of software developed by the same programmer, which worked beautifully on his computer, but bugged on mine with the very same set of data...

Conclusion: software or spreadsheet validation is only valid under a specific environment, and needs to be repeated if you change for another OS, or another version of the same OS !
Consequence: if a provider sells you a piece of software as "validated", just forget what he said, unless he also sells you the same computer and environment he used for his validation.

Regards
Ohlbe

Regards
Ohlbe
martin
★★

Austria,
2009-01-21 12:16
(4469 d 17:59 ago)

@ Helmut
Posting: # 3084
Views: 21,131

## data exchange format

dear all !

a note on data exchange formats. in general data are used as they appear on the page but with electronic flies the situation is less obvious. some electronic file format is ambiguous concerning correct interpretation and presentation of the file contents. in MS excel, numbers may rounded internally (for example you see 2.35 but the correct number is 2.354324).

additionally, the file content is unclear and may differ from one computer system to other (or between different versions of MS excel). there is also the potential risk of macro viruses with MS excel.

for this reasons, file formats that do not suffer from these aliments should be used. I personally prefer to use CSV (ASCII) files which is an open format and you will be able to read this file error-free in 10 or more years - which may not be the case with the current or older excel (MS proprietary) format.

Best regards

Martin
Helmut
★★★

Vienna, Austria,
2009-01-21 12:31
(4469 d 17:44 ago)

@ martin
Posting: # 3085
Views: 21,137

## Raw data (paper vs. electronic)

Dear Martin!

» […] in general data are used as they appear on the page but with electronic flies the situation is less obvious. some electronic file format is ambiguous concerning correct interpretation and presentation of the file contents. in MS excel, numbers may rounded internally (for example you see 2.35 but the correct number is 2.354324).

Electronic flies?

See one of my lectures (slides 65-66) concerning rounding. If results have to be recalculated from printouts, results may differ if the original evaluation was based on full precision data - which may lead to avoidable discussions…

» I personally prefer to use CSV (ASCII) files which is an open format […]

Wonderful! This is exactly what I want from my clients. Additionally an MD5-checksum should be used to maintain data integrity. If data-files are attached to an e-mail, the checksum should be sent in a separate e-mail.

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
SDavis
★★

UK,
2009-01-22 11:56
(4468 d 18:18 ago)

@ Helmut
Posting: # 3108
Views: 21,282

Hi Helmut,

» WinNonlin 5.2.1
»   STDEV(90000001;90000002;90000003) = 0

Just saw your post on this - I managed to replicate it in a workbook using a cell formula, however this is not how I'd recommend people perform such data manipulations in WinNonlin.

Performing the same calculation using either Descriptive Stats or the summary stats within Table Wizard will give you the correct answer i.e. 1. (as indeed will your worked example of calculating it without using the SD function).

I would recommend any WinNonlin users to use the Tools functions I described above since the 'grid' calculations come from, I think FormulaOne, which is a third party tool we use that seems to be susceptible to this error. Incidentally I don't think this should be a problem with Version 6 since it's not using this data grid anymore.

Cheers,
Simon

Simon
Senior Scientific Trainer, Certara™
https://www.certarauniversity.com/dashboard
https://support.certara.com/forums/
Helmut
★★★

Vienna, Austria,
2009-01-21 01:02
(4470 d 05:12 ago)

@ Ohlbe
Posting: # 3079
Views: 21,326

## Why not use R?

Dear Ohlbe!

» […] using […] R to calculate the mean of 6 values would be like using a sledgehammer to kill a mosquito, and would seriously reduce the number of possible users.

Hhm, the only hurdle I see are the sometimes strange import routines. For R novices there's a nice GUI, the R Commander. Import from a text-file, the clipboard, an URL, SPSS, Minitab, Stata, Excel, Access, or dBase is just a few mouse-clicks away. You may perform fairly sophisticated statistics right there (definitely far beyond Excel's capabilities already), or simply type at the command line
sd(c(90000001, 90000002, 90000003)) [1] 1 sd(c(0.90000001, 0.90000002, 0.90000003)) [1] 1e-08
That's all - not a sledgehammer to kill a mosquito, but just a tiny pin.

See here concerning validation.

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
ElMaestro
★★★

Denmark,
2009-01-22 19:32
(4468 d 10:43 ago)

@ Helmut
Posting: # 3109
Views: 23,183

## Why not use R?

Garbage in, garbage out.
It's very very simple.

EM.

Pass or fail!
ElMaestro
ElMaestro
★★★

Denmark,
2009-01-23 22:45
(4467 d 07:30 ago)

@ Helmut
Posting: # 3119
Views: 20,932

## Why not use R?

Dear Helmut,

» That's all - not a sledgehammer to kill a mosquito, but just a tiny pin.

Shall we just say the R commander is a nice tool but the mozzies are still being given a thorough beating in the background by a sledge hammer with bells and whistles when the R commander is running.

EM.
Aceto81
★

Belgium,
2009-01-26 16:15
(4464 d 14:00 ago)

@ ElMaestro
Posting: # 3122
Views: 20,871

## Why not use R?

» Shall we just say the R commander is a nice tool but the mozzies are still being given a thorough beating in the background by a sledge hammer with bells and whistles when the R commander is running.

But still I would like to use the sledge hammer when it's free instead of paying a lot of money for the "right" tool.

Ace
d_labes
★★★

Berlin, Germany,
2009-01-26 16:35
(4464 d 13:40 ago)

@ Aceto81
Posting: # 3124
Views: 20,821

## Why not use R?

Dear Ace

» But still I would like to use the sledge hammer when it's free ...

(east-frisian helmet plate, to protect against hostile sledge hammers )

Regards,

Detlew