Software validation [R for BE/BA]

posted by Helmut – Vienna, Austria, 2009-09-28 15:21 – Posting: # 4260
Hi ElMaestro!

» Try to put yourself in the shoes of the software writer.

Oh, I put these boots on many years ago. And they got muddy almost immediately. I tried to get a numerical approximation of the cdf of the t-distribution, which led me to the gamma function and headaches. Finally I ended up with numerical approximations.[1,2]
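To give an idea of what such a numerical approximation looks like, here is a minimal sketch in Python. It is not one of the algorithms from [1,2]; it simply integrates the t density with the composite Simpson rule, and the names t_pdf, t_cdf and the steps parameter are made up for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    # Normalising constant on the log scale (this is where the gamma function enters).
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) by Simpson integration of the density from 0 to |x|."""
    if steps % 2:          # Simpson's rule needs an even number of intervals
        steps += 1
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    # The density is symmetric around 0, so the cdf is 0.5 plus/minus the area.
    return 0.5 + area if x >= 0 else 0.5 - area

if __name__ == "__main__":
    # 2.228139 is the tabulated 97.5% quantile for 10 degrees of freedom.
    print(t_cdf(2.228139, 10))
```

The result is only as good as the integration rule and the step count, which is exactly the point of the discussion: the answer is approximate, and its accuracy has to be checked against something else.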

» […] whatever the algo does it is not exact. It is approximate, but how good that approximation actually is is unknown.


» […] we have absolutely no way of telling which one is the right one until some clever guy works out the integrals exactly.

Definitely not me. I’m getting more stupid every day.

» I'll buy you a Mozartkugel if you do it.

THX, no need.

» Decompile your recent acquisition and check if they one way or another use the ASA243.

I guess the SW was built using the Borland C++ compiler. You know that reverse engineering of software is a breach of the license? :-D
BTW, according to the manual the ‘Illinois method’ (Algorithm AS 184, FORTRAN source) is used. I assume AS 243[3] is ‘better’ than AS 184.[4] But to what extent? Is it my job to go through the pros and cons of algorithms?

» If they do, why validate?

First, different algos are used. Second, I have to validate software - or not?
At the product's site I found a nice statement:

These sets of solutions were reviewed by Janet Elashoff, who checked for consistency, face validity, and for computational accuracy against other sources.

Face validity?

What I actually wanted to do was to check my R code (which is also part of bear) against a piece of SW of high reputation. I wasn’t satisfied with checking the sample size (only integers), so I opted to do it the other way 'round (power). It made me angry that there is no (easy) way to obtain more than 4 significant digits. This effectively prevents external validation.
Another example from the dirt track of SW validation: Diletti et al.[5] published exact sample size tables for BE ranges of 0.9–1.11 and 0.7–1.43. When trying to validate my code against these tables I found small discrepancies (only at low CVs and close to the acceptance range). By trial and error I found the solution: I could reproduce the tables only by setting the upper acceptance limit not to the reciprocal of the lower AL (1/0.9, 1/0.7) but to exactly 1.1111 or 1.4286…
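How small the difference between the two conventions is can be seen with a few lines of Python. This is only an illustration of the arithmetic, not the code used for the tables; the function name upper_limits and the digits argument are invented for this sketch.

```python
import math

def upper_limits(lower, digits=4):
    """Return the exact reciprocal of the lower limit and its rounded version."""
    exact = 1.0 / lower
    rounded = round(exact, digits)
    return exact, rounded

if __name__ == "__main__":
    for lower in (0.9, 0.7):
        exact, rounded = upper_limits(lower)
        # With the exact reciprocal the limits are symmetric on the log scale,
        # so log(lower) + log(upper) is (numerically) zero. With the rounded
        # upper limit a small asymmetry remains.
        asym = math.log(lower) + math.log(rounded)
        print(f"lower={lower}  exact={exact:.8f}  rounded={rounded}  "
              f"log-scale asymmetry={asym:.2e}")
```

The asymmetry is tiny, which fits the observation that discrepancies only show up at low CVs and close to the acceptance range, where the power curve is steep.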
BTW, nitpicking: Diletti’s Table 1[6] (power 70%, T/R=1, CV=7.5%) and nQuery 7 give n=4, whilst my R code, StudySize 2.0.1, and FARTSSIE 1.6 come up with n=6. Now what? Go with a democratic vote of 3:2 for 6?

