## Outliers - yes, but how? [BE/BA News]

Dear D. Labes!

» This quote, or is it your own poetry, …

I borrowed it from a personal discussion page at Wikipedia.

» … is really google resistant. First time that I noticed the Net is not omniscient!

Well, personal discussion pages of WP are blocked from being indexed by Google. I have blocked some pages (i.e., the download section and the latest posts) from Google as well. Requires two lines in the domain’s robots.txt

    User-agent: *
    Disallow: /foo/

… and adding rel="nofollow" to respective links within other pages (the same is done here; look at the HTML code).

» » … Did you have a look at box plots and/or QQ-plots of residuals? What about subjects 45 and 52 (studentized residuals outside ±1.96 and outside 3×IQR)?
»
» Which residuals did you analyse? Which model? Please enlighten me.

I used EMA’s crippled model (test removed), Sequence+Subject(Sequence)+Period, and threw away one period’s residuals (they have the same absolute values as those of the other period, only with opposite signs).

» What do you think: which type of boxplot should we apply? Best for our sponsors would be a simple boxplot with whiskers up to minimum/maximum, not showing any 'outlier'.

I chose ±3×IQR following the convention (!) that values between 1.5×IQR and 3×IQR beyond the quartiles are ‘mild’ outliers and values more than 3×IQR beyond them are ‘severe’ outliers. I’m exploring full replicate data sets right now – outliers in almost all of them. I don’t know whether residuals make any sense at all (see also this post: period ratios instead?). I would also end up with different numbers of outliers depending on the method used to calculate the quartiles, and hence the IQR (see the R manual). Period ratios (second / first administration):
Type 3: SAS according to R-doc - 2 outliers (#46: 3.524, #45: 26.08)
Type 5: or is this SAS? - 2 outliers
Type 6: Minitab, SPSS, Phoenix/WinNonlin - 2 outliers
Type 7: default in S, R - 3 outliers (as above + #13: 3.349)
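
How the count can change with the quartile definition is easy to reproduce. The following is only a minimal sketch in Python, not the calculation used for the numbers above: the ratios are made up for illustration, the helper names (`quantile`, `severe_outliers`, `OFFSETS`) are hypothetical, and only types 5, 6 and 7 are covered, with the numbering and formulas taken from the help page of R’s `quantile()`.

```python
import math

# Offsets m(p) for three of the quantile definitions listed in R's quantile()
# help page; the sample quantile is interpolated at position h = n*p + m.
OFFSETS = {
    5: lambda p: 0.5,        # type 5
    6: lambda p: p,          # type 6
    7: lambda p: 1.0 - p,    # type 7 (default in R)
}

def quantile(values, p, qtype=7):
    """Sample quantile by linear interpolation between order statistics."""
    xs = sorted(values)
    n = len(xs)
    h = n * p + OFFSETS[qtype](p)       # 1-based fractional position
    j = math.floor(h)
    g = h - j                           # interpolation weight
    lo = xs[min(max(j, 1), n) - 1]      # clamp to the observed range
    hi = xs[min(max(j + 1, 1), n) - 1]
    return lo + g * (hi - lo)

def severe_outliers(values, qtype=7, k=3.0):
    """Values lying more than k*IQR below Q1 or above Q3."""
    q1 = quantile(values, 0.25, qtype)
    q3 = quantile(values, 0.75, qtype)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]

# Hypothetical period ratios, NOT the data discussed in this thread.
ratios = [0.70, 0.80, 0.90, 0.95, 1.00, 1.05, 1.10, 1.20, 2.20, 3.50]

if __name__ == "__main__":
    for qtype in (5, 6, 7):
        print(f"type {qtype}: {severe_outliers(ratios, qtype)}")
```

With these made-up values, type 6 flags only the largest ratio, whereas types 5 and 7 flag the two largest, so the same data set and the same 3×IQR rule give different verdicts.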

| Data set | CVWR (%) | Scaled AR (%) | Width |
|---|---|---|---|
| Full data set | 47.0 | 71.23 - 140.40 | 69.17 |
| #45, #52 excluded (residuals) | 32.2 | 78.79 - 126.93 | 48.14 |
| #45, #46 excluded (ratios) | 35.3 | 77.08 - 129.73 | 52.65 |
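
The scaled acceptance ranges above are consistent with EMA’s scaling formula, exp(±0.760·sWR) with sWR = √ln(CVWR² + 1). The sketch below assumes that this formula was used, together with the guideline’s 30% threshold and 50% cap. The function name `scaled_limits` is hypothetical, and small differences in the second decimal are to be expected because the CVWR values in the table are rounded.

```python
import math

K_EMA = 0.760  # regulatory constant of EMA's scaling approach

def scaled_limits(cv_wr_percent):
    """Scaled acceptance range (lower, upper) in percent for a given CVwR in percent."""
    if cv_wr_percent <= 30.0:
        return (80.0, 125.0)                  # no scaling at or below 30%
    cv = min(cv_wr_percent, 50.0) / 100.0     # widening is capped at CVwR 50%
    s_wr = math.sqrt(math.log(cv ** 2 + 1.0)) # within-subject SD of the reference (log scale)
    return (100.0 * math.exp(-K_EMA * s_wr), 100.0 * math.exp(K_EMA * s_wr))

if __name__ == "__main__":
    for cv_wr in (47.0, 32.2, 35.3):          # CVwR values from the table
        lower, upper = scaled_limits(cv_wr)
        print(f"CVwR {cv_wr:4.1f}%: {lower:.2f} - {upper:.2f} (width {upper - lower:.2f})")
```

Since the limits depend only on sWR, dropping two subjects narrows the permitted range by roughly 17 to 21 percentage points here, which is why the choice of outlier method matters so much.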

Don’t know what to do. Suggestions?
Maybe Q&A is an abbreviation for Questions and Ambiguities.

Cheers,
Helmut Schütz
