yjlee168
Kaohsiung, Taiwan,
2008-12-10 13:51

(edited by yjlee168 on 2008-12-11 11:17)
Posting: # 2903

 Preview: outlier detection with bear v2.1.0 [🇷 for BE/BA]

Dear all,

We have built an outlier detection function into bear for BE studies. Since we released bear, this function (intra-subject residuals) has been suggested by several experts of this Forum. Here are some outputs generated by bear as a preview before we officially release it within the next few days. Before that, we would appreciate your comments. Many thanks.
...

               Intra-subject and inter-subject residuals                 
--------------------------------------------------------------------------
   subj   Obs   Exp  Intra Stud_Intra  Inter Stud_Inter
8     1 7.398 7.491 -0.093     -1.126  0.138      0.724
1     2 7.300 7.409 -0.109     -1.318  0.098      0.516
9     3 7.637 7.622  0.015      0.179  0.400      2.095
2     4 7.225 7.312 -0.086     -1.044 -0.097     -0.506
10    5 7.233 7.353 -0.119     -1.447 -0.138     -0.725
3     6 7.471 7.400  0.071      0.855  0.081      0.422
11    7 7.404 7.442 -0.037     -0.454  0.040      0.207
4     8 7.570 7.480  0.090      1.096  0.239      1.252
12    9 7.473 7.446  0.027      0.322  0.048      0.251
5    10 7.236 7.270 -0.034     -0.413 -0.180     -0.945
13   11 7.428 7.289  0.139      1.682 -0.266     -1.393
6    12 7.341 7.236  0.105      1.275 -0.249     -1.302
14   13 7.381 7.311  0.070      0.843 -0.221     -1.159
7    14 7.377 7.414 -0.037     -0.451  0.108      0.563
------------------------------------------------
Obs: Observed lnCmax
Exp: Expected lnCmax
Intra: Intra-subject residuals
Stud_Intra: Studentized intra-subject residuals
Inter: Inter-subject residuals
Stud_Inter: Studentized inter-subject residuals
-------------------------------------------------------------------------
...
         Test results for normality assumption  (Shapiro-Wilk)             
--------------------------------------------------------------------------
             Parameter    Test P_value
1    lnCmax_Stud_Intra 0.93758  0.3882
2    lnCmax_Stud_Inter 0.94438  0.4773
3   lnAUC0t_Stud_Intra 0.87462  0.0488
4   lnAUC0t_Stud_Inter 0.90178  0.1197
5 lnAUC0INF_Stud_Intra 0.86833  0.0398
6 lnAUC0INF_Stud_Inter 0.91282  0.1731

-------------------------------------------------
Stud_Intra: studentized intra-subject residuals
Stud_Inter: studentized inter-subject residuals
Shapiro-Wilk: the normality of the studentized intra-subject residuals and
              the studentized inter-subject residuals was examined using
              the Shapiro-Wilk test.
-------------------------------------------------------------------------


             Test results for normality assumption  (Pearson)             
--------------------------------------------------------------------------
  Parameter     Test P_value
1    lnCmax -0.19043  0.5143
2   lnAUC0t -0.39188  0.1658
3 lnAUC0INF -0.39646  0.1605

-------------------------------------------------
Pearson: Pearson's correlation coefficient
-------------------------------------------------------------------------


             Test results for normality assumption  (Spearman)             
--------------------------------------------------------------------------
  Parameter     Test P_value
1    lnCmax -0.28352  0.3253
2   lnAUC0t -0.23956  0.4086
3 lnAUC0INF -0.26154  0.3656

-------------------------------------------------
Spearman: Spearman's rank correlation coefficient
-------------------------------------------------------------------------



               Test results for equality of variabilities                 
--------------------------------------------------------------------------
                   Method     Test P_value
1          lnCmax_Pearson -0.18320  0.5307
2    lnCmax_Pitman_Morgan  0.41674  0.5307
3         lnCmax_Spearman -0.20879  0.4731
4         lnAUC0t_Pearson -0.11204  0.7029
5   lnAUC0t_Pitman_Morgan  0.00011  0.9917
6        lnAUC0t_Spearman -0.16044  0.5838
7       lnAUC0INF_Pearson -0.09948  0.7351
8 lnAUC0INF_Pitman_Morgan  0.00061  0.9806
9      lnAUC0INF_Spearman -0.15604  0.5944



   Point and interval estimation of inter- and intra-subject variability 
--------------------------------------------------------------------------
              Parameter Point_Estimate CI95_lower CI95_upper
1          lnCmax_intra          0.016      0.008      0.043
2          lnCmax_inter          0.003     -0.008      0.022
3     lnCmax_intraclass          0.144     -0.421      0.628
4           lnCmax_prob        0.31164          -          -
5         lnAUC0t_intra          0.032      0.017      0.088
6         lnAUC0t_inter         -0.008     -0.023      0.009
7    lnAUC0t_intraclass         -0.316     -0.726      0.261
8          lnAUC0t_prob        0.86423          -          -
9       lnAUC0INF_intra          0.031      0.016      0.084
10      lnAUC0INF_inter         -0.007     -0.022      0.010
11 lnAUC0INF_intraclass         -0.303     -0.719      0.274
12       lnAUC0INF_prob        0.85363          -          -

-------------------------------------------------
intra: intra-subject variability
inter: inter-subject variability
intraclass: intraclass correlation
prob: the probability for obtaining a negative estimate of inter-subject
      variability
CI95: 95% confidence interval
-------------------------------------------------------------------------

We show only the Cmax part here. The AUC0-t and AUC0-inf parts are also displayed when running bear. All of this will be included in the output file 'ANOVA_stat.txt'. We also have some normal Q-Q plots for outlier detection purposes.
[image]
[image]
We built these functions for bear mostly based on the textbook of Chow SC, Liu JP. "Design and Analysis of Bioavailability and Bioequivalence Studies", Third Edition (Chapman & Hall/CRC Biostatistics Series).

All the best,
Hsin-ya Lee & Yung-jin Lee
College of Pharmacy,
Kaohsiung Medical University,
Kaohsiung, Taiwan 807
http://pkpd.kmu.edu.tw/bear
Helmut
Vienna, Austria,
2008-12-11 15:01

@ yjlee168
Posting: # 2906

 Preview: outlier detection with bear v2.1.0

Dear Hsin-ya & Yung-jin,

Wonderful!

Some remarks:
Which dataset are you using (I want to recalculate your results)?
It may be nice to flag values (i.e., give the subject's no.) in the QQ-plots which are outside ±2·sigma.
Another suggestion would be a plot of ln(predicted) vs. studentized residuals. Such a plot allows the distinction between concordant outliers (T/R similar to the majority of subjects, but both T and R lower or higher than normal = parallel shift in plot) and discordant outliers (T or R lower or higher; suspected formulation failure or subject-by-formulation interaction). For an example see here.
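
A minimal sketch in R of the flagging suggested above, using the studentized residuals for lnCmax from the bear output in the first post. The ±2 limit, the helper `flag_outliers`, and the variable names are assumptions for illustration only; this is not bear's implementation.

```r
# Studentized residuals for lnCmax, copied from the bear preview output
# in the first post (ordered by subject 1..14).
subj       <- 1:14
stud_intra <- c(-1.126, -1.318,  0.179, -1.044, -1.447,  0.855, -0.454,
                 1.096,  0.322, -0.413,  1.682,  1.275,  0.843, -0.451)
stud_inter <- c( 0.724,  0.516,  2.095, -0.506, -0.725,  0.422,  0.207,
                 1.252,  0.251, -0.945, -1.393, -1.302, -1.159,  0.563)

# Hypothetical helper: subjects whose |studentized residual| exceeds the limit.
flag_outliers <- function(x, id, limit = 2) id[abs(x) > limit]

flagged_intra <- flag_outliers(stud_intra, subj)  # none in this dataset
flagged_inter <- flag_outliers(stud_inter, subj)  # subject 3 (2.095)

# Normal Q-Q plot with the flagged subjects labelled by subject number.
qq  <- qqnorm(stud_inter, main = "lnCmax: studentized inter-subject residuals")
qqline(stud_inter)
out <- abs(stud_inter) > 2
if (any(out)) text(qq$x[out], qq$y[out], labels = subj[out], pos = 4)
```

`qqnorm()` returns the plotted coordinates in the order of the input, so the same logical index can be used for both the coordinates and the subject numbers.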

P.S.
In your example pdf for v2.0.1 the labels for time and conc are mixed up (pages 30-32). A suggestion would be to scale both spaghetti-plots (pages 30-31) to the maximum concentration observed in the entire dataset (not within formulations). Then it's easier to compare both formulations visually.

Dif-tor heh smusma 🖖🏼 Довге життя Україна! [image]
Helmut Schütz
[image]

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
yjlee168
Kaohsiung, Taiwan,
2008-12-11 19:38

@ Helmut
Posting: # 2907

 Preview: outlier detection with bear v2.1.0

Dear Helmut,

Thank you for your encouragement.

❝ Which dataset are you using (I want to recalculate your results)?


The data we used to test the "outlier function" is the dataset built into bear, so it should be easy to find in a previous version of bear (e.g., v2.0.1). Please wait just a few days until we release the new version.

❝ It may be nice to flag values (i.e., give the subject's no.) in the QQ-plots which are outside ±2·sigma.


Exactly. We had already started adding subject labels to the normal probability plots (Q-Q plots) before I posted the previous message in this Forum. Labelled points will indeed make outliers easier to identify.

❝ Another suggestion would be a plot of ln(predicted) vs. studentized residuals. Such a plot allows the distinction between (...)


What a fantastic idea :ok:! Your slides provide even more information about outlier detection than the textbook of Chow SC, Liu JP. "Design and Analysis of Bioavailability and Bioequivalence Studies", 3rd ed. (Chapman & Hall/CRC Biostatistics Series). It looks like a great presentation :clap: that you just gave in India. Lucky audience at that meeting. We just had no budget to go this time ;-).

❝ (...)v2.0.1 the labels for time and conc are mixed up (pages 30-32). A suggestion would be to scale both spaghetti-plots (pages 30-31)(...)


Oops! That is a funny mix-up in the labeling of the x- and y-axes. Yes, we will fix this soon. We want to thank you again for your valuable comments.

All the best,
-- Yung-jin Lee
bear v2.9.1:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan https://www.pkpd168.com/bear
Download link (updated) -> here
Helmut
Vienna, Austria,
2008-12-11 22:28

@ yjlee168
Posting: # 2908

 Preview: outlier detection with bear v2.1.0

Dear Yung-jin!

❝ ❝ Which dataset are you using (I want to recalculate your results)?

❝ The data we used to test the "outlier function" is the dataset built into bear, so it should be easy to find in a previous version of bear (e.g., v2.0.1).


OK, checked lnCmax and could reproduce your model estimates and residuals in an old program I once wrote in STATISTICA.
Running Shapiro-Wilk in R gives
        Shapiro-Wilk normality test

data:  intra
W = 0.9333, p-value = 0.3389

in agreement with NCSS (W 0.9332635, p 0.338826), but different from your results (0.93758 0.3882)?!

BTW for log-transformed Cmax data the output reads:
Intra_subj. CV=100*sqrt(MSResidual)= 12.60176 %
Inter_subj. CV=100*sqrt((MSSubject(seq)-MSResidual)/2)= 7.554316 %

According to D Hauschke et al.; Int J Clin Pharmacol Ther 32/7, 376-378 (1994), in the multiplicative model it should be
Intra_subj. CV=100*sqrt(exp(MSResidual)-1)= 12.65195 %
Inter_subj. CV=100*sqrt((exp((MSSubject(seq)-MSResidual)/2)-1))= 5.19586 %
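
The difference between the two intra-subject formulas can be checked in a few lines of R. This is only a sketch: the residual mean square is not quoted in the thread, so it is back-calculated here from the reported 12.60176 % (an assumption), and `cv_from_var` is a hypothetical helper name.

```r
# Residual mean square on the log scale, back-calculated from the reported
# 100*sqrt(MSResidual) = 12.60176 % (assumption; MSResidual itself is not
# quoted in the thread).
ms_resid <- (12.60176 / 100)^2

# CV for log-normally distributed data (multiplicative model):
# CV = sqrt(exp(variance) - 1), expressed in percent.
cv_from_var <- function(v) 100 * sqrt(exp(v) - 1)

cv_approx <- 100 * sqrt(ms_resid)   # value printed in the bear output
cv_exact  <- cv_from_var(ms_resid)  # about 12.65 %, as quoted above

cat(sprintf("approx: %.5f %%   log-normal: %.5f %%\n", cv_approx, cv_exact))
```

For small variances the two expressions are close, since exp(v) - 1 ≈ v; the gap widens as the variability increases. The same helper can be applied to a between-subject variance component once that has been estimated from the mean squares.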


❝ ❝ Another suggestion would be a plot of ln(predicted) vs. studentized residuals. [...]

❝ What a fantastic idea!


Oh, this is not my invention; I'm just a dwarf standing on the shoulders of giants.

❝ In your slides, you even provide much more information about outlier detection...


Do I?

❝ ...than the textbook of Chow SC, Liu JP. "Design and Analysis of Bioavailability and Bioequivalence Studies", 3rd ed. (Chapman & Hall/CRC Biostatistics Series).


Hhm, I have had it on my desk for just a couple of days now. Disappointing that some typos are still uncorrected (Table 8.2.1, model value of subject 18 which was correct with 4.581 in ed. 1 is given with 4.851 in eds. 2/3). Another great method to check influential observations is Cook's distance, whereas personally I find the Tukey sum–difference plot and the Equal variance plot less convincing (especially in detecting outliers). See S-Plus-code in Chapter 7 of Millard and Krause (2001) by B Pikounis, TE Bradstreet and SP Millard here.

❝ [...] the great presentation you just made in India. Lucky audiences in that meeting.


Yeah, it was great fun!

❝ Just no budget to go this time.

Don't bother - it was a basic workshop (mainly); you would have been bored. :sleeping:

❝ We want to thank you again for your valuable comments.


Welcome & thanks for your work and keeping the spirit of
[image]
high. :ok:

yjlee168
Kaohsiung, Taiwan,
2008-12-14 01:49

@ Helmut
Posting: # 2922

 Preview: outlier detection with bear v2.1.0

Dear Helmut,

We used the data from ch. 3 and ch. 6 of the textbook of Chow SC, Liu JP. "Design and Analysis of Bioavailability and Bioequivalence Studies", 3rd ed., Chapman & Hall, 2008, to test this part. The textbook provides SAS outputs for the Shapiro-Wilk normality test, which we compared with bear's outputs. We got exactly the same results as the SAS outputs shown in the textbook. That was how we validated this part at the beginning. The results posted in the previous message were obtained with the built-in dataset. However, we are very happy to check this again as soon as possible, and will respond here. Thank you for your testing.

❝ [...]
❝
❝         Shapiro-Wilk normality test
❝
❝ data:  intra
❝ W = 0.9333, p-value = 0.3389
❝
❝ in agreement with NCSS (W 0.9332635, p 0.338826), but different from your results (0.93758 0.3882)?!


All the best,
-- Yung-jin Lee
d_labes
Berlin, Germany,
2008-12-15 12:52

@ Helmut
Posting: # 2923

 Preview: outlier detection with bear v2.1.0

Dear Helmut, dear Yung-jin,

❝ OK, checked lnCmax and could reproduce your model estimates and residuals in an old program I once wrote in STATISTICA.
❝ Running Shapiro-Wilk in R gives
❝
❝         Shapiro-Wilk normality test
❝
❝ data:  intra
❝ W = 0.9333, p-value = 0.3389
❝
❝ in agreement with NCSS (W 0.9332635, p 0.338826), but different from your results (0.93758 0.3882)?!


Sorry, but I get exactly the same results as Yung-jin (bear) using the 'power to know' on the log-transformed Cmax data, intra-individual residuals from Proc GLM.

Furthermore, I cannot confirm Helmut's results in NCSS. Using the intra-individual residuals above (3 decimals), which Helmut has confirmed as correct, my NCSS (NCSS 2004) gives:
W=0.9371516  p=0.3830442

Am I missing something?

Regards,

Detlew
Helmut
Vienna, Austria,
2008-12-15 14:28

@ d_labes
Posting: # 2924

 Preview: outlier detection with bear v2.1.0

Dear D. Labes, dear Yung-jin,

Very strange.

Below the intra-subject residuals to 5 decimals and the R-code:
intra <- c(-1.12602,+1.31764,+0.17871,+1.04384,-1.44654,-0.85471,-0.45394,
           -1.09608,+0.32237,+0.41330,+1.68203,-1.27489,+0.84339,+0.45090)
shapiro.test(intra)

        Shapiro-Wilk normality test

data:  intra
W = 0.9333, p-value = 0.3389

Fiddling around with the format options gives
W = 0.933267, p-value = 0.3388638

My version of NCSS is 2001 (December 2, 2002):
Normality Test Section of intra
                Test       Prob      Decision
Test Name       Value      Level     (5%)
Shapiro-Wilk W  0.9332672  0.338867  Can't reject normality


What's going on here? :confused:

d_labes
Berlin, Germany,
2008-12-15 15:44

@ Helmut
Posting: # 2925

 residuals of period 1

Dear Helmut,

❝ Below the intra-subject residuals to 5 decimals and the R-code:
❝ intra <- c(-1.12602,+1.31764,+0.17871,+1.04384,-1.44654,-0.85471,-0.45394,
❝            -1.09608,+0.32237,+0.41330,+1.68203,-1.27489,+0.84339,+0.45090)


Here are mine: intra-indiv. residuals / studentized residuals
subject  residuals     Student
    1     -0.09289    -1.12602
    2     -0.10870    -1.31764
    3      0.01474     0.17871
    4     -0.08611    -1.04384
    5     -0.11934    -1.44654
    6      0.07051     0.85471
    7     -0.03745    -0.45394
    8      0.09042     1.09608
    9      0.02660     0.32237
   10     -0.03410    -0.41330
   11      0.13876     1.68203
   12      0.10518     1.27489
   13      0.06958     0.84339
   14     -0.03720    -0.45090

Red (subjects 2, 4, 6, 8, 10, 12 and 14): different in sign to yours.
Mine are identical to Yung-jin (see above).
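
The effect of these sign differences on the test can be reproduced in R with `shapiro.test`. The sketch below uses the two sets of studentized residuals quoted in this thread; the vector names are illustrative only.

```r
# Studentized intra-subject residuals as posted by Helmut ...
stud_helmut <- c(-1.12602,  1.31764, 0.17871,  1.04384, -1.44654, -0.85471,
                 -0.45394, -1.09608, 0.32237,  0.41330,  1.68203, -1.27489,
                  0.84339,  0.45090)
# ... and as listed in the table above (subjects 1..14).
stud_table  <- c(-1.12602, -1.31764, 0.17871, -1.04384, -1.44654,  0.85471,
                 -0.45394,  1.09608, 0.32237, -0.41330,  1.68203,  1.27489,
                  0.84339, -0.45090)

# Subjects whose residuals differ in sign between the two sets.
differing <- which(sign(stud_helmut) != sign(stud_table))

# Negating every residual leaves W unchanged, but negating only some of
# them changes the ordered sample and therefore W.
w_helmut  <- unname(shapiro.test(stud_helmut)$statistic)
w_table   <- unname(shapiro.test(stud_table)$statistic)
w_negated <- unname(shapiro.test(-stud_table)$statistic)

print(differing)
print(c(helmut = w_helmut, table = w_table, negated = w_negated))
```

With the first vector R reported W = 0.9333 earlier in this thread; the second corresponds to the W of about 0.9376 obtained with bear and SAS. W is also invariant to rescaling, which is why the raw and the studentized residuals give the same statistic.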

Shapiro-Wilk and other normality tests:
                             The UNIVARIATE Procedure
Variable:  residuals
                               Tests for Normality

Test                  --Statistic---    -----p Value------

Shapiro-Wilk          W     0.937582    Pr < W      0.3882
Kolmogorov-Smirnov    D     0.154782    Pr > D     >0.1500
Cramer-von Mises      W-Sq  0.053752    Pr > W-Sq  >0.2500
Anderson-Darling      A-Sq  0.341562    Pr > A-Sq  >0.2500

                             The UNIVARIATE Procedure
Variable:  Student
                               Tests for Normality

Test                  --Statistic---    -----p Value------

Shapiro-Wilk          W     0.937582    Pr < W      0.3882
Kolmogorov-Smirnov    D     0.154782    Pr > D     >0.1500
Cramer-von Mises      W-Sq  0.053752    Pr > W-Sq  >0.2500
Anderson-Darling      A-Sq  0.341562    Pr > A-Sq  >0.2500


For using period 1 (or period 2) residuals see, for instance,
Chen et al.
A note on ANOVA assumptions and robust analysis for a cross-over study
Stat. Med. 2002, 21, p1377-1386

BTW: Using your residuals (Studentized) I got your results.

Regards,

Detlew
Helmut
Vienna, Austria,
2008-12-16 05:02

@ d_labes
Posting: # 2927

 residuals of period 1!!

Dear D. Labes,

thanks a lot for forcing me to polish up my rusty and limited knowledge of STATISTICA's scripting language! ;-)

If I remember it correctly I struggled hours (days?) to reproduce Chow's & Liu's tables 8.2.1/8.2.3 (Clayton's famous data) from their first edition (1992). Although they mentioned period 1 throughout the text and in the table's headings, they used modeled period 2 values for subjects 10-18 (in sequence RT). After a lot of trial and error I gave up in order to continue with their examples.
This explains the wrong sign. :angry:

After correcting the calculation for sequence 1, everything is fine:
resid
W 0.9375813, p 0.3882104
stud
W 0.9375817, p 0.3882145


Other stuff in NCSS (I used values rounded to 6 significant digits):
resid
                    Test        Prob
Test Name           Value       Level
Shapiro-Wilk W      0.9375815   0.388212
Anderson-Darling    0.3454837   0.484160   
Martinez-Iglewicz   1.003992    >10%
Kolmogorov-Smirnov  0.1547813   >10%
stud
                    Test        Prob
Test Name           Value       Level
Shapiro-Wilk W      0.9375818   0.388217
Anderson-Darling    0.3454824   0.484163   
Martinez-Iglewicz   1.003992    >10%
Kolmogorov-Smirnov  0.1547828   >10%


❝ For using period 1 (or period 2) residuals see, for instance,
❝ Chen et al.
❝ A note on ANOVA assumptions and robust analysis for a cross-over study
❝ Stat. Med. 2002, 21, p1377-1386


Thanks, I didn't know that one.

❝ BTW: Using your residuals (Studentized) I got your results.


Sooner or later we will end up with true 'validation' of code - or our sloppy use of it... ;-)

d_labes
Berlin, Germany,
2008-12-16 09:34

@ Helmut
Posting: # 2929

 residuals of period 1!!

Dear Helmut,

❝ Sooner or later we will end up with true 'validation' of code - or our sloppy use of it... ;-)


To quote a valued member of this forum:
Never trust in any piece of software you haven't written yourself (and even then you should be cautious...) :-D.

Regards,

Detlew
yjlee168
Kaohsiung, Taiwan,
2008-12-17 19:28

@ d_labes
Posting: # 2934

 Shapiro-Wilk normality test

Dear D. Labes and dear Helmut,

I didn't know that D. Labes had presented his findings (SAS runs) here. What we got was exactly the same as his. Regarding the intra- and inter-subject CV, we had adopted the calculations from the Canadian guideline, which differ somewhat from those you suggested (Hauschke's paper and also his great textbook). We have decided to switch to your methods in the next version. Thank you, Helmut, for kindly testing bear. And thanks to "the Power to know", D. Labes, for helping us. :clap:

All the best,
-- Yung-jin Lee
The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz