d_labes
★★★

Berlin, Germany,
2010-09-10 14:05

Posting: # 5904

 tmax in case of ties: R vs. R vs. SAS [Nonparametrics]

Dear all!

Ok, this thread could also go into the category "R for BE/BA". But this here is Helmut's favorite ;-).

Despite the fact that non-parametrics will no longer be accepted by the EMA :not really:, the world is bigger than Europe and many guidelines still require a non-parametric analysis of tmax (some of them obviously copied from the old EMEA guidance).

It is well known1) that the analysis in case of the classical 2×2 cross-over is done by analysing the period differences (P1-P2) via a Wilcoxon rank sum test with sequence as the grouping factor.

Since in case of tmax there are usually some/many equal values ('ties' in non-parametric parlance), the software used should be able to handle them.
Exploring this within R using the tmax values from the single-dose data of bear, I came up with:
# Bear single dose data
data <- c("
 subject sequence      P2     P1
       1       TR      2.0    2.0
       2       RT      1.5    3.0
       3       TR      2.0    2.0
       4       RT      2.0    2.0
       5       TR      3.0    2.0
       6       RT      3.0    2.0
       7       TR      1.5    2.0
       8       RT      3.0    2.0
       9       TR      2.0    3.0
      10       RT      2.0    1.5
      11       TR      1.5    2.0
      12       RT      2.0    3.0
      13       TR      1.5    3.0
      14       RT      3.0    3.0")
tc <- textConnection(data)
PKt  <- read.table(tc, header=TRUE, strip.white=TRUE, as.is=TRUE)
close(tc)

PKt$pdiff <- PKt$P1 - PKt$P2

# if coin is already attached (e.g. from an earlier run) detach it first
if ("package:coin" %in% search()) detach("package:coin", unload=TRUE)

library(exactRankTests)

t1 <- wilcox.exact(pdiff ~ sequence, data=PKt, conf.int=TRUE, conf.level=0.9)
print(t1)

This tells me:
  Package 'exactRankTests' is no longer under development.
  Please consider using package 'coin' instead.


    Exact Wilcoxon rank sum test

data:  pdiff by sequence
W = 18, p-value = 0.4149
alternative hypothesis: true mu is not equal to 0
90 percent confidence interval:
 -1.5  0.5
sample estimates:
difference in location
                 -0.25

BTW: Because of lexical ordering, RT vs. TR is tested, and this estimates R-T. Thus we have to change the sign and exchange the upper and lower bounds to get them for T-R, in which we are interested.
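
A two-line sketch of that bookkeeping (a hypothetical helper; the numbers are those from the output above, still to be divided by 2 - see the note at the end of this post):
flip <- function(est, ci) list(est = -est, ci = rev(-ci))  # R-T -> T-R
flip(est = -0.25, ci = c(-1.5, 0.5))   # est +0.25, CI -0.5 ... +1.5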

Since I'm an obedient man (especially if my wife speaks to me :-D) I followed the polite suggestion I got after loading package 'exactRankTests' and tried:
library(coin)
PKt$sequence <- as.factor(PKt$sequence) # very important: wilcox_test() needs a factor
t2 <- wilcox_test(pdiff ~ sequence, data=PKt, conf.int=TRUE, conf.level=0.9,
                  distribution="exact")
print(t2)

and this told me:
    Exact Wilcoxon Mann-Whitney Rank Sum Test

data:  pdiff by sequence (RT, TR)
Z = -0.8465, p-value = 0.4149
alternative hypothesis: true mu is not equal to 0
90 percent confidence interval:
 -1.5  0.5
sample estimates:
difference in location
                  -0.5


Oops! Different point estimates :confused:.
In other examples (no room to show here) with tied data I have also observed different CIs!

Anybody out there who knows what's going on here?
I have searched the R support lists and found this observed by others. But the answers were insufficient at best, if not meaningless.
I have found a stat. course website which states "There is an R function wilcox.exact ... It does not do confidence intervals or point estimates correctly in the presence of ties." but without an explanation.

The new option HL (= Hodges-Lehmann estimator) in my "Power to knoff" SAS® 9.2 Proc NPAR1WAY (code upon request) gives me, among other output:

            The NPAR1WAY Procedure
...
        Average scores were used for ties.

           Wilcoxon Two-Sample Test

             Statistic (S)               59.0000

             Exact Test
             One-Sided Pr >=  S           0.2075
             Two-Sided Pr >= |S - Mean|   0.4149
...
             Hodges-Lehmann Estimation

             Location Shift    0.5000

                                                Interval        Asymptotic
Type                 90% Confidence Limits      Midpoint    Standard Error

Asymptotic (Moses)     -0.5000      1.5000        0.500            0.6080
Exact                  -0.5000      1.5000        0.500

but this estimates T-R because TR is tested versus RT here.


1) Hauschke D et al. A distribution-free procedure for the statistical analysis of bioequivalence studies. Int J Clin Pharmacol. 1990;28(2):72–78 / 30(Suppl 1):S37–43.

Sorry for the lengthy post, but I wanted to give the R users among us the possibility to verify my results by cut and paste.

Sorry²: I had totally forgotten that the estimate and the confidence interval calculated as given are for 2×(T-R). Thus the results must be divided by 2! But what I wanted to show remains.

Regards,

Detlew
Helmut
★★★
Vienna, Austria,
2010-09-10 20:28

@ d_labes
Posting: # 5905

 tmax in case of ties: StatXact, Phoenix, and...

Dear D. Labes!

❝ Ok, this thread could also go into the category "R for BE/BA". But this here is Helmut's favorite ;-).


Right. But if ties are considered, you are stirring up a hornets' nest!

❝ Since in case of tmax there are usually some/many equal values ('ties' in non-parametric parlance), the software used should be able to handle them.


Well, cough…

Summary of your results (T-R), signs changed where necessary and divided by 2:
program                   PE      90.26% CI
library(exactRankTests)  +0.125  -0.25 +0.75
library(coin)            +0.25   -0.25 +0.75
SAS (asymptotic)         +0.25   -0.25 +0.75
SAS (exact)              +0.25   -0.25 +0.75


My quick and dirty solution:
PKt$pdiff    <- (PKt$P1 - PKt$P2)/2
PKt$sequence <- relevel(PKt$sequence, ref="TR")
PKt$sequence
 [1] TR RT TR RT TR RT TR RT TR RT TR RT TR RT
Levels: TR RT
require(exactRankTests)
 t1 <- wilcox.exact(pdiff ~ sequence, data=PKt, conf.int=TRUE, conf.level=0.9)
print(t1)

        Exact Wilcoxon rank sum test

data:  pdiff by sequence
W = 31, p-value = 0.4149
alternative hypothesis: true mu is not equal to 0
90 percent confidence interval:
 -0.50  0.75
sample estimates:
difference in location
                 0.125

require(coin)
t2 <- wilcox_test(pdiff ~ sequence, data=PKt, conf.int=TRUE, conf.level=0.9,
                  distribution="exact")
print(t2)

        Exact Wilcoxon Mann-Whitney Rank Sum Test

data:  pdiff by sequence (TR, RT)
Z = 0.8465, p-value = 0.4149
alternative hypothesis: true mu is not equal to 0
90 percent confidence interval:
 -0.25  0.75
sample estimates:
difference in location
                  0.25


❝ Any body out there who knows whats going on here?


Hhm; don't use library(exactRankTests)?

Hauschke et al. (1990) don't speak about ties; they refer to Hollander and Wolfe.1 In my edition the word 'tie' is not even mentioned in the subject index… With my mid-1990 implementation in Rocky Mountain BASIC (not corrected for ties) I get:
                          PE      90.26% CI
                         +0.25   -0.25 +0.75


Now for StatXact 6.1, using the coding according to Senn (2002), I get:
                          PE      90.26% CI
asymptotic               +0.50   -0.25 +1.00
exact                    +0.50   -0.25 +1.00

What the heck?

Phoenix 6.1 (asymptotic), according to Koch2 (no ties!):
                          PE      90.26% CI
                         +0.25   -0.25 +0.75


EquivTest/PK (asymptotic):
                          PE      90.26% CI
                         +0.25   -0.25 +0.75


Not a single word about ties in the manuals of Phoenix and EquivTest/PK.
StatXact states:

[…] zero differences are ignored. We assume initially that there are no ties in the observed values of the Di's. We will show subsequently how to adjust for the possibility of ties. […] By suitably adapting the reasoning in Lehmann (1975, pages 181 to 182) so as to accommodate ties one can show that the probability that the interval [λ*L, λ*U] excludes λ is at most α. This ensures that the Hodges-Lehmann confidence intervals preserve the desired coverage probability.

:confused:

In a more recent textbook3 the common procedure of using mid-ranks for ties is given. According to this reference StatXact adjusts for ties. But I'm not so sure anymore. Open questions:
  • Signs… (sigh!) I've checked the coding in all programs numerous times, but don't know what's going on here. Fishbrain. I guess the reason might be that you are comparing period differences rather than treatment differences. Since we have a balanced case here, it might explain the opposite signs.
  • In my understanding wilcox_test() does not correct for ties. I tried ties.method="mid-ranks" and ties.method="average-scores" and got:
    Warning:
    In independence_test.IndependenceProblem(object, teststat = "scalar",  :
      additional arguments ties.method will be ignored
    :-(
  • Apart from the signs, the numerical values are identical with the exception of StatXact's results and the PE in R's exactRankTests. We have four out of 14 cases with a zero difference. If I manually exclude these cases from the dataset I get exactly the same results as for the full dataset.
  • If I exclude the subjects with a zero difference (1, 3, 4, 14) I get StatXact's results in all (!) other programs (see the sketch below)… So what do we have here? One program (maybe!) adjusts for ties, but excludes zero differences.
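
A sketch of that check (subset the dataset and re-run the same calls as above):
PKt0 <- subset(PKt, pdiff != 0)   # drops subjects 1, 3, 4, 14
wilcox.exact(pdiff ~ sequence, data = PKt0, conf.int = TRUE, conf.level = 0.9)
wilcox_test(pdiff ~ sequence, data = PKt0, conf.int = TRUE, conf.level = 0.9,
            distribution = "exact")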

❝ Sorry for that lengthy post.


So am I. You gave me a nice excuse not to spend the afternoon struggling with my income tax return.


  1. Hollander M, Wolfe DA. Nonparametric Statistical Methods. New York: Wiley; 1973.
  2. Koch GG. The Use of Non-Parametric Methods in the Statistical Analysis of the Two-Period Change-Over Design. Biometrics. 1972;28:578–85. PMID 4556704.
  3. Sprent P, Smeeton NC. Applied Nonparametric Statistical Methods. 4th ed. Boca Raton: Chapman & Hall/CRC; 2007.

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
d_labes
★★★

Berlin, Germany,
2010-09-13 15:21

@ Helmut
Posting: # 5906

 Ties, no ties, ties, no ties ...

Dear Helmut!

Thanks for your very elaborate answer!

❝ Right. But if ties are considered, you are stirring up a hornets' nest!


I see :cool:.

❝ My quick and dirty solution:

PKt$pdiff    <- (PKt$P1 - PKt$P2)/2

PKt$sequence <- relevel(PKt$sequence, ref="TR")

...


Interesting: relevel(). I was not aware of that handy tool.
But the result is astonishing: a change in the order gives a different lower bound!

❝ Hhm; don't use library(exactRankTests)?


Seems a good recommendation. But it is frequently claimed that R users should use it in case of ties. Moreover, coin has a much bigger footprint.

❝ ... they refer to Hollander and Wolfe. In my edition the word 'tie' is not even mentioned in the subject index...


Very strange. Everyone writing about rank methods cites this.

❝ With my mid-1990 implementation in Rocky Mountain BASIC (not

❝ corrected for ties) I get:

   PE      90.26% CI

  0.25    -0.25 +0.75


Not 'correcting' for ties to obtain the CI is exactly what Charles Geyer recommends. With respect to the Wilcoxon signed-rank test he states:
"First, neither ties nor zeros should make any difference in calculating point estimators or confidence intervals.
  • The point estimate is the median of the Walsh averages ...
  • The end points of the confidence interval are k in from each end of the sorted Walsh averages ...
  • Ties and zeros affect only hypothesis tests.
This is a bit of programmer brain damage (PBD) in the implementation of the wilcox.test and wilcox.exact functions. They change the way they calculate point estimates and confidence intervals when there are ties or zeros. But they shouldn't."
And this may be transferable to the rank-sum test :ponder:.
If one follows his code in the confidence interval section for the Wilcoxon rank sum test, one gets your Rocky results.
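
For the record, a little sketch of that recipe for our two-sample case (my transcription of the idea, not Geyer's code): the point estimate is the median of all pairwise differences (the HL estimator), and the exact CI endpoints are counted in from each end of the sorted differences, with ties left alone:
hl.ci <- function(x, y, conf.level=0.90) {
  d <- sort(outer(x, y, "-"))   # all n1*n2 pairwise differences
  k <- max(1, qwilcox((1-conf.level)/2, length(x), length(y)))
  list(PE=median(d), CI=c(d[k], d[length(d)-k+1]))
}
# halved period differences (pdiff = P1-P2 as in the first post), TR first -> T-R:
with(PKt, hl.ci(pdiff[sequence=="TR"]/2, pdiff[sequence=="RT"]/2))
# PE 0.25, CI -0.25 ... +0.75 - your 'Rocky' results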

❝ Now for StatXact 6.1, using the coding according to Senn (2002) ...


What exactly did you do here? Senn's basic estimator? Then a test of sequence-group differences would give an estimate of the period effect. Or am I wrong? See the expected period differences below.

❝ ... Open questions:

❝ • Signs... (Sigh!) I've checked the coding in all programs numerous times,

  but don't know what's going on here. Fishbrain.


I have come across various conventions: sometimes the smaller group is used as the first, sometimes the coding order, sometimes the character order, sometimes the first occurring level.
My SAS® uses the rule: take the smaller group, and if both have the same n, use the first occurring level as group 1. Whether this makes any sense is left to you.

❝ I guess the reason might be that you are comparing period differences rather than treatment differences.


The reason for using the period differences rather than the treatment differences comes from the paper of Hauschke et al. It is based on the expected values
sequence   P1-P2
  TR       µ(T-R) + p1-p2
  RT      -µ(T-R) + p1-p2

Thus a test of the period differences by sequence group gets rid of the period effects: half the difference in location between the sequence groups estimates µ(T-R). This is not the case with the treatment differences.
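
A little simulation sketch of that cancellation (hypothetical numbers, purely illustrative):
set.seed(1)
n   <- 5000                 # subjects per sequence, large on purpose
muT <- 2.0; muR <- 1.5      # treatment means
p1  <- 0.3; p2  <- 0.0      # period effects
TR <- (muT + p1 + rnorm(n)) - (muR + p2 + rnorm(n))   # P1-P2 in sequence TR
RT <- (muR + p1 + rnorm(n)) - (muT + p2 + rnorm(n))   # P1-P2 in sequence RT
(mean(TR) - mean(RT))/2     # ~0.5 = µ(T-R); p1-p2 has cancelled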

❝ ... I tried ties.method="mid-ranks" and ties.method="average-scores" and got:

Warning: In independence_test.IndependenceProblem(object, teststat = "scalar",  :

  additional arguments ties.method will be ignored :-(


I also observed this. Although mentioned on the help page, it seems that ties.method is not available in the wilcox_test() function and the other location tests of package coin. ties.method is only accepted by the function normal_test(), which performs the "Exact Normal Quantile (van der Waerden) Test".

❝ If I exclude subjects with a zero difference ...


Called by Charles Geyer "The Zero fudge" :-D.

❝ ... You gave me a nice excuse not to spend the afternoon struggling with my income tax return.


If it is as horrible as here in Germanien, I did you the favor very willingly :wink:. But the pity is that the problem with the Finanzamt does not vanish by sitting it out.

Regards,

Detlew
martin
★★  

Austria,
2010-09-14 10:33

@ d_labes
Posting: # 5907

 simulation

dear HS and d_labes!

you may find the following simulation results interesting

rm(list=ls(all=TRUE))
library(coin)
library(exactRankTests)

set.seed(130910)
n <- 10
nsim <- 1E4
cover1 <- 0   # coverage counter for wilcox_test (coin)
cover2 <- 0   # coverage counter for wilcox.exact (exactRankTests)
grp <- as.factor(c(rep('A', n), rep('B', n)))
for(i in 1:nsim){
    # rounding to one decimal produces ties
    x <- c(round(rnorm(n, mean=0, sd=1),1), round(rnorm(n, mean=0, sd=1),1))
    res1 <- confint(wilcox_test(x~grp, distribution='exact', conf.int=TRUE,
    conf.level=0.95))$conf.int
    res2 <- wilcox.exact(x~grp, conf.int=TRUE, conf.level=0.95)$conf.int
      if(res1[1] < 0 & res1[2] > 0){cover1 <- cover1 + 1}   # strict: CI borders not counted
      if(res2[1] < 0 & res2[2] > 0){cover2 <- cover2 + 1}
  }
print(cover1/nsim)
[1] 0.937
print(cover2/nsim)
[1] 0.9505

prop.test(x=c(cover1, cover2), n=c(nsim, nsim))

        2-sample test for equality of proportions with continuity correction

data:  c(cover1, cover2) out of c(nsim, nsim)
X-squared = 16.9122, df = 1, p-value = 3.915e-05
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.01998361 -0.00701639
sample estimates:
prop 1 prop 2
0.9370 0.9505


best regards
martin
d_labes
★★★

Berlin, Germany,
2010-09-15 12:14

@ martin
Posting: # 5909

 simulants of the world unite

Dear Martin!

❝ you may find the following simulation results interesting

❝ ...

print(cover1/nsim)

[1] 0.937    # coverage of wilcox_test (coin)

print(cover2/nsim)

[1] 0.9505   # coverage of wilcox.exact (exactRankTests)

(comments by me)
Wow! Really interesting. And unbelievable :ponder:.
I was told that, owing to the discrete nature of the permutation distribution, the Wilcoxon rank sum test is always conservative to some degree (the achievable confidence with n1 = n2 = 10 is 0.956743).
And here it turns out that the implementation in package coin is distinctly anti-conservative! And the implementation in wilcox.exact does not show any sign of conservativeness.
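
For the record, that achievable confidence can be verified from the exact null distribution of the Mann-Whitney statistic (a one-liner sketch, analogous to the exact CI construction):
1-2*pwilcox(max(1,qwilcox(0.05/2,10,10))-1,10,10)  # should reproduce the 0.956743 above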

I have one small doubt. I'm not quite sure but should we not include the CI borders in counting the coverage due to the definition of the CI?
If we do this one gets (only 5000 sims, see BTW below):
[1] 0.9608   # coverage of wilcox_test (coin)
[1] 0.9608   # coverage of wilcox.exact (exactRankTests)

Moreover, there seems to be some interaction between the two packages. If I simulate without coin loaded, I get a coverage for wilcox.exact of 0.9688 (including the CI borders in counting coverage).

BTW: Warning to all! The code is a real stress-test for the CPU of your computer (at least for mine). Submit the code and go for a beer or two. Next morning the result is astonishing enough :-D.
That is why I could not answer until today.

Regards,

Detlew
martin
★★  

Austria,
2010-09-15 16:53

@ d_labes
Posting: # 5915

 simulants of the world unite

dear d_labes!

❝ I have one small doubt. I'm not quite sure but should we not include the CI borders in counting the coverage due to the definition of the CI?


thank you very much for pointing this out; IMHO you are right - this is crucial in the case of ties. here is a little bit faster version of the simulation code (ca. 30 min on my CPU)

date()
[1] "Wed Sep 15 14:16:44 2010"
library(coin)            # both packages loaded as before
library(exactRankTests)
set.seed(130910)
n <- 10
nsim <- 1E4
cover1 <- 0
cover2 <- 0
grp <- as.factor(c(rep('A', n), rep('B', n)))
for(i in 1:nsim){
    x <- round(rnorm(2*n, mean=0, sd=1),1)
    res1 <- confint(wilcox_test(x~grp, distribution='exact', conf.int=TRUE,
    conf.level=0.95))$conf.int
    res2 <- wilcox.exact(x~grp, conf.int=TRUE, conf.level=0.95)$conf.int
      if(res1[1] <= 0 & res1[2] >= 0){cover1 <- cover1 + 1}  # CI borders now included
      if(res2[1] <= 0 & res2[2] >= 0){cover2 <- cover2 + 1}
  }
print(cover1/nsim)
[1] 0.9609
print(cover2/nsim)
[1] 0.9682
date()
[1] "Wed Sep 15 14:46:26 2010"


and here I am, for all my lore, the wretched fool I was before (Goethe)

martin
Helmut
★★★
Vienna, Austria,
2010-09-15 18:17

@ martin
Posting: # 5916

 simulants of the world unite

Dear Martin!

❝ here is a little bit faster version of the simulation code (ca. 30 min on my CPU)


Faster machine than mine (double Xeon 2.8GHz, 4GB RAM, 32bit XP Pro, hyperthreading on)...

I would use conf.level=0.90. At the start of the code
ptm <- proc.time()
and after the output
runtime <- proc.time() - ptm
cat("Runtime",runtime[1]/60,"minutes\n")

I got
print(cover1/nsim)
[1] 0.9288
print(cover2/nsim)
[1] 0.9359
Runtime 32.90417 minutes


❝ and here I am, for all my lore, the wretched fool I was before (Goethe)


Yep. For n1=n2=10 I would expect 0.9108...

d_labes
★★★

Berlin, Germany,
2010-09-16 13:19

@ Helmut
Posting: # 5917

 simul ants with no ties

Dear Helmut!

❝ Yep. For n1=n2=10 I would expect 0.9108...


Yeah, but ... without ties.
10 000 sims each with round(,5) in the code and only wilcox.exact():
run 1  0.9096
run 2  0.9076
run 3  0.9125
run 4  0.9104
run 5  0.9138
-------------
mean   0.91078

Each run (10 000 sims) only approx. 100 s!

The mean seems very close to your expectation :cool:.
---[nitpicking]---------------------------------
> 1-2*pwilcox(max(1,qwilcox(0.1/2,10,10))-1,10,10)
[1] 0.9107904
---[/nitpicking]--------------------------------


What I'm very curious about is the scatter between the runs. It seems that one has to simulate at least 100 000 datasets to get a reliable coverage.
Too much beer in the case of coin::wilcox_test() :-D.
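
A quick plausibility check of that scatter (a sketch; the binomial standard error of a simulated coverage):
p <- 0.9108; nsim <- 1e4
sqrt(p*(1 - p)/nsim)   # ~0.0029, so runs scatter by roughly +/-0.006 (2 SE)
# quite consistent with the spread 0.9076 ... 0.9138 seen above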

BTW: even with rounding to five decimals I observed up to 10 datasets with ties among the 10 000 sims.

Regards,

Detlew
Helmut
★★★
Vienna, Austria,
2010-09-16 16:01

@ d_labes
Posting: # 5918

 simul ants with no ties

Dear D. Labes!

❝ Yeah, but ... without ties.


Right...

❝ Each run (10 000 sims) only approx. 100 s!


Everybody seems to have a faster machine than I do. Took mine ~115". Should ask my boss for a new one.

❝ What I'm very curious about is the scatter between the runs. It seems that one has to simulate at least 100 000 datasets to get a reliable coverage.


Interesting.

martin
★★  

Austria,
2010-09-16 17:35

@ Helmut
Posting: # 5919

 two different approaches

dear d_labes and HS!

I included two additional bootstrap-based approaches implemented in the R package pairwiseCI, where investigating the empirical coverage of bootstrap-based CIs is a real stress test (7 hours):

library(coin)
library(exactRankTests)
library(pairwiseCI)   # provides Median.diff() and HD.diff()
# timeForecast() below is a small helper (not shown) printing a progress forecast

set.seed(150910)
n <- 10
nsim <- 1E4
conf.level <- 0.9
cover <- rep(0,4)
grp <- as.factor(c(rep('A', n), rep('B', n)))
for(i in 1:nsim){
    timeForecast(i, nsim)
    x <- round(rnorm(2*n, mean=0, sd=1),1)
    res1 <- confint(wilcox_test(x~grp, distribution='exact',
            conf.int=TRUE, conf.level=conf.level))$conf.int
    res2 <- wilcox.exact(x~grp, conf.int=TRUE,
            conf.level=conf.level)$conf.int
    res3 <- Median.diff(x=x[1:n], y=x[(n+1):(2*n)],
            conf.level=conf.level)$conf.int
    res4 <- HD.diff(x=x[1:n], y=x[(n+1):(2*n)],
            conf.level=conf.level)$conf.int
    if(res1[1] <= 0 & res1[2] >= 0){cover[1] <- cover[1] + 1}
    if(res2[1] <= 0 & res2[2] >= 0){cover[2] <- cover[2] + 1}
    if(res3[1] <= 0 & res3[2] >= 0){cover[3] <- cover[3] + 1}
    if(res4[1] <= 0 & res4[2] >= 0){cover[4] <- cover[4] + 1}
  }
-> estimated simulation time <h:mm:ss> (done / all): ... / 7:02:49
print(cover/nsim)
[1] 0.9283 0.9372 0.9319 0.9053


the empirical coverage of 0.9053 for the difference of Harrell-Davis estimates for location looks nice.

best regards

martin

PS: you can interpret the empirical coverage as a proportion based on a sample size of nsim, which enables calculating a CI to assess the simulation accuracy

binom.test(x=0.9053*nsim, n=nsim, p=0.9)

        Exact binomial test

data:  0.9053 * nsim and nsim
number of successes = 9053, number of trials = 10000, p-value = 0.07729
alternative hypothesis: true probability of success is not equal to 0.9
95 percent confidence interval:
 0.8993926 0.9109701
sample estimates:
probability of success
                0.9053
martin
★★  

Austria,
2010-09-16 20:04

@ d_labes
Posting: # 5920

 evaluation of tmax: use of relative effects?

dear d_labes and HS!

I spent some time thinking about how to compare tmax, and I have some doubts that using the difference between location estimates (e.g. medians, Hodges-Lehmann or Harrell-Davis estimates) is appropriate, as the effect size in tmax between test and reference depends on the sampling times used.

For this reason I would suggest using relative effects (e.g. implemented in R package nparcomp).

Definition (taken from this presentation http://www.biopharmnet.com/doc/2010_05_20_webinar.pdf ):
"The relative effect is a a measure of how often a random subject receiving treatment X will outperform a random subject receiving treatment Y. It can be interpreted as a probability that a randomly selected patient in the control reveals a smaller response value than a randomly selected patient in the treatment group"

naive application to d_labes' dataset:

npar.t.test(pdiff~sequence, conflevel = 0.1, data=PKt)

Nonparametric Behrens-Fisher Problem
NOTE:
*-----------Analysis of relative effects-----------*
- Confidence interval for relative effect p(i,j)
     with confidence level 0.9
- Method =  Delta-Method (Logit)
- p.perm = p-value of the Neubert-Brunner permutation test
- p-Values for  H_0: p(i,j)=1/2
 
*----------------Interpretation--------------------*
p(a,b) > 1/2 : b tends to be larger than a
*-------------------Wilcox.Test--------------------*
- Asymptotic Wilcoxon Test
- In this setup you can only test H_0:F_i = F_j
*--------------------------------------------------*
$Data.Info
  Sample Size
1     RT    7
2     TR    7

$Analysis.of.relative.effects
  Comparison rel.effect confidence.interval   t.value p.value p.perm
1   p(RT,TR)      0.633   [ 0.354 ; 0.844 ] 0.7804535   0.435   0.41

$Wilcoxon.Test
  Comparison rel.effect   p.value
1   p(RT,TR)      0.633 0.4345742


what do you think about using relative effects for comparing tmax values between test and reference?

best regards

martin
Helmut
★★★
Vienna, Austria,
2010-09-16 21:52

@ martin
Posting: # 5921

 Not positive about that

Dear Martin!

❝ I spent some time thinking about how to compare tmax, and I have some doubts that using the difference between location estimates (e.g. medians, Hodges-Lehmann or Harrell-Davis estimates) is appropriate, as the effect size in tmax between test and reference depends on the sampling times used.


Well, you know better than I do that we are comparing not medians, but the median of the pairwise differences - that's the HL estimator. Of course the comparison depends on the sampling schedule. But keep in mind that tmax (and <nitpicking>Cmax</nitpicking> as well) are surrogates of the rate of absorption ka - which we cannot access by NCA. For an example see this post.

❝ [...] I would suggest using relative effects...

❝ Definition (taken from this presentation http://www.biopharmnet.com/doc/2010_05_20_webinar.pdf )...:


Heavy stuff to swallow for an amateur. :cool:

❝ "The relative effect is a a measure of how often a random subject receiving treatment X will outperform a random subject receiving treatment Y. It can be interpreted as a probability that a randomly selected patient in the control reveals a smaller response value than a randomly selected patient in the treatment group"


I'm not happy with this definition.

$Analysis.of.relative.effects

  Comparison rel.effect confidence.interval   t.value p.value p.perm

1   p(RT,TR)      0.633   [ 0.354 ; 0.844 ] 0.7804535   0.435   0.41


$Wilcoxon.Test

  Comparison rel.effect   p.value

1   p(RT,TR)      0.633 0.4345742


❝ what do you think about using relative effects for comparing tmax values between test and reference?


I'm not familiar with the coding. The problem IMHO comes from the results being given as ratios. Commonly a linear model is postulated for tmax (besides PK theory, think about the way we speak: 'The maximum concentration of treatment A was observed one hour earlier than that of treatment B' – not 'The time point of the maximum concentration of treatment A was 63.3% of treatment B's.'). Now if we get a percentage, to which location parameter should we apply it? To the median of the reference – or what?

Interesting to read the presentation, page 90:

The choice of the difference-based relevance threshold delta is nasty, use ratio-to-control instead.

Nasty or not, I don't think that regulators are happy with a multiplicative model. Neither am I.

martin
★★  

Austria,
2010-09-16 23:00

@ Helmut
Posting: # 5922

 Not positive about that

Dear HS!

thank you for your point of view, I just have a minor comment regarding the interpretation of relative effects. It's the probability that a randomly selected subject with R has a larger tmax value than a randomly selected subject with T (no effect would be indicated by a relative effect of 0.5).

for the given dataset, the probability of a randomly selected subject with R having a larger tmax value than a randomly selected subject with T is 63.3% (90% CI: 35.4% to 84.4%).

the other way round, the probability of a randomly selected subject with T having a smaller tmax value than a randomly selected subject with R is 63.3% (90% CI: 35.4% to 84.4%).
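
for transparency, the reported 0.633 can be reproduced by hand (a sketch, using the usual mid-rank convention for ties, and assuming PKt from the first post is still in the workspace; rt and tr are the period differences by sequence):
rt <- PKt$pdiff[PKt$sequence == "RT"]
tr <- PKt$pdiff[PKt$sequence == "TR"]
# P(RT < TR) + 0.5*P(RT = TR), averaged over all pairs
mean(outer(rt, tr, "<") + 0.5*outer(rt, tr, "=="))   # 0.6327, i.e. the 0.633 above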

best regards

martin

PS: this is a naive application to the dataset - some adjustments may be necessary to account for the cross-over design
Helmut
★★★
Vienna, Austria,
2010-09-17 15:10

@ martin
Posting: # 5923

 Stupidity

Dear Martin!

❝ [...] It's a probability that a randomly selected subject with R has a larger tmax value than a randomly selected subject with T (no effect would be indicated by a relative effect of 0.5).


Oh, I see! I mixed up probability with the size of the effect. But regulators are only interested in the latter!

Jack
☆    
Lancaster, United Kingdom,
2010-09-20 16:03

@ d_labes
Posting: # 5924

 tmax in case of ties: R vs. R vs. SAS

Dear d_labes,

here is my attempt to shed some light on the difference in the point estimates.

The coin package uses the midpoint of the confidence interval as the point estimate for the location difference (i.e. (0.5 - 1.5)/2 = -0.5), while the exactRankTests package uses (sort of) the median of the observed differences as the point estimate. Note that the exact procedure of exactRankTests is quite situational, which is why the difference in point estimates only occurs sometimes.
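
In R terms (a quick check against the numbers from the first post, nothing more):
ci <- c(-1.5, 0.5)   # coin's exact 90% CI for pdiff
mean(ci)             # -0.5, coin's point estimate = the CI midpoint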

The real issue from my point of view is that it is hard (impossible?) to decide which is the "correct" estimate. Intuition tells me that the exactRankTests version is the one that is closer to the spirit of the Wilcoxon test, while practicality certainly speaks for using the midpoint of the CI.

As for the differences in confidence intervals, I presume that it has to do with either the approximation which is used if n>50 and/or the continuity correction.
Helmut
★★★
Vienna, Austria,
2010-09-20 16:20

@ Jack
Posting: # 5925

 R packages

Hi Jack!

❝ The coin package uses the midpoint of the confidence interval as the point estimate for the location difference (i.e. (0.5 - 1.5)/2 = -0.5), while the exactRankTests package uses (sort of) the median of the observed differences as the point estimate. Note that the exact procedure of exactRankTests is quite situational, which is why the difference in point estimates only occurs sometimes.


Interesting!

❝ The real issue from my point of view is that it is hard (impossible?) to decide which is the "correct" estimate. Intuition tells me that the exactRankTests version is the one that is closer to the spirit of the Wilcoxon test, while practicality certainly speaks for using the midpoint of the CI.


Hhm; in those ol' days I plotted the sampling distribution – and it can be quite asymmetrical. I'm not so sure whether the midpoint of the CI makes sense where the location shift is concerned.

❝ As for the differences in confidence intervals, I presume that it has to do with either the approximation which is used if n>50 and/or the continuity correction.


I'm wondering whether there's a cut-off implemented in both packages switching from the exact method (based on permutation) to an approximation due to performance reasons? I know that it's possible to dump the code from the library, but my knowledge of R is too limited (even to use the right keywords for a search).

d_labes
★★★

Berlin, Germany,
2010-09-27 13:27

@ Helmut
Posting: # 5938

 R packages code

Dear Helmut!

❝ I'm wondering whether there's a cut-off implemented in both packages switching from the exact method (based on permutation) to an approximation due to performance reasons? I know that it's possible to dump the code from the library, but my knowledge of R is too limited (even to use the right keywords for a search).


It is described in the help pages for wilcox.exact.
Use getS3method("wilcox.exact","default") to see the code.

For coin it is more difficult because of the S4 implementation, and it requires more puzzling detective work. In particular, the theoretical framework must be known in detail to follow the pieces of the implementation. Out of scope for a run-of-the-mill applied statistician like me.
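
For fellow amateurs, the introspection calls in one place (standard R tooling, nothing package-specific; a sketch):
library(exactRankTests)
getS3method("wilcox.exact", "default")     # S3: shows the exact/approximate switch
library(coin)
showMethods(where = asNamespace("coin"))   # lists the S4 methods coin defines
getAnywhere("wilcox_test")                 # fetches code, exported or not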

Regards,

Detlew
Jack
☆    
Lancaster, United Kingdom,
2010-09-27 18:14

@ Helmut
Posting: # 5940

 R packages

Unfortunately I have not found any documentation, but I got interested in what was going on and had a look at the source code. Needless to say, the code is quite involved...
d_labes
★★★

Berlin, Germany,
2010-09-27 11:53

@ Jack
Posting: # 5937

 Is Exact exact?

Dear Jack,

I was on holiday, thus I can answer only now.
Thanks for sharing your view.

❝ The coin package uses the midpoint of the confidence interval as the point estimate for the location difference (i.e. (0.5 - 1.5)/2 = -0.5), while the exactRankTests package uses (sort of) the median of the observed differences as the point estimate.


Interesting, as Helmut already said. Is it documented elsewhere?

Interestingly enough, SAS Proc NPAR1WAY gives me both point estimates, the Hodges-Lehmann median of the pairwise differences and the CI midpoint. The latter is attributed to Lehmann:

Lehmann EL. Nonparametric Confidence Intervals for a Shift Parameter. Ann Math Stat. 1963;34:1507–12.

As noticed in the example above, that "(sort of)" median is different from the HL estimate in case of ties.

❝ As for the differences in confidence intervals I presume that it has to do with either the approximation which is uses if n>50 and/or the continuity correction.


No. The example I have shown has n<50, and in that case the calculations are done "exact" - whatever that "exact" means.

Here is another peculiarity of the coin package:
library(coin)

set.seed(130910)
n1  <- 10; n2 <-10
mu1 <- 0; mu2 <-0
grp <- as.factor(c(rep('A', n1), rep('B', n2)))
ptm <- proc.time()
nsim <- 500
equalCI <- 0
for(i in 1:nsim){
  x <- c(round(rnorm(n1, mean=mu1, sd=1),1), round(rnorm(n2, mean=mu2, sd=1),1))
  res1 <- confint(wilcox_test(x~grp,
          distribution=exact(algorithm = c("shift")),
          conf.int=TRUE, conf.level=0.90))$conf.int
  res1a <- confint(wilcox_test(x~grp,
          distribution=exact(algorithm = c("split-up")),
          conf.int=TRUE, conf.level=0.90))$conf.int
  if (isTRUE(all.equal(res1, res1a))) {equalCI <- equalCI+1}  # count identical CIs
  if (i %% 1000 == 0) cat("\n")
  if (i %% 50 == 0 & i < nsim) cat(i,"") # give a sign of life
}
cat("\n\n")
cat("shift vs. split-up: equal CIs",equalCI/nsim)
cat("\nsims time\n")
print(proc.time()-ptm)


This gives:
shift vs. split-up: equal CIs 0.73
sims time
   user  system elapsed
 182.67    0.23  184.70


Uuuups! :confused:

Regards,

Detlew