d_labes
★★★

Berlin, Germany,
2010-09-10 10:05
(3930 d 17:07 ago)

(edited by d_labes on 2010-09-10 11:06)
Posting: # 5904
Views: 20,692

## tmax in case of ties: R vs. R vs. SAS [Nonparametrics]

Dear all!

Ok, this thread could also go into the category "R for BE/BA". But this here is Helmut's favorite .

Despite the fact that non-parametrics will no longer be accepted by the EMA the world is bigger than Europe and many guidelines still require a non-parametric analysis of tmax (some of them obviously copied from the old EMEA guidance).

It is well known1), that the analysis in case of the classical 2x2 cross-over is done by analysing the period differences (P1-P2) via a Wilcoxon rank sum test with the sequence as grouping factor.

Since in case of tmax there are usually some/many equal values (ties in the non-parametric methods speak) the used software should be able to handle such.
In exploring this within R using the tmax values from the single dose data of Bear I came up with:
# Bear single dose data
data <- c("
subject sequence      P2     P1
1       TR      2.0    2.0
2       RT      1.5    3.0
3       TR      2.0    2.0
4       RT      2.0    2.0
5       TR      3.0    2.0
6       RT      3.0    2.0
7       TR      1.5    2.0
8       RT      3.0    2.0
9       TR      2.0    3.0
10       RT      2.0    1.5
11       TR      1.5    2.0
12       RT      2.0    3.0
13       TR      1.5    3.0
14       RT      3.0    3.0")
tc <- textConnection(data)
close(tc)

PKt$pdiff <- PKt$P1 - PKt$P2 # to run it again we must remove coin if ("package:coin" %in% search()) detach("package:coin",unload=TRUE) library(exactRankTests) t1 <- wilcox.exact(pdiff ~ sequence, data=PKt, conf.int=TRUE, conf.level=0.9) print(t1) This tolds me: Package 'exactRankTests' is no longer under development. Please consider using package 'coin' instead. Exact Wilcoxon rank sum test data: pdiff by sequence W = 18, p-value = 0.4149 alternative hypothesis: true mu is not equal to 0 90 percent confidence interval: -1.5 0.5 sample estimates: difference in location -0.25 BTW: Because of lexical ordering RT vs. TR is tested and this estimates R-T. Thus we have to change the sign and exchange upper and lower bound to get them for T-R in which we are interested. Since I'm an obedient man (especially if my wife speaks to me ) I followed the polite suggestion got after loading package 'exactRankTests' and tried: library(coin) PKt$sequence <- as.factor(PKt$sequence) #very important t2 <- wilcox_test(pdiff ~ sequence, data=PKt, conf.int=TRUE, conf.level=0.9, distribution="exact") print(t2) and this told me: Exact Wilcoxon Mann-Whitney Rank Sum Test data: pdiff by sequence (RT, TR) Z = -0.8465, p-value = 0.4149 alternative hypothesis: true mu is not equal to 0 90 percent confidence interval: -1.5 0.5 sample estimates: difference in location -0.5 Upps! Different point estimates . In other examples (no room to show here) with tied data I have also observed different CIs! Any body out there who knows whats going on here? I have searched the R support lists and found that observed by others. But the answers where at least not sufficient if not meaningless. I have found a stat. course website which states "There is an R function wilcox.exact ... It does not do confidence intervals or point estimates correctly in the presence of ties." but without an explanation. The new option HL (=Hodges Lehmann estimator) in my "Power to knoff" SAS® 9.2 Proc NPAR1way (code upon request) gives me between other stuff: The NPAR1WAY Procedure ... Average scores were used for ties. Wilcoxon Two-Sample Test Statistic (S) 59.0000 Exact Test One-Sided Pr >= S 0.2075 Two-Sided Pr >= |S - Mean| 0.4149 ... Hodges-Lehmann Estimation Location Shift 0.5000 Interval Asymptotic Type 90% Confidence Limits Midpoint Standard Error Asymptotic (Moses) -0.5000 1.5000 0.500 0.6080 Exact -0.5000 1.5000 0.500 but this estimates T-R because TR is tested versus RT here. 1)Hauschke et.al. "A distribution free procedure for the statistical analysis of bioequivalence studies" Int. J. Clin. Pharmacol. Vol. 28 (2) 1990, 72-78/Vol. 30, Suppl. 1, S37-43 Sorry for that lengthy post. But I intended to give the Rusers among us the possibility to verify my results with cut and paste. Sorry2: I have totally forgotten that the estimate and the confidence interval calculated as given are for 2*(T-R). Thus the results must be divided by 2! But what I wanted to show remains. Regards, Detlew Helmut ★★★ Vienna, Austria, 2010-09-10 16:28 (3930 d 10:44 ago) @ d_labes Posting: # 5905 Views: 18,040 ## tmax in case of ties: StatXact, Phoenix, and... Dear D. Labes! » Ok, this thread could also go into the category "R for BE/BA". But this here is Helmut's favorite . Right. But if ties are considered, you are stirring up a hornets' nest! » Since in case of tmax there are usually some/many equal values (ties in the non-parametric methods speak) the used software should be able to handle such. Well, cough… Summary of your results (T-R) - changed signs if necessary and divided by 2: program PE 90.26% CI library(exactRankTests) +0.125 -0.25 +0.75 library(coin) +0.25 -0.25 +0.75 SAS (asymptotic) +0.25 -0.25 +0.75 SAS (exact) +0.25 -0.25 +0.75 My quick and dirty solution: PKt$pdiff    <- (PKt$P1 - PKt$P2)/2
PKt$sequence <- relevel(PKt$sequence, ref="TR")
PKt$sequence [1] TR RT TR RT TR RT TR RT TR RT TR RT TR RT Levels: TR RT require(exactRankTests) t1 <- wilcox.exact(pdiff ~ sequence, data=PKt, conf.int=TRUE, conf.level=0.9) print(t1) Exact Wilcoxon rank sum test data: pdiff by sequence W = 31, p-value = 0.4149 alternative hypothesis: true mu is not equal to 0 90 percent confidence interval: -0.50 0.75 sample estimates: difference in location 0.125 require(coin) t2 <- wilcox_test(pdiff ~ sequence, data=PKt, conf.int=TRUE, conf.level=0.9, distribution="exact") print(t2) Exact Wilcoxon Mann-Whitney Rank Sum Test data: pdiff by sequence (TR, RT) Z = 0.8465, p-value = 0.4149 alternative hypothesis: true mu is not equal to 0 90 percent confidence interval: -0.25 0.75 sample estimates: difference in location 0.25 » Any body out there who knows whats going on here? Hhm; don't use library(exactRankTests)? Hauschke et al. (1990) don't speak about ties; they refer to Hollander and Wolfe.1 In my edition the word 'tie' is not even mentioned in the subject index… With my mid-1990 implementation in Rocky Mountain BASIC (not corrected for ties) I get: PE 90.26% CI +0.25 -0.25 +0.75 Now for StatXact 6.1. using the coding according to Senn (2002) I get: PE 90.26% CI asymptotic +0.50 -0.25 +1.00 exact +0.50 -0.25 +1.00 What the hell? Phoenix 6.1 (asymptotic), according to Koch2 (no ties!): PE 90.26% CI +0.25 -0.25 +0.75 EquivTest/PK (asymptotic): PE 90.26% CI +0.25 -0.25 +0.75 Not a single word about ties in the manuals of Phoenix and EquivTest/PK. StatXact states: […] zero differences are ignored. We assume initially that there are no ties in the observed values of the Di's. We will show subsequently how to adjust for the possibility of ties. […] By suitably adapting the reasoning in Lehmann (1975, pages 181 to 182) so as to accommodate ties one can show that the probability that the interval [λ*, λ*] excludes λ is at most α. This ensures that the Hodges-Lehmann confidence intervals preserve the desired coverage probability. In a more recent textbook3 the common procedure of using mid-ranks for ties is given. According to this reference StatXact adjusts for ties. But I'm not so sure anymore. Open questions: • Signs… (sigh!) I've checked the coding in all programs numerous times, but don't know what's going on here. Fishbrain. I guess the reason might be that you are comparing period differences rather than treatment differences. Since we have a balanced case here, it might explain the opposite signs. • In my understanding wilcox_test() does not correct for ties. I tried ties.method="mid-ranks" and ties.method="average-scores" and got: Warning: In independence_test.IndependenceProblem(object, teststat = "scalar", : additional arguments ties.method will be ignored • Apart from the signs, the numerical values are identical with the exception of StatXact's results and the PE in R's exactRankTests. We have four out of 14 cases with a zero difference. If I manually exclude these cases from the dataset I get exactly the same results as for the full dataset. • If I exclude subjects with a zero difference (1, 3, 104, 14) I get StatXact's results in all (!) other programs… So what do we have here? One program (maybe!) adjusts for ties, but excludes zero differences. » Sorry for that lengthy post. So am I. You gave me a nice excuse not to spend the afternoon struggling with my income tax return. 1. Holander M, Wolfe DA. Nonparametric Statistical Methods. New York: Wiley; 1973. 2. Koch GG. The Use of Non-Parametric Methods in the Statistical Analysis of the Two-Period Change-Over Design. Biometrics. 1972:28;578–85. PMID 4556704. 3. Sprent P, Smeeton NC. Applied Nonparametric Statistical Methods. Boca Raton: Chapman & Hall/CRC; 4th ed. 2007. Dif-tor heh smusma 🖖 Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes d_labes ★★★ Berlin, Germany, 2010-09-13 11:21 (3927 d 15:50 ago) @ Helmut Posting: # 5906 Views: 17,877 ## Ties, no ties, ties, no ties ... Dear Helmut! Thanks for your very elaborate answer! » Right. But if ties are considered, you are stirring up a hornets' nest! I see . » My quick and dirty solution: » PKt$pdiff    <- (PKt$P1 - PKt$P2)/2
» PKt$sequence <- relevel(PKt$sequence, ref="TR")
» ...

Interesting: "relevel". Was not aware of that handy tool.
But astonishing the result: A change in order gives a different lower bound!

» Hhm; don't use library(exactRankTests)?

Seems a good recommendation. But it is claimed frequently that Rusers should use it in case of ties. Moreover coin has a much bigger footprint.

» ... they refer to Hollander and Wolfe. In my edition the word 'tie' is not even mentioned in the subject index...

Very strange. Everyone writing about rank methods is citing this.

» With my mid-1990 implementation in Rocky Mountain BASIC (not
» corrected for ties) I get:
»    PE      90.26% CI
»   0.25    -0.25 +0.75

Not 'correcting' for ties to obtain the CI is exactly what Charles Geyer recommends. With respect to the Wilcoxon sign rank test he states:
"First, neither ties nor zeros should make any difference in calculating point estimators or confidence intervals.
• The point estimate is the median of the Walsh averages ...
• The end points of the confidence interval are k in from each end of the sorted Walsh averages ...
• Ties and zeros affect only hypothesis tests.
This is a bit of programmer brain damage (PBD) in the implementation of the wilcox.test and wilcox.exact functions. They change the way they calculate point estimates and confidence intervals when there are ties or zeros. But they shouldn't"
.
And this may be transferable to the rank-sum test .
If one follows his code in the interval section of the Wilcoxon rank sum test one gets your Rocky results.

» Now for StatXact 6.1. using the coding according to Senn (2002) ...

What exactly did you here? Senn's Basic estimator? Then a test of sequence group differences would give an estimate of the period effect. Or am I wrong? See the expected period differences below.

» ... Open questions:
» • Signs... (Sigh!) I've checked the coding in all programs numerous times,
»   but don't know what's going on here. Fishbrain.

I came across with: Sometimes the smaller group is used as the first, sometimes the coding order, sometimes the character order, sometimes the first occurred level.
My uses the rule: smaller group and if both have the same n use the first occurred level as group 1. If this makes any sense is left to you.

» I guess the reason might be that you are comparing period differences rather than treatment differences.

The reason for using the period differences rather then the treatment diffs comes from the paper of Hauschke. It is based on the expected values
sequence   P1-P2
TR       µ(T-R) + p1-p2
RT      -µ(T-R) + p1-p2

Thus a test of the differences by sequence group gets rid of the period effects. This is not the case with the treatment diffs.

» ... I tried ties.method="mid-ranks" and ties.method="average-scores" and got:
» Warning: In independence_test.IndependenceProblem(object, teststat = "scalar",  :
»   additional arguments ties.method will be ignored

I also observed this. Although mentioned on the help page it seems that the ties.method is not available in the wilcoxon_test function and other location tests of package coin. ties.method is only accepted by the function normal_test() which performs the "Exact Normal Quantile (van der Waerden) Test".

» If I exclude subjects with a zero difference ...

Called by Charles Geyer "The Zero fudge" .

» ... You gave me a nice excuse not to spend the afternoon struggling with my income tax return.

If it is that horrible as here in Germanien I did you the favor very willingly . But the pity is that the problem with the Finanzamt does not vanish by sitting it out.

Regards,

Detlew
martin
★★

Austria,
2010-09-14 06:33
(3926 d 20:39 ago)

@ d_labes
Posting: # 5907
Views: 17,687

## simulation

dear HS and d_labes!

you may find the following simulation results interesting

rm(list=ls(all=TRUE))
library(coin)
library(exactRankTests)

set.seed(130910)
n <- 10
nsim <- 1E4
cover1 <- 0
cover2 <- 0
grp <- as.factor(c(rep('A', n), rep('B', n)))
for(i in 1:nsim){
x <- c(round(rnorm(n, mean=0, sd=1),1), round(rnorm(n, mean=0, sd=1),1))
res1 <- confint(wilcox_test(x~grp, distribution='exact', conf.int=TRUE,
conf.level=0.95))$conf.int res2 <- wilcox.exact(x~grp, conf.int=TRUE, conf.level=0.95)$conf.int
if(res1[1] < 0 & res1[2] > 0){cover1 <- cover1 + 1}
if(res2[1] < 0 & res2[2] > 0){cover2 <- cover2 + 1}
}
print(cover1/nsim)
[1] 0.937
print(cover2/nsim)
[1] 0.9505

prop.test(x=c(cover1, cover2), n=c(nsim, nsim))

2-sample test for equality of proportions with continuity correction

data:  c(cover1, cover2) out of c(nsim, nsim)
X-squared = 16.9122, df = 1, p-value = 3.915e-05
alternative hypothesis: two.sided
95 percent confidence interval:
-0.01998361 -0.00701639
sample estimates:
prop 1 prop 2
0.9370 0.9505

best regards
martin
d_labes
★★★

Berlin, Germany,
2010-09-15 08:14
(3925 d 18:57 ago)

@ martin
Posting: # 5909
Views: 17,625

## simulants of the world unite

Dear Martin!

» you may find the following simulation results interesting
» ...
» print(cover1/nsim)
» [1] 0.937    # coverage of wilcox_test (coin)
» print(cover2/nsim)
» [1] 0.9505   # coverage of wilcox.exact (exactRankTests)
Wow! Really interesting. And unbelievable .
I was told that the Wilcoxon rank sum test is due to the discrete nature of the permutation distribution function always to some degree conservative (achievable confidence with n1, n2=10 is 0.956743).
And here it comes out that the implementation in the package coin is distinct anti-conservative! And the implementation in wilcox.exact does not show signs of conservativeness.

I have one small doubt. I'm not quite sure but should we not include the CI borders in counting the coverage due to the definition of the CI?
If we do this one gets (only 5000 sims, see BTW below):
[1] 0.9608   # coverage of wilcox_test (coin)
[1] 0.9608   # coverage of wilcox.exact (exactRankTests)

Moreover there seems to be some interaction between both packages. If I simulate without coin I get coverage of wilcox.exact = 0.9688 (including the CI borders in counting coverage).

BTW: Warning to all! The code is really a stress-test for the CPU of your computer (at least for mine). Submit the code and go for one and another beers. Next morning the result is astonishing enough .
Therefore I could answer not until today.

Regards,

Detlew
martin
★★

Austria,
2010-09-15 12:53
(3925 d 14:19 ago)

@ d_labes
Posting: # 5915
Views: 17,635

## simulants of the world unite

dear d_labes!

» I have one small doubt. I'm not quite sure but should we not include the CI borders in counting the coverage due to the definition of the CI?

thank you very much for pointing this out; IMHO you are right - this is crucial in the case of ties. here is a little bit faster version of the simulation code (ca. 30 min on my CPU)

date()
[1] "Wed Sep 15 14:16:44 2010"
set.seed(130910)
n <- 10
nsim <- 1E4
cover1 <- 0
cover2 <- 0
grp <- as.factor(c(rep('A', n), rep('B', n)))
for(i in 1:nsim){
x <- round(rnorm(2*n, mean=0, sd=1),1)
res1 <- confint(wilcox_test(x~grp, distribution='exact', conf.int=TRUE,
conf.level=0.95))$conf.int res2 <- wilcox.exact(x~grp, conf.int=TRUE, conf.level=0.95)$conf.int
if(res1[1] <= 0 & res1[2] >= 0){cover1 <- cover1 + 1}
if(res2[1] <= 0 & res2[2] >= 0){cover2 <- cover2 + 1}
}
print(cover1/nsim)
[1] 0.9609
print(cover2/nsim)
[1] 0.9682
date()
[1] "Wed Sep 15 14:46:26 2010"

and here I am, for all my lore, the wretched fool I was before (Goethe)

martin
Helmut
★★★

Vienna, Austria,
2010-09-15 14:17
(3925 d 12:55 ago)

@ martin
Posting: # 5916
Views: 17,656

## simulants of the world unite

Dear Martin!

» here is a little bit faster version of the simulation code (ca. 30 min on my CPU)

Faster machine than mine (double Xeon 2.8GHz, 4GB RAM, 32bit XP Pro, hyperthreading on)...

I would use conf.level=0.90. At the start of the code
ptm <- proc.time()
and after the output
runtime <- proc.time() - ptm
cat("Runtime",runtime[1]/60,"minutes\n")

I got
print(cover1/nsim)
[1] 0.9288
print(cover2/nsim)
[1] 0.9359
Runtime 32.90417 minutes

» and here I am, for all my lore, the wretched fool I was before (Goethe)

Yep. For n1=n2=10 I would expect 0.9108...

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
d_labes
★★★

Berlin, Germany,
2010-09-16 09:19
(3924 d 17:53 ago)

@ Helmut
Posting: # 5917
Views: 17,533

## simul ants with no ties

Dear Helmut!

» Yep. For n1=n2=10 I would expect 0.9108...

Yeah, but ... without ties.
10 000 sims each with round(,5) in the code and only wilcox.exact():
run 1  0.9096
run 2  0.9076
run 3  0.9125
run 4  0.9104
run 5  0.9138
-------------
mean   0.91078

Each run (10 000 sims) only approx. 100 s!

The mean seems very close to your expectation .
---[nitpicking]---------------------------------
> 1-2*pwilcox(max(1,qwilcox(0.1/2,10,10))-1,10,10)
[1] 0.9107904
---[/nitpicking]--------------------------------

What I'm very curious of is that scatter between the runs. Seems that one has to simulate at least 100 000 or more datasets to get a reliable coverage.
To much beer in case of coin:wilcox_exac() .

BTW: with round to five decimals I have observed anyhow up to 10 sets with ties among the 10 000 sims.

Regards,

Detlew
Helmut
★★★

Vienna, Austria,
2010-09-16 12:01
(3924 d 15:11 ago)

@ d_labes
Posting: # 5918
Views: 17,561

## simul ants with no ties

Dear D. Labes!

» Yeah, but ... without ties.

Right...

» Each run (10 000 sims) only approx. 100 s!

Everybody seems to have a faster machine than I do. Took mine ~115". Should ask my boss for a new one.

» What I'm very curious of is that scatter between the runs. Seems that one has to simulate at least 100 000 or more datasets to get a reliable coverage.

Interesting.

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
martin
★★

Austria,
2010-09-16 13:35
(3924 d 13:37 ago)

(edited by martin on 2010-09-16 14:03)
@ Helmut
Posting: # 5919
Views: 17,437

## two different approaches

dear d_labes and HS !

I included two additional bootstrap based approaches implemented in R package pairwiseCI where investigation of empirical coverage of bootstrap based CIs is a real stress test (7 hours)

set.seed(150910)
n <- 10
nsim <- 1E4
conf.level <- 0.9
cover <- rep(0,4)
grp <- as.factor(c(rep('A', n), rep('B', n)))
for(i in 1:nsim){
timeForecast(i, nsim)
x <- round(rnorm(2*n, mean=0, sd=1),1)
res1 <- confint(wilcox_test(x~grp, distribution='exact',
conf.int=TRUE, conf.level=conf.level))$conf.int res2 <- wilcox.exact(x~grp, conf.int=TRUE, conf.level=conf.level)$conf.int
res3 <- Median.diff(x=x[1:n], y=x[(n+1):(2*n)],
conf.level=conf.level)$conf.int res4 <- HD.diff(x=x[1:n], y=x[(n+1):(2*n)], conf.level=conf.level)$conf.int
if(res1[1] <= 0 & res1[2] >= 0){cover[1] <- cover[1] + 1}
if(res2[1] <= 0 & res2[2] >= 0){cover[2] <- cover[2] + 1}
if(res3[1] <= 0 & res3[2] >= 0){cover[3] <- cover[3] + 1}
if(res4[1] <= 0 & res4[2] >= 0){cover[4] <- cover[4] + 1}
}
-> estimated simulation time <h:mm:ss> (done / all):   0:00:00 / -:--:--
-:--:-- / 7:02:49
print(cover/nsim)
[1] 0.9283 0.9372 0.9319 0.9053

the empirical coverage of 0.9053 for the difference of Harrell-Davis estimates for location looks nice.

best regards

martin

PS.: you can interpret the empirical coverage as a proportion based on a sample size of nsim enabling to calculate a CI enabling to assess simulation accuracy

binom.test(x=0.9053*nsim, n=nsim, p=0.9)

Exact binomial test

data:  0.9053 * nsim and nsim
number of successes = 9053, number of trials = 10000, p-value = 0.07729
alternative hypothesis: true probability of success is not equal to 0.9
95 percent confidence interval:
0.8993926 0.9109701
sample estimates:
probability of success
0.9053
martin
★★

Austria,
2010-09-16 16:04
(3924 d 11:08 ago)

@ d_labes
Posting: # 5920
Views: 17,481

## evaluation of tmax: use of relative effects?

dear d_labes and HS!

I spend some time on thinking how to compare tmax and have some doubts that using the difference between average values (e.g. medians, Hodges-Lehmann or Harrell-Davis estimate) is appropriate as the effect size in tmax between test and reference depends on the sampling times used.

For this reason I would suggest using relative effects (e.g. implemented in R package nparcomp).

Definition (taken from this presentation http://www.biopharmnet.com/doc/2010_05_20_webinar.pdf ):
"The relative effect is a a measure of how often a random subject receiving treatment X will outperform a random subject receiving treatment Y. It can be interpreted as a probability that a randomly selected patient in the control reveals a smaller response value than a randomly selected patient in the treatment group"

naive application of d_labes dataset:

npar.t.test(pdiff~sequence, conflevel = 0.1, data=PKt)

Nonparametric Behrens-Fisher Problem
NOTE:
*-----------Analysis of relative effects-----------*
- Confidence interval for relative effect p(i,j)
with confidence level 0.9
- Method =  Delta-Method (Logit)
- p.perm = p-value of the Neubert-Brunner permutation test
- p-Values for  H_0: p(i,j)=1/2

*----------------Interpretation--------------------*
p(a,b) > 1/2 : b tends to be larger than a
*-------------------Wilcox.Test--------------------*
- Asymptotic Wilcoxon Test
- In this setup you can only test H_0:F_i = F_j
*--------------------------------------------------*
$Data.Info Sample Size 1 RT 7 2 TR 7$Analysis.of.relative.effects
Comparison rel.effect confidence.interval   t.value p.value p.perm
1   p(RT,TR)      0.633   [ 0.354 ; 0.844 ] 0.7804535   0.435   0.41

$Wilcoxon.Test Comparison rel.effect p.value 1 p(RT,TR) 0.633 0.4345742 what do you think about using relative effects for comparing tmax values between test and reference? best regards martin Helmut ★★★ Vienna, Austria, 2010-09-16 17:52 (3924 d 09:20 ago) @ martin Posting: # 5921 Views: 17,352 ## Not positive about that Dear Martin! » I spend some time on thinking how to compare tmax and have some doubts that using the difference between average values (e.g. medians, Hodges-Lehmann or Harrell-Davis estimate) is appropriate as the effect size in tmax between test and reference depends on the sampling times used. Well, you know better than I do that we are comparing not medians, but the median of ranked differences - that's the HL-estimator. Of course the comparison depends on the sampling schedule. But: keep in mind that tmax (and <nitpicking>Cmax</nitpicking> as well) are surrogates of the rate of absorption ka - which we cannot access by NCA. For an example see this post. » [...] I would suggest using relative effects... » Definition (taken from this presentation http://www.biopharmnet.com/doc/2010_05_20_webinar.pdf )...: Heavy stuff to swallow for an amateur. » "The relative effect is a a measure of how often a random subject receiving treatment X will outperform a random subject receiving treatment Y. It can be interpreted as a probability that a randomly selected patient in the control reveals a smaller response value than a randomly selected patient in the treatment group" I'm not happy with this definition. »$Analysis.of.relative.effects
»   Comparison rel.effect confidence.interval   t.value p.value p.perm
» 1   p(RT,TR)      0.633   [ 0.354 ; 0.844 ] 0.7804535   0.435   0.41
»
» $Wilcoxon.Test » Comparison rel.effect p.value » 1 p(RT,TR) 0.633 0.4345742 » » what do you think about using relative effects for comparing tmax values between test and reference? I'm not familiar with the coding. The problem IMHO comes from the results given as ratios. Commonly for tmax a linear model is postulated (besides PK theory think about the way we speak: 'The maximum concentration of treatment A was observed one hour earlier than the one of treatment B' – not 'The time point of the maximum concentration of treatment A was 63.3% of treatment B's.'). Now if we get a percentage, to which location parameter should we apply it? To the median of the reference – or what? Interesting to read the presentation, page 90: The choice of the difference-based relevance threshold delta is nasty, use ratio-to-control instead. Nasty or not, I don't think that regulators are happy with a multiplicative model. And so am I. Dif-tor heh smusma 🖖 Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes martin ★★ Austria, 2010-09-16 19:00 (3924 d 08:12 ago) (edited by martin on 2010-09-16 20:14) @ Helmut Posting: # 5922 Views: 17,300 ## Not positive about that Dear HS! thank you for your point of view, I just have a minor comment regarding the interpretation of relative effects. It's a probability that a randomly selected subject with R has a larger tmax value than a randomly selected subject with T (no effect would be indicated by a relative effect of 0.5). for the given dataset, the probability of a randomly selected subject with R having a larger tmax value than a randomly selected subject with T is 63.3% (90% CI: 35.4% to 84.4%). the other way round, the probability of a randomly selected subject with T having a smaller tmax value than a randomly selected subject with R is 63.3% (90% CI: 35.4% to 84.4%). best regards martin PS.: this is a naive application to the dataset - some adjustments may necessary to account for the cross over design Helmut ★★★ Vienna, Austria, 2010-09-17 11:10 (3923 d 16:01 ago) @ martin Posting: # 5923 Views: 17,259 ## Stupidity Dear Martin! » [...] It's a probability that a randomly selected subject with R has a larger tmax value than a randomly selected subject with T (no effect would be indicated by a relative effect of 0.5). Oh, I see! I mixed up probability with the size of the effect. But regulators are only interested in the latter! Dif-tor heh smusma 🖖 Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes Jack ☆ Lancaster, United Kingdom, 2010-09-20 12:03 (3920 d 15:09 ago) @ d_labes Posting: # 5924 Views: 17,241 ## tmax in case of ties: R vs. R vs. SAS Dear d_labes, here is my attempt to shed some light into the difference in the point estimates. The coin packages uses the midpoint of the confidence interval as the point estimate for the location difference (ie (0.5-1.5)/2=-0.5 while the exactRankTests packages uses (sort of) the median of the observed differences as the point estimate. Note that the exact procedure of exactRankTests is quite situational which is why the difference in point estimates only occurs sometimes. The real issue from my point of view is that it is hard (impossible?) to decide which is the "correct" estimate. Intuition tells me that the exactRankTest version is the one that is closer to the spirit of the Wilcoxon test while practicality certainly speaks for the midpoint of the CI to be used. As for the differences in confidence intervals I presume that it has to do with either the approximation which is uses if n>50 and/or the continuity correction. Helmut ★★★ Vienna, Austria, 2010-09-20 12:20 (3920 d 14:52 ago) @ Jack Posting: # 5925 Views: 17,220 ## R packages Hi Jack! » The coin packages uses the midpoint of the confidence interval as the point estimate for the location difference (ie (0.5-1.5)/2=-0.5 while the exactRankTests packages uses (sort of) the median of the observed differences as the point estimate. Note that the exact procedure of exactRankTests is quite situational which is why the difference in point estimates only occurs sometimes. Interesting! » The real issue from my point of view is that it is hard (impossible?) to decide which is the "correct" estimate. Intuition tells me that the exactRankTest version is the one that is closer to the spirit of the Wilcoxon test while practicality certainly speaks for the midpoint of the CI to be used. Hhm; in those ol' days I plotted the sampling distribution – and it can be quite asymetrical. I'm not so sure whether the midpoint of the CI makes sense if the location shift is concerned. » As for the differences in confidence intervals I presume that it has to do with either the approximation which is uses if n>50 and/or the continuity correction. I'm wondering whether there's a cut-off implemented in both packages switching from the exact method (based on permutation) to an approximation due to performance reasons? I know that it's possible to dump the code from the library, but my knowledge of R is too limited (even to use the right keywords for a search). Dif-tor heh smusma 🖖 Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes d_labes ★★★ Berlin, Germany, 2010-09-27 09:27 (3913 d 17:45 ago) @ Helmut Posting: # 5938 Views: 17,044 ## R packages code Dear Helmut! » I'm wondering whether there's a cut-off implemented in both packages switching from the exact method (based on permutation) to an approximation due to performance reasons? I know that it's possible to dump the code from the library, but my knowledge of R is too limited (even to use the right keywords for a search). It is described in the help pages for wilcox.exact. Use getS3method("wilcox.exact","default") to see the code. For coin it is more difficult because of the S4 implementation and deserves more puzzling detective work. Especially the theoretic framework must be known in detail to follow the implementation pieces. Out of scope for a run-of-the-mill applied statistician like me. Regards, Detlew Jack ☆ Lancaster, United Kingdom, 2010-09-27 14:14 (3913 d 12:57 ago) @ Helmut Posting: # 5940 Views: 17,029 ## R packages Unfortunately I have not found any documentation, but was getting interested what was going on and had a look in the source code. Needless the code is quite involved... d_labes ★★★ Berlin, Germany, 2010-09-27 07:53 (3913 d 19:19 ago) @ Jack Posting: # 5937 Views: 17,095 ## Is Exact exact? Dear Jack, was on holiday thus I can answer but now. Thanks for sharing your view. » The coin packages uses the midpoint of the confidence interval as the point estimate for the location difference (ie (0.5-1.5)/2=-0.5 while the exactRankTests packages uses (sort of) the median of the observed differences as the point estimate. Interesting as Helmut already said. Is it documented elsewhere? Interesting enough the SAS Proc npar1way gives me both point estimates, Hodges-Lehmann median of the pairwise differences and the CI midpoint. The latter attributed to Lehmann: Lehmann, E. L. (1963). "Nonparametric Confidence Intervals for a Shift Parameter" Annals of Mathematical Statistics, 34, 1507-1512 As noticed in the example above ... (sort of) ... is different from the HL estimate in case of ties. » As for the differences in confidence intervals I presume that it has to do with either the approximation which is uses if n>50 and/or the continuity correction. No. The example I have shown has n<50 and in that case the calculations will be done "exact" whatever that "exact" means. Here another peculiarity of the coin package: library(coin) set.seed(130910) n1 <- 10; n2 <-10 mu1 <- 0; mu2 <-0 grp <- as.factor(c(rep('A', n1), rep('B', n2))) ptm <- proc.time() nsim <- 500 equalCI <- 0 for(i in 1:nsim){ x <- c(round(rnorm(n1, mean=mu1, sd=1),1), round(rnorm(n2, mean=mu2, sd=1),1)) res1 <- confint(wilcox_test(x~grp, distribution=exact(algorithm = c("shift")), conf.int=TRUE, conf.level=0.90))$conf.int
res1a <- confint(wilcox_test(x~grp,
distribution=exact(algorithm = c("split-up")),
conf.int=TRUE, conf.level=0.90))\$conf.int
if(all.equal(res1,res1a)==TRUE) {equalCI <- equalCI+1}
if (1000*trunc(i/1000)==i) cat("\n")
if (50*trunc(i/50)==i & i<nsim) cat(i+1,"") # give a sign of alive
}
cat("\n\n")
cat("shift vs. split-up: equal CIs",equalCI/nsim)
cat("\nsims time\n")
print(proc.time()-ptm)

This gives:
shift vs. split-up: equal CIs 0.73
sims time
user  system elapsed
182.67    0.23  184.70

Uuuups!

Regards,

Detlew