d_labes ★★★ Berlin, Germany, 2010-09-10 14:05 (5147 d 23:08 ago) Posting: # 5904 Views: 25,232 |
|
Dear all! Ok, this thread could also go into the category "R for BE/BA". But this here is Helmut's favorite . Despite the fact that non-parametrics will no longer be accepted by the EMA the world is bigger than Europe and many guidelines still require a non-parametric analysis of tmax (some of them obviously copied from the old EMEA guidance). It is well known1), that the analysis in case of the classical 2x2 cross-over is done by analysing the period differences (P1-P2) via a Wilcoxon rank sum test with the sequence as grouping factor. Since in case of tmax there are usually some/many equal values (ties in the non-parametric methods speak) the used software should be able to handle such. In exploring this within R using the tmax values from the single dose data of Bear I came up with: # Bear single dose data This tolds me: Package 'exactRankTests' is no longer under development. BTW: Because of lexical ordering RT vs. TR is tested and this estimates R-T. Thus we have to change the sign and exchange upper and lower bound to get them for T-R in which we are interested. Since I'm an obedient man (especially if my wife speaks to me ) I followed the polite suggestion got after loading package 'exactRankTests' and tried: library(coin) and this told me: Exact Wilcoxon Mann-Whitney Rank Sum Test Upps! Different point estimates . In other examples (no room to show here) with tied data I have also observed different CIs! Any body out there who knows whats going on here? I have searched the R support lists and found that observed by others. But the answers where at least not sufficient if not meaningless. I have found a stat. course website which states "There is an R function wilcox.exact ... It does not do confidence intervals or point estimates correctly in the presence of ties." but without an explanation. The new option HL (=Hodges Lehmann estimator) in my "Power to knoff" SAS® 9.2 Proc NPAR1way (code upon request) gives me between other stuff: but this estimates T-R because TR is tested versus RT here. 1)Hauschke et.al. "A distribution free procedure for the statistical analysis of bioequivalence studies" Int. J. Clin. Pharmacol. Vol. 28 (2) 1990, 72-78/Vol. 30, Suppl. 1, S37-43 Sorry for that lengthy post. But I intended to give the Rusers among us the possibility to verify my results with cut and paste. Sorry2: I have totally forgotten that the estimate and the confidence interval calculated as given are for 2*(T-R). Thus the results must be divided by 2! But what I wanted to show remains. — Regards, Detlew |
Helmut ★★★ Vienna, Austria, 2010-09-10 20:28 (5147 d 16:45 ago) @ d_labes Posting: # 5905 Views: 21,741 |
|
Dear D. Labes! ❝ Ok, this thread could also go into the category "R for BE/BA". But this here is Helmut's favorite . Right. But if ties are considered, you are stirring up a hornets' nest! ❝ Since in case of tmax there are usually some/many equal values (ties in the non-parametric methods speak) the used software should be able to handle such. Well, cough… Summary of your results (T-R) - changed signs if necessary and divided by 2: program PE 90.26% CI My quick and dirty solution: PKt$pdiff <- (PKt$P1 - PKt$P2)/2 ❝ Any body out there who knows whats going on here? Hhm; don't use library(exactRankTests) ?Hauschke et al. (1990) don't speak about ties; they refer to Hollander and Wolfe.1 In my edition the word 'tie' is not even mentioned in the subject index… With my mid-1990 implementation in Rocky Mountain BASIC (not corrected for ties) I get: PE 90.26% CI Now for StatXact 6.1. using the coding according to Senn (2002) I get: PE 90.26% CI What the heck? Phoenix 6.1 (asymptotic), according to Koch2 (no ties!): PE 90.26% CI EquivTest/PK (asymptotic): PE 90.26% CI Not a single word about ties in the manuals of Phoenix and EquivTest/PK. StatXact states: […] zero differences are ignored. We assume initially that there are no ties in the observed values of the Di's. We will show subsequently how to adjust for the possibility of ties. […] By suitably adapting the reasoning in Lehmann (1975, pages 181 to 182) so as to accommodate ties one can show that the probability that the interval [λ*, λ*] excludes λ is at most α. This ensures that the Hodges-Lehmann confidence intervals preserve the desired coverage probability. In a more recent textbook3 the common procedure of using mid-ranks for ties is given. According to this reference StatXact adjusts for ties. But I'm not so sure anymore. Open questions:
❝ Sorry for that lengthy post. So am I. You gave me a nice excuse not to spend the afternoon struggling with my income tax return.
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
d_labes ★★★ Berlin, Germany, 2010-09-13 15:21 (5144 d 21:52 ago) @ Helmut Posting: # 5906 Views: 21,408 |
|
Dear Helmut! Thanks for your very elaborate answer! ❝ Right. But if ties are considered, you are stirring up a hornets' nest! I see . ❝ My quick and dirty solution: ❝ ❝ ❝ Interesting: "relevel". Was not aware of that handy tool. But astonishing the result: A change in order gives a different lower bound! ❝ Hhm; don't use Seems a good recommendation. But it is claimed frequently that Rusers should use it in case of ties. Moreover coin has a much bigger footprint. ❝ ... they refer to Hollander and Wolfe. In my edition the word 'tie' is not even mentioned in the subject index... Very strange. Everyone writing about rank methods is citing this. ❝ With my mid-1990 implementation in Rocky Mountain BASIC (not ❝ corrected for ties) I get: ❝ ❝ Not 'correcting' for ties to obtain the CI is exactly what Charles Geyer recommends. With respect to the Wilcoxon sign rank test he states: "First, neither ties nor zeros should make any difference in calculating point estimators or confidence intervals.
And this may be transferable to the rank-sum test . If one follows his code in the interval section of the Wilcoxon rank sum test one gets your Rocky results. ❝ Now for StatXact 6.1. using the coding according to Senn (2002) ... What exactly did you here? Senn's Basic estimator? Then a test of sequence group differences would give an estimate of the period effect. Or am I wrong? See the expected period differences below. ❝ ... Open questions: ❝ • Signs... (Sigh!) I've checked the coding in all programs numerous times, ❝ I came across with: Sometimes the smaller group is used as the first, sometimes the coding order, sometimes the character order, sometimes the first occurred level. My uses the rule: smaller group and if both have the same n use the first occurred level as group 1. If this makes any sense is left to you. ❝ I guess the reason might be that you are comparing period differences rather than treatment differences. The reason for using the period differences rather then the treatment diffs comes from the paper of Hauschke. It is based on the expected values sequence P1-P2 Thus a test of the differences by sequence group gets rid of the period effects. This is not the case with the treatment diffs. ❝ ... I tried ❝ ❝ I also observed this. Although mentioned on the help page it seems that the ties.method is not available in the wilcoxon_test function and other location tests of package coin. ties.method is only accepted by the function normal_test() which performs the "Exact Normal Quantile (van der Waerden) Test". ❝ If I exclude subjects with a zero difference ... Called by Charles Geyer "The Zero fudge" . ❝ ... You gave me a nice excuse not to spend the afternoon struggling with my income tax return. If it is that horrible as here in Germanien I did you the favor very willingly . But the pity is that the problem with the Finanzamt does not vanish by sitting it out. — Regards, Detlew |
martin ★★ Austria, 2010-09-14 10:33 (5144 d 02:40 ago) @ d_labes Posting: # 5907 Views: 21,298 |
|
dear HS and d_labes! you may find the following simulation results interesting rm(list=ls(all=TRUE)) best regards martin |
d_labes ★★★ Berlin, Germany, 2010-09-15 12:14 (5143 d 00:59 ago) @ martin Posting: # 5909 Views: 21,154 |
|
Dear Martin! ❝ you may find the following simulation results interesting ❝ ... ❝ ❝ ❝ ❝ Wow! Really interesting. And unbelievable . I was told that the Wilcoxon rank sum test is due to the discrete nature of the permutation distribution function always to some degree conservative (achievable confidence with n1, n2=10 is 0.956743). And here it comes out that the implementation in the package coin is distinct anti-conservative! And the implementation in wilcox.exact does not show signs of conservativeness. I have one small doubt. I'm not quite sure but should we not include the CI borders in counting the coverage due to the definition of the CI? If we do this one gets (only 5000 sims, see BTW below): [1] 0.9608 # coverage of wilcox_test (coin) Moreover there seems to be some interaction between both packages. If I simulate without coin I get coverage of wilcox.exact = 0.9688 (including the CI borders in counting coverage). BTW: Warning to all! The code is really a stress-test for the CPU of your computer (at least for mine). Submit the code and go for one and another beers. Next morning the result is astonishing enough . Therefore I could answer not until today. — Regards, Detlew |
martin ★★ Austria, 2010-09-15 16:53 (5142 d 20:20 ago) @ d_labes Posting: # 5915 Views: 21,158 |
|
dear d_labes! ❝ I have one small doubt. I'm not quite sure but should we not include the CI borders in counting the coverage due to the definition of the CI? thank you very much for pointing this out; IMHO you are right - this is crucial in the case of ties. here is a little bit faster version of the simulation code (ca. 30 min on my CPU) date() and here I am, for all my lore, the wretched fool I was before (Goethe) martin |
Helmut ★★★ Vienna, Austria, 2010-09-15 18:17 (5142 d 18:56 ago) @ martin Posting: # 5916 Views: 21,197 |
|
Dear Martin! ❝ here is a little bit faster version of the simulation code (ca. 30 min on my CPU) Faster machine than mine (double Xeon 2.8GHz, 4GB RAM, 32bit XP Pro, hyperthreading on)... I would use conf.level=0.90 . At the start of the codeptm <- proc.time() and after the output
runtime <- proc.time() - ptm I got print(cover1/nsim) ❝ and here I am, for all my lore, the wretched fool I was before (Goethe) Yep. For n1=n2=10 I would expect 0.9108 ...— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
d_labes ★★★ Berlin, Germany, 2010-09-16 13:19 (5141 d 23:54 ago) @ Helmut Posting: # 5917 Views: 21,062 |
|
Dear Helmut! ❝ Yep. For n1=n2=10 I would expect Yeah, but ... without ties. 10 000 sims each with round(,5) in the code and only wilcox.exact(): run 1 0.9096 Each run (10 000 sims) only approx. 100 s! The mean seems very close to your expectation . ---[nitpicking]--------------------------------- What I'm very curious of is that scatter between the runs. Seems that one has to simulate at least 100 000 or more datasets to get a reliable coverage. To much beer in case of coin:wilcox_exac() . BTW: with round to five decimals I have observed anyhow up to 10 sets with ties among the 10 000 sims. — Regards, Detlew |
Helmut ★★★ Vienna, Austria, 2010-09-16 16:01 (5141 d 21:12 ago) @ d_labes Posting: # 5918 Views: 21,131 |
|
Dear D. Labes! ❝ Yeah, but ... without ties. Right... ❝ Each run (10 000 sims) only approx. 100 s! Everybody seems to have a faster machine than I do. Took mine ~115". Should ask my boss for a new one. ❝ What I'm very curious of is that scatter between the runs. Seems that one has to simulate at least 100 000 or more datasets to get a reliable coverage. Interesting. — Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
martin ★★ Austria, 2010-09-16 17:35 (5141 d 19:38 ago) @ Helmut Posting: # 5919 Views: 20,970 |
|
dear d_labes and HS ! I included two additional bootstrap based approaches implemented in R package pairwiseCI where investigation of empirical coverage of bootstrap based CIs is a real stress test (7 hours)set.seed(150910) the empirical coverage of 0.9053 for the difference of Harrell-Davis estimates for location looks nice. best regards martin PS.: you can interpret the empirical coverage as a proportion based on a sample size of nsim enabling to calculate a CI enabling to assess simulation accuracybinom.test(x=0.9053*nsim, n=nsim, p=0.9) |
martin ★★ Austria, 2010-09-16 20:04 (5141 d 17:09 ago) @ d_labes Posting: # 5920 Views: 21,033 |
|
dear d_labes and HS! I spend some time on thinking how to compare tmax and have some doubts that using the difference between average values (e.g. medians, Hodges-Lehmann or Harrell-Davis estimate) is appropriate as the effect size in tmax between test and reference depends on the sampling times used. For this reason I would suggest using relative effects (e.g. implemented in R package nparcomp ).Definition (taken from this presentation http://www.biopharmnet.com/doc/2010_05_20_webinar.pdf ): "The relative effect is a a measure of how often a random subject receiving treatment X will outperform a random subject receiving treatment Y. It can be interpreted as a probability that a randomly selected patient in the control reveals a smaller response value than a randomly selected patient in the treatment group" naive application of d_labes dataset:
npar.t.test(pdiff~sequence, conflevel = 0.1, data=PKt) what do you think about using relative effects for comparing tmax values between test and reference? best regards martin |
Helmut ★★★ Vienna, Austria, 2010-09-16 21:52 (5141 d 15:21 ago) @ martin Posting: # 5921 Views: 20,900 |
|
Dear Martin! ❝ I spend some time on thinking how to compare tmax and have some doubts that using the difference between average values (e.g. medians, Hodges-Lehmann or Harrell-Davis estimate) is appropriate as the effect size in tmax between test and reference depends on the sampling times used. Well, you know better than I do that we are comparing not medians, but the median of ranked differences - that's the HL-estimator. Of course the comparison depends on the sampling schedule. But: keep in mind that tmax (and <nitpicking>Cmax</nitpicking> as well) are surrogates of the rate of absorption ka - which we cannot access by NCA. For an example see this post. ❝ [...] I would suggest using relative effects... ❝ Definition (taken from this presentation http://www.biopharmnet.com/doc/2010_05_20_webinar.pdf )...: Heavy stuff to swallow for an amateur. ❝ "The relative effect is a a measure of how often a random subject receiving treatment X will outperform a random subject receiving treatment Y. It can be interpreted as a probability that a randomly selected patient in the control reveals a smaller response value than a randomly selected patient in the treatment group" I'm not happy with this definition. ❝ ❝ ❝ ❝ ❝ ❝ ❝ ❝ ❝ what do you think about using relative effects for comparing tmax values between test and reference? I'm not familiar with the coding. The problem IMHO comes from the results given as ratios. Commonly for tmax a linear model is postulated (besides PK theory think about the way we speak: 'The maximum concentration of treatment A was observed one hour earlier than the one of treatment B' – not 'The time point of the maximum concentration of treatment A was 63.3% of treatment B's.'). Now if we get a percentage, to which location parameter should we apply it? To the median of the reference – or what? Interesting to read the presentation, page 90: The choice of the difference-based relevance threshold delta is nasty, use ratio-to-control instead. Nasty or not, I don't think that regulators are happy with a multiplicative model. And so am I.— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
martin ★★ Austria, 2010-09-16 23:00 (5141 d 14:13 ago) @ Helmut Posting: # 5922 Views: 20,828 |
|
Dear HS! thank you for your point of view, I just have a minor comment regarding the interpretation of relative effects. It's a probability that a randomly selected subject with R has a larger tmax value than a randomly selected subject with T (no effect would be indicated by a relative effect of 0.5). for the given dataset, the probability of a randomly selected subject with R having a larger tmax value than a randomly selected subject with T is 63.3% (90% CI: 35.4% to 84.4%). the other way round, the probability of a randomly selected subject with T having a smaller tmax value than a randomly selected subject with R is 63.3% (90% CI: 35.4% to 84.4%). best regards martin PS.: this is a naive application to the dataset - some adjustments may necessary to account for the cross over design |
Helmut ★★★ Vienna, Austria, 2010-09-17 15:10 (5140 d 22:03 ago) @ martin Posting: # 5923 Views: 20,773 |
|
Dear Martin! ❝ [...] It's a probability that a randomly selected subject with R has a larger tmax value than a randomly selected subject with T (no effect would be indicated by a relative effect of 0.5). Oh, I see! I mixed up probability with the size of the effect. But regulators are only interested in the latter! — Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
Helmut ★★★ Vienna, Austria, 2010-09-20 16:20 (5137 d 20:53 ago) @ Jack Posting: # 5925 Views: 20,754 |
|
Hi Jack! ❝ The coin packages uses the midpoint of the confidence interval as the point estimate for the location difference (ie (0.5-1.5)/2=-0.5 while the exactRankTests packages uses (sort of) the median of the observed differences as the point estimate. Note that the exact procedure of exactRankTests is quite situational which is why the difference in point estimates only occurs sometimes. Interesting! ❝ The real issue from my point of view is that it is hard (impossible?) to decide which is the "correct" estimate. Intuition tells me that the exactRankTest version is the one that is closer to the spirit of the Wilcoxon test while practicality certainly speaks for the midpoint of the CI to be used. Hhm; in those ol' days I plotted the sampling distribution – and it can be quite asymetrical. I'm not so sure whether the midpoint of the CI makes sense if the location shift is concerned. ❝ As for the differences in confidence intervals I presume that it has to do with either the approximation which is uses if n>50 and/or the continuity correction. I'm wondering whether there's a cut-off implemented in both packages switching from the exact method (based on permutation) to an approximation due to performance reasons? I know that it's possible to dump the code from the library, but my knowledge of R is too limited (even to use the right keywords for a search). — Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
d_labes ★★★ Berlin, Germany, 2010-09-27 13:27 (5130 d 23:46 ago) @ Helmut Posting: # 5938 Views: 20,564 |
|
Dear Helmut! ❝ I'm wondering whether there's a cut-off implemented in both packages switching from the exact method (based on permutation) to an approximation due to performance reasons? I know that it's possible to dump the code from the library, but my knowledge of R is too limited (even to use the right keywords for a search). It is described in the help pages for wilcox.exact. Use getS3method("wilcox.exact","default") to see the code.For coin it is more difficult because of the S4 implementation and deserves more puzzling detective work. Especially the theoretic framework must be known in detail to follow the implementation pieces. Out of scope for a run-of-the-mill applied statistician like me. — Regards, Detlew |
d_labes ★★★ Berlin, Germany, 2010-09-27 11:53 (5131 d 01:20 ago) @ Jack Posting: # 5937 Views: 20,710 |
|
Dear Jack, was on holiday thus I can answer but now. Thanks for sharing your view. ❝ The coin packages uses the midpoint of the confidence interval as the point estimate for the location difference (ie (0.5-1.5)/2=-0.5 while the exactRankTests packages uses (sort of) the median of the observed differences as the point estimate. Interesting as Helmut already said. Is it documented elsewhere? Interesting enough the SAS Proc npar1way gives me both point estimates, Hodges-Lehmann median of the pairwise differences and the CI midpoint. The latter attributed to Lehmann:Lehmann, E. L. (1963). "Nonparametric Confidence Intervals for a Shift Parameter" Annals of Mathematical Statistics, 34, 1507-1512 As noticed in the example above ... (sort of) ... is different from the HL estimate in case of ties. ❝ As for the differences in confidence intervals I presume that it has to do with either the approximation which is uses if n>50 and/or the continuity correction. No. The example I have shown has n<50 and in that case the calculations will be done "exact" whatever that "exact" means. Here another peculiarity of the coin package: library(coin) This gives: shift vs. split-up: equal CIs 0.73 Uuuups! — Regards, Detlew |