Relaxation ★ Germany, 2015-02-20 17:31 — Posting: # 14474
Hello everybody and a nice evening.

I currently have to comment on the Release Notes of Phoenix WinNonlin 6.4 from August last year. In the "What's new" section I saw that Certara implemented tests for equal variances for parallel designs and also for 2×2 cross-over studies. This is accompanied by some explanation in the handbook on how to proceed in case a significant value is observed (basically, you implement a random/repeated specification for period and group by formulation).

As I am not a statistician and, frankly, never came across a discussion of this test, I tried to find out something about its importance in bioequivalence / relative bioavailability. After some unsuccessful searching on the net and in this forum, I found some information in Chow & Liu, 3rd edition (p. 196). Although I fail to understand most of the discussion, it seems to imply that this test is relevant not only in population/individual equivalence (found it only for these in Hauschke, Steinijans & Pigeot) but also for testing the intra-individual variance (not inter!) in a simple 2×2×2 cross-over study for average BE. However, the consequences of rejecting H0 seem to be missing.

As a natural consequence, I tried to figure out whether the adaptation of the evaluation recommended for Phoenix WinNonlin actually has any effect on the (BE) result at all, simply using recent data sets (as said, I don't understand the formula sufficiently and have to try and compare). I was not able to perform the testing for unequal variances, but some variabilities (for Reference or Test) at least "look" different. And there was no effect at all on PE or CIs. Thus, I wonder:

• has such a test any meaning for the "final" outcome in BE testing, the estimate/CI of the ratio between treatments?
• could the proposed workaround be communicated to EU authorities anyway (this is clearly not an "all effects fixed" situation)?
• is this for information only, comparable to the testing for period and sequence effects?

Any suggestions or thoughts that set me thinking would be greatly appreciated.

Best regards, Steven.
Helmut ★★★ Vienna, Austria, 2015-02-20 19:56 @ Relaxation — Posting: # 14478
Hi Relaxation,

❝ As I am not a statistician…

As am I. Partly I feel guilty. For many years I was unhappy with WinNonlin's setup of parallel designs and fired questions at Pharsight. FDA's guidance (2001) states: "For parallel designs […] equal variances should not be assumed." (Classical) WinNonlin and PHX/WNL by default apply the conventional t-test, which is sensitive to unequal variances and unequal group sizes; compared to the Welch-Satterthwaite approximation, the conventional t-test is always liberal. The setup was (and still is) misleading because – even if a user was aware of the issue – in the General Options > Degrees of Freedom setting, ⦿ Satterthwaite applies the t-test for equal variances. In July 2013 Linda Hughes posted a workaround on Pharsight's Extranet (applicable to all versions of WinNonlin). At the end of 2013 Pharsight told me that they would implement tests for (un)equal variances – which is not necessarily the best idea.[1] Any pretest…

❝ […] this test is not only relevant in pop/ind equivalence (found it only for these in Hauschke, Steinijans & Pigeot) but also for testing the intra-individual variance in a simple 2x2x2 cross-over study (not inter!) for average BE. Uhm, however, the consequences of rejection of H0 seem to be missing.

One of the assumptions of the conventional crossover BE model is equal variances of T and R. If they are unequal, the CI will be inflated. Bad luck. That's why regulators don't care (aka increase the sample size). If s²WT < s²WR, a "good" test product will be punished by a "bad" reference. In the end this issue led to reference-scaling. But…
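To see what "always liberal" means in practice, here is a quick sketch in Python with made-up numbers (group sizes, means, and SDs are arbitrary, not taken from any real study); `equal_var=False` in scipy corresponds to the Welch-Satterthwaite approach:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical log-transformed PK metric from a parallel-group study:
# the smaller group also has the larger spread -- the worst case for
# the conventional (pooled-variance) t-test.
test = rng.normal(loc=4.6, scale=0.50, size=12)
ref = rng.normal(loc=4.6, scale=0.25, size=24)

pooled = stats.ttest_ind(test, ref, equal_var=True)   # conventional t-test
welch = stats.ttest_ind(test, ref, equal_var=False)   # Welch-Satterthwaite

print(f"pooled  p = {pooled.pvalue:.4f}")
print(f"Welch   p = {welch.pvalue:.4f}")
```

With unequal group sizes and unequal variances the two tests disagree; with balanced groups and similar variances the difference is negligible.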
❝ I was not able to perform the testing for unequal variances, but some variabilities (for Reference or Test) at least "look" different.

What do you mean by "look different"? First of all we need a mixed-effects model (not the EMA stuff). In some datasets I got no convergence: Clayton & Leslie's (B; see there) worked and Sauter's (A) crashed. Clayton's is interesting: in the conventional analysis we get a CVW of 60.2% (the Pitman-Morgan test is not significant, p 0.2209). Should we design the next study for such a high CV? The modified model gives us CVWR 48.0% and CVWT 71.3% (have a look at Chow/Liu, Ch. 8). The high variability of the test is likely caused by the outlying subject 7; if we exclude this subject, CVW decreases to 43.6%. Hey, that's pretty close to what we found for the reference in the full dataset. I have no idea how sensitive the Pitman-Morgan test is – I got significance only in our datasets E and H. However, the modified model looks interesting.

Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful. George E.P. Box

❝ And there was no effect at all in PE or CIs.

Don't worry; there will not be any. The residual variance is identical to that of the conventional model.

❝ • has such a test any meaning for the "final" outcome in BE testing, the estimate/CI of the ratio between treatments […]?

No; see above.

❝ • could the proposed workaround be communicated to EU authorities anyway (this is clearly not an "all effects fixed" situation)?

Please no. Don't open up a can of worms. Personally I think that treating subjects as a fixed effect is crap.

❝ • is this for information only, comparable to the testing for period and sequence effects?

Period effects are irrelevant (unless the study is extremely [sic] imbalanced).
Luckily, testing for sequence (aka unequal carry-over) effects went to the regulatory trash can 21 (‼) years after Freeman's paper.[5]

❝ Any suggestions or thoughts that set me thinking would be greatly appreciated

This setup is useful if you want to get some insights into the properties of formulations (beyond their means). Although it is not a substitute for a replicate design, it might give you some useful information (especially in product-development studies). Note that Boddy et al.[6] suggested reference-scaling based on a 2×2 design…
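For readers who want to try the Pitman-Morgan test outside Phoenix, a minimal Python sketch of its mechanics follows. Note the hedge: in a 2×2 crossover the test is applied to period-adjusted within-subject values, not to raw (T, R) pairs; this toy version only demonstrates the sums-vs-differences trick the test is built on.

```python
import numpy as np
from scipy import stats

def pitman_morgan(x, y):
    """Pitman-Morgan test for equal variances of two *paired* samples.

    H0: var(x) == var(y). The trick: cov(x + y, x - y) = var(x) - var(y),
    so the correlation between sums and differences is zero exactly when
    the two variances are equal. The p-value is the usual correlation test.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r, p = stats.pearsonr(x + y, x - y)
    return r, p

# Toy example: y has four times the variance of x, so sums and
# differences are perfectly (negatively) correlated.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
r, p = pitman_morgan(x, 2 * x)
print(r, p)   # r is exactly -1 here
```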
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! ["Long live Ukraine!"] Helmut Schütz
The quality of responses received is directly proportional to the quality of the question asked.
Relaxation ★ Germany, 2015-02-24 14:06 @ Helmut — Posting: # 14499
Hello Helmut, hello forum members.

Now I feel guilty for having made you spend so much work on an answer, as I really thought that for the more experienced folk this would be a one-liner. It just makes me all the more thankful for your efforts and this forum.

Yes, I think I got the importance of accounting (not necessarily testing) for equal variances in a parallel design (which is also implemented in WinNonlin 6.4). I just fail to intuitively understand how the intra-individual variance for each product can be calculated without a replicated administration. However, that is likely due to me not understanding the formulas properly, and I should invest my own time here. Still, the discussion at the moment somehow calms me down.

❝ What do you mean by "look different"?

Well, as I have no access to WNL 6.4 at the moment I could only try the workaround in 6.3, but without getting the results of the PM test. The variances given for Test and Reference as "1_2/2_1" looked "different", e.g. 0.0113 and 0.0061 in one of my data sets.

❝ Please no. Don't open up a can of worms. Personally I think that treating subjects as a fixed effect is crap. […]
❝ Period effects are irrelevant […] sequence (aka unequal carry-over) […] garbage

And still we have to submit the appropriate tests of all effects in the model. Hm, makes me wonder whether authorities are actually looking into the Core outputs regularly… And when we use the simple Core output, testing by PM will then be included also, as "The results […] are given in the Average Bioequivalence output worksheet and at the end of the Core output" {User's Guide 6.4}.

From what I learned from other people, I can understand the condemnation of using a fixed effect for subjects. From some reports I saw, I personally can appreciate that in a 2×2 setting subjects who cannot contribute to the T/R comparison (missing data in one/two periods) will be omitted "automatically" instead of having the missing data imputed. That's a nice side effect, keeping the evaluation in line with the EU GL.

Now I will just read your post again and add the literature references that are missing to our library. Thanks again and best regards, Steven.
ElMaestro ★★★ Denmark, 2015-02-24 15:28 @ Relaxation — Posting: # 14500
Hi all,

I would gladly apply pretesting. It will only lead to inflation of the Type I error if the alphas for both the pretest and the BE test are brainlessly chosen to be 5%. Adjust them, and you're good to go in terms of the Type I error. You will find wording in the guideline to the effect of not assuming variance homogeneity; along these lines it can be argued that you are not assuming it when you test for it and let the outcome decide your next step. All is good. The choice of the actual alphas is of course a little tricky, but that is a mere practicality. LMSTFY.

— Pass or fail! ElMaestro
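ElMaestro's "pretest, then test" strategy is easy to probe by simulation. The sketch below is a simplification (a plain significance test under normality, not the TOST/CI actually used for the BE decision) and only illustrates how the pretest alpha couples into the overall Type I error; all parameter values are arbitrary:

```python
import numpy as np
from scipy import stats

def type1_with_pretest(n1, n2, s1, s2, pre_alpha=0.05, be_alpha=0.05,
                       n_sim=10_000, seed=7):
    """Empirical Type I error of a pretest-then-test strategy:
    an F-test for equal variances at pre_alpha decides whether the
    pooled or the Welch t-test (at be_alpha) is applied. Both groups
    are simulated under H0 (equal means), so every rejection is a
    false positive."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, s1, size=(n_sim, n1))
    y = rng.normal(0.0, s2, size=(n_sim, n2))
    # Two-sided F-test for variance homogeneity (the pretest).
    f = x.var(axis=1, ddof=1) / y.var(axis=1, ddof=1)
    p_pre = 2 * np.minimum(stats.f.cdf(f, n1 - 1, n2 - 1),
                           stats.f.sf(f, n1 - 1, n2 - 1))
    p_pool = stats.ttest_ind(x, y, axis=1, equal_var=True).pvalue
    p_welch = stats.ttest_ind(x, y, axis=1, equal_var=False).pvalue
    # Pretest not significant -> pool; significant -> Welch.
    p = np.where(p_pre > pre_alpha, p_pool, p_welch)
    return float(np.mean(p < be_alpha))

print(type1_with_pretest(12, 12, 1.0, 1.0))  # close to the nominal 0.05
```

Varying `n1/n2`, `s1/s2`, and the two alphas shows where the combined procedure drifts away from the nominal level, which is exactly the adjustment question discussed above.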
Helmut ★★★ Vienna, Austria, 2015-02-24 17:05 @ ElMaestro — Posting: # 14502
Hi ElMaestro,

❝ It will only lead to inflation of the Type I error if the alphas for both the pretest and the BE test are brainlessly chosen to be 5%.

Agree.

❝ Adjust them, and you're good to go in terms of the Type I error.

Agree again.

❝ The choice of the actual alphas is of course a little tricky, but that is a mere practicality.

The inflation can be nasty. See Ruxton's Table 1: for N1, N2 (11, 21) and s1, s2 (4, 1) the Type I error of the conventional t-test is 0.155 (!). OK, that's extreme.

❝ LMSTFY.

IMHO, too many variables. Save your efforts – unless you plan to publish an entire book full of tables covering every possible combination (GMR, power, CV, sample-size ratio, s-ratio).
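Ruxton's scenario is straightforward to reproduce by simulation; a sketch (the exact value depends on the seed and the number of simulated studies, but it should land in the neighbourhood of the 0.155 quoted above):

```python
import numpy as np
from scipy import stats

def pooled_t_type1(n1, n2, s1, s2, n_sim=20_000, seed=42):
    """Empirical Type I error of the pooled-variance t-test at a
    nominal alpha of 5% when the group SDs are s1 and s2. The true
    means are equal, so every rejection is a false positive."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, s1, size=(n_sim, n1))
    y = rng.normal(0.0, s2, size=(n_sim, n2))
    p = stats.ttest_ind(x, y, axis=1, equal_var=True).pvalue
    return float(np.mean(p < 0.05))

# Ruxton's scenario: the smaller group has the larger SD.
print(pooled_t_type1(11, 21, 4.0, 1.0))  # far above the nominal 0.05
print(pooled_t_type1(16, 16, 1.0, 1.0))  # balanced + homogeneous: ~0.05
```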
Helmut ★★★ Vienna, Austria, 2015-02-24 15:51 @ Relaxation — Posting: # 14501
Hi Steven,

❝ I just fail to intuitively understand, how the intraindividual variance for each product can be calculated without a replicated administration.

In a mixed-effects model (which we apply here) you can get this information; the method is similar to recovering information from incomplete data. Compare …
❝ ❝ What do you mean by "look different"?
❝ […] the variances given for Test and Reference as "1_2/2_1" looked "different", e.g. with 0.0113 and 0.0061 in one of my data sets.

Hhm, not sure which coding you used. It should be: Random: Subject(Sequence) and Repeated: Period, Subject, Treatment. Have a look at the Parameter Key table to find out the treatments' coding. With my coding, Var(Period*Treatment*Subject)_21 is s²wR and Var(Period*Treatment*Subject)_22 is s²wT.

❝ And still we have to submit the appropriate tests of all effects in the model. Hm, makes me wonder, if authorities are actually looking into the Core outputs regularly…

ElMaestro would say that chances are 0.0000001% or lower. At least in the EU, deficiency letters of the type "There is a significant sequence effect in the ANOVA. Please justify." have almost stopped.

❝ And when we use the simple Core output, testing by PM will then be included also, as "The results […] are given in the Average Bioequivalence output worksheet and at the end of the Core output" {User's Guide 6.4}.

Yep. I always use the core output myself (the M$ Word export is awful). In v6.3 and earlier I deleted irrelevant or obsolete stuff (Westlake's CI, Anderson-Hauck, "Power"); had an SOP for it. Now I have an R script for that.

❝ From what I learned from other people, I can understand the condemnation of using a fixed effect for subjects.

If one looks only at the numbers, the results are the same. Since I don't want to make a statement about subjects in this particular study only but to extrapolate to the population, I prefer a random effect.

❝ From some reports I saw I personally can appreciate, that in a 2x2 setting subjects who cannot contribute to the T/R comparison (missing data in one/two periods) will be omitted "automatically" instead of getting the missing data imputed. That's a nice side effect making the evaluation keeping in line with the EU-GL.

Yes.
In v6.4 you can set it in the Preferences: LinMix > Bioequivalence > ☑ Default for 2×2 crossover set to all fixed effects.
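As an aside for readers comparing such variance components: on log-transformed data a within-subject variance s²w converts to a CV via the usual lognormal relation CVw = 100·√(exp(s²w) − 1). A two-liner, fed with the components Steven quoted above (the conversion is standard; nothing here is specific to his datasets):

```python
import math

def cv_from_log_variance(s2):
    """CV (%) on the original scale from a variance on the log scale."""
    return 100.0 * math.sqrt(math.exp(s2) - 1.0)

# The two log-scale variance components quoted above:
for s2 in (0.0113, 0.0061):
    print(f"s2w = {s2:.4f} -> CVw = {cv_from_log_variance(s2):.1f}%")
```

So the two components correspond to CVs of roughly 10.7% and 7.8% – which puts "looked different" on a more interpretable scale.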
Relaxation ★ Germany, 2015-02-26 13:50 @ Helmut — Posting: # 14506
Hello everybody, and in particular Helmut and ElMaestro.

I really appreciate the recommendations on how to proceed in thinking (and learning) and will try the recommended comparison.

❝ Hhm, not sure which coding you used.

The same, with identical outcome. Sorry, that 1_2 was a typo and should have been 2_2.

Best regards, Steven.