victor ☆ Malaysia, 2019-11-16 22:57 (1787 d 15:22 ago) Posting: # 20813 Views: 9,281 |
|
Hi everyone! I'm new to pharmacokinetics, and I'm wondering what the largest α (alpha) & β (beta) allowed by the FDA is, for each of the three hypothesis tests illustrated below (with each α & β highlighted in the illustration).
Thanks in advance! P.S. If you spot any mistake in my illustration below, could you kindly inform me as well? ଘ(੭*ˊᵕˋ)੭* ̀ˋ P.S. The following post didn't submit correctly, even though the preview for it was working, so I decided to screenshot my question instead. Hope it is acceptable :) Edit: Category changed; see also this post #1. Link to 643 KiB 2,000 px photo deleted and changed to a downscaled variant. [Helmut] |
Helmut ★★★ Vienna, Austria, 2019-11-17 02:26 (1787 d 11:54 ago) @ victor Posting: # 20814 Views: 7,933 |
|
Hi Victor,

I tried to reconstruct your original post as well as I could. Since it was broken before the first “\(\mathcal{A}\)”, I guess you used a UTF-16 character whereas the forum is coded in UTF-8. Please don’t link to large images breaking the layout of the posting area and forcing us to scroll our viewport. THX.

I think that your approach has some flaws.
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
victor ☆ Malaysia, 2019-11-17 11:53 (1787 d 02:26 ago) (edited on 2019-11-17 14:16) @ Helmut Posting: # 20815 Views: 7,983 |
|
❝ Hi Victor, I tried to reconstruct your original post as well as I could. Since it was broken before the first “\(\mathcal{A}\)”, I guess you used a UTF-16 character whereas the forum is coded in UTF-8.

Hi Helmut, thanks for helping me out :) Edit: after a quick experiment (click here to see screenshot), it seems that the “\(\mathcal{A}\)” I used was a UTF-8 character after all? ⊙.☉

❝ Please don’t link to large images breaking the layout of the posting area and forcing us to scroll our viewport. THX.

Noted, and thanks for downscaling my original image :)

❝ I think that your approach has some flaws.
I see; I thought it would make sense for Tmax to also be transformed because of googling stuff like this […], coupled with the fact that the population distribution being analyzed looks a lot like a log-normal distribution; so I thought normalizing Tmax just made sense, since almost all distributions studied at undergraduate level (e.g. the F-distribution used by ANOVA) are ultimately transformations of one or more standard normals. With that said, is the above stuff that I googled wrong?
Thanks for enlightening me that I can now restate the current standard's hypothesis in a "more familiar (undergraduate-level)" form: $$H_0: \ln(\mu_T) - \ln(\mu_R) \notin \left [ \ln(\theta_1), \ln(\theta_2) \right ]\;\textrm{vs.}\;H_1: \ln(\theta_1) < \ln(\mu_T) - \ln(\mu_R) < \ln(\theta_2)$$ I now realize that I was actually using the old standard's hypothesis (whose null tested for bioequivalence, instead of the current standard's null for bioinequivalence), which had problems with its α & β (highlighted in red below, cropped from this paper), thus rendering my initial question pointless, because I was analyzing an old problem, i.e. one from before Hauck and Anderson's 1984 paper.
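In practice these hypotheses are assessed with two one-sided tests (TOST), or equivalently by checking whether the 90% confidence interval for \(\ln(\mu_T)-\ln(\mu_R)\) lies entirely within \([\ln(\theta_1), \ln(\theta_2)]\). A minimal sketch in R with made-up point estimate, standard error, and degrees of freedom:

pe <- log(0.95)                           # hypothetical point estimate of ln(muT) - ln(muR)
se <- 0.08                                # hypothetical standard error of that estimate
df <- 22                                  # hypothetical degrees of freedom from the ANOVA
limits <- log(c(0.80, 1.25))              # ln(theta1), ln(theta2)
ci <- pe + c(-1, 1) * qt(0.95, df) * se   # 90% confidence interval on the log scale
round(exp(c(ci, limits)), 4)              # back-transformed CI and acceptance limits
ci[1] >= limits[1] & ci[2] <= limits[2]   # BE concluded only if the CI lies within the limits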
With that said, regarding the old standard's hypothesis (whose null tested for bioequivalence), I was originally curious (it may be a meaningless problem now, but I'm still curious) about how they bounded the family-wise error rate (FWER) if α = 5% for each hypothesis test, since the probability of committing one or more type I errors when performing three hypothesis tests is 1 − (1 − α)^3 = 1 − (1 − 0.05)^3 = 14.26% (if those three hypothesis tests were actually independent). The same question more importantly applied to β, since in the old standard's hypothesis (whose null tested for bioequivalence), "the consumer's risk is defined as the probability (β) of accepting a formulation which is bioinequivalent, i.e. accepting H0 when H0 is false (Type II error)." (as quoted from page 212 of the same paper). Do you know how the FDA bounded the "global" α & β before 1984? I am curious about "what kind of secret math technique" was happening behind the scenes that allowed 12 random samples to be considered "good enough by the FDA".
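The 14.26% figure can be reproduced in one line (under the assumption of independence, which, as it turns out below, does not hold here):

alpha <- 0.05
1 - (1 - alpha)^3   # probability of at least one type I error in three independent tests, ~0.1426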
Thanks in advance :) ଘ(੭*ˊᵕˋ)੭* ̀ˋ |
Helmut ★★★ Vienna, Austria, 2019-11-17 15:35 (1786 d 22:45 ago) @ victor Posting: # 20816 Views: 8,222 |
|
Hi Victor,

❝ I did use UTF-8 though, because the following HTML works, and I could save (and reopen) it using my Windows 10's notepad.exe under UTF-8 encoding; but […]

Dunno. The mysteries of HTML/CSS/php/MySQL.

❝ ❝ Please don’t link to large images breaking the layout of the posting area and forcing us to scroll our viewport. THX.
❝ Noted, and thanks for downscaling my original image :)

Sorry if the downscaled image shows poor legibility. The one in full resolution is here.

❝ I thought it would make sense for Tmax to also be transformed because of googling stuff like this:

Aha! A presentation by Mr Concordet of 2004. Heteroscedasticity refers to unequal variances, i.e., more than one distribution is involved. A single distribution might be skewed; I guess that is what was meant. When we apply a parametric method (ANOVA, t-tests) one of the assumptions – as you correctly stated in your graph – is that the residual errors follow a normal distribution. It makes sense to assume that PK metrics like AUC and Cmax follow a log-normal distribution since concentrations are bound to \(\mathbb{R}^+\) (negative ones don’t exist and zero is excluded). However, even if this assumption were wrong, the only important thing is that the model’s residuals are approximately normal, i.e., \(\epsilon \sim \mathcal{N}(0,\sigma^2)\). It should be noted that the t-test is fairly robust against heteroscedasticity but very sensitive to unequal sequences (crossover) and group sizes (parallel design). That’s why the FDA recommends Satterthwaite’s approximation of the degrees of freedom in any case.

tmax¹ is yet another story. The distribution strongly depends on the study’s sampling schedule but is definitely discrete (i.e., on an ordinal scale). Granted, the underlying one is likely continuous² but we simply don’t have an infinite number of samples in NCA. A log-transformation of a discrete distribution is simply not allowed. Hence, what is stated in this slide is wrong. Many people opt for one of the variants of the Wilcoxon test to assess the difference. Not necessarily correct: the comparison of the shift in locations is only valid if the distributions are equal. If not, one has to opt for the Brunner-Munzel test³ (available in the R package nparcomp).

❝ … coupled with the fact that the population distribution that is being analyzed looks a lot like a log-normal distribution;

Wait a minute. It is possible to see concentration-time profiles as distributions resulting from a stochastic process. The usual statistical moments are: $$S_0=\int f(x)\,dx$$ $$S_1=\int x\cdot f(x)\,dx$$ $$S_2=\int x^2\cdot f(x)\,dx$$ where in PK \(x=t\) and \(f(x)=C\) leads to \(S_0=AUC\) and \(S_1=AUMC\). AFAIK, there is no particular term for \(S_2\) in PK, though it is – rarely – used to calculate the “Variance of Residence Times” as \(VRT=S_2/S_0-(S_1/S_0)^2\). The intersection of \(MRT\) (the vertical line) and \(VRT\) (the horizontal line) is the distribution’s “center of gravity”. Print it and stick a needle through it. Makes a nice whizz wheel.⁴

I know only one approach trying to directly compare profiles based on moment theory (the Kullback-Leibler information criterion).⁵ Never tried it with real data but I guess one might face problems with variability. Les Benet once stated (too lazy to search where and when) that for a reliable estimate of AUC one has to sample in such a way that the extrapolated fraction is 5–20%. For AUMC one would need 1–5%. No idea about VRT but in my experience its variability is extreme.

❝ … so I thought normalizing Tmax just made sense, […].
❝ With that said, is the above stuff that I googled wrong?

Yes, it is, because such a transformation of a discrete distribution (like tmax) is not allowed.

❝ […] I can now restate the current standard's hypothesis in a "more familiar (undergraduate-level)" form:
❝ $$H_0: \ln(\mu_T) - \ln(\mu_R) \notin \left [ \ln(\theta_1), \ln(\theta_2) \right ]\;\textrm{vs.}\;H_1: \ln(\theta_1) < \ln(\mu_T) - \ln(\mu_R) < \ln(\theta_2)$$
❝ I now realize that I was actually using the old standard's hypothesis (whose null tested for bioequivalence, instead of the current standard's null for bioinequivalence) […]

Correct, again.

❝ […] regarding the old standard's hypothesis (whose null tested for bioequivalence), I was originally curious (it may be a meaningless problem now, but I'm still curious) about how they bounded the family-wise error rate (FWER) if α = 5% for each hypothesis test, since the probability of committing one or more type I errors when performing three hypothesis tests is 1 − (1 − α)^3 = 1 − (1 − 0.05)^3 = 14.26% (if those three hypothesis tests were actually independent).

Here you err. We are not performing independent tests (which would call for Bonferroni’s adjustment or similar) but have to pass all tests. Hence, the IUT keeps the FWER at 5% or lower. Actually it is more conservative than the single tests themselves and will get more conservative the more the PK metrics differ. See the R code provided by Martin in this post for an example.

❝ The same question more importantly applied to β, since in the old standard's hypothesis (whose null tested for bioequivalence), "the consumer's risk is defined as the probability (β) of accepting a formulation which is bioinequivalent, i.e. accepting H0 when H0 is false (Type II error)." (as quoted from page 212 of the same paper).

Correct, but gone with the wind. See this post on how the BE requirements of the FDA evolved.

❝ Do you know how the FDA bounded the "global" α & β before 1984?

α was always 0.05. In the – very – early days of the CI-inclusion approach some people wrongly used a 95% CI, which actually implies α = 0.025. As the name tells, the FDA’s 80/20 rule required ≥80% power, or β ≤ 20%. Nowadays post hoc (a posteriori, retrospective) power is irrelevant. Power (1 – β) is important only in designing a study, where in most jurisdictions 80–90% (β = 10–20%) is recommended. This allows us to answer your original question:

❝ ❝ ❝ What is the largest α & β allowed by FDA?

α = 0.05 (since it is fixed in the method) and β is not assessed. It can be very high (i.e., low “power”) if the assumptions leading to the sample size were not realized in the study (e.g., larger deviation of T from R and/or higher variability than assumed, higher number of dropouts than anticipated). However, quoting ElMaestro: Being lucky is not a crime. On the other hand, a very high producer’s risk in designing a study is like gambling and against ICH E9: “The number of subjects in a clinical trial should always be large enough to provide a reliable answer to the questions addressed.” Hopefully such a protocol is rejected by the IEC.

❝ […] I am curious about "what kind of secret math technique" was happening behind the scenes that allowed 12 random samples to be considered "good enough by the FDA".

No idea. Rooted in Babylonian numbers? See also this post. For reference-scaling the FDA requires at least 24 subjects. A minimum sample size of 12 is recommended in all jurisdictions. IMHO, not based on statistics (large power for low variability and T/R close to 1) but out of the blue.
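That last point can be illustrated with power.TOST() from the R package PowerTOST (the numbers below are purely illustrative): with n = 12, power is high only for a low CV and a T/R ratio close to 1.

library(PowerTOST)
power.TOST(CV = 0.10, theta0 = 0.95, n = 12)   # low CV, T/R close to 1: high power
power.TOST(CV = 0.25, theta0 = 0.90, n = 12)   # moderate CV, larger deviation of T from R: low power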
— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
victor ☆ Malaysia, 2019-11-18 09:26 (1786 d 04:53 ago) @ Helmut Posting: # 20817 Views: 7,856 |
|
Wow! I'm blown away by your amazing answers (and reply speed)! Since it will take me time to properly digest and reply to all the details you just shared (especially on IUT, because IUT is new to me, and I think IUT might be the missing link I was searching for to deal with dependencies in multiple hypothesis testing), I decided to just respond to specific concepts, one at a time :) Also, thanks for confirming which concepts I grasped correctly, because as someone new to pharmacokinetics, it definitely helps me eliminate logic errors (。♥‿♥。) For now, I'd like to dig a little deeper to understand all the logic errors(?) that I made regarding tmax.

❝ Tmax is poor terminology. “T” denotes the absolute temperature in Kelvin whereas “t” stands for time. Hence, tmax should be used.
Good point :) I initially chose to use Tmax because I'm used to writing random variables in uppercase, to help myself catch syntax errors if I ever wrote something silly like E[t] when t is not a random variable, etc. With that said… (I rearranged your reply below to fit the flow better)

❝ ❝ … coupled with the fact that the population distribution that is being analyzed looks a lot like a log-normal distribution; so I thought normalizing Tmax just made sense, since almost all distributions studied at undergraduate level (e.g. the F-distribution used by ANOVA) are ultimately transformations of one or more standard normals.
❝ Yes, it is, because such a transformation of a discrete distribution (like tmax) is not allowed.
❝ The distribution strongly depends on the study’s sampling schedule but is definitely discrete. Granted, the underlying one is likely continuous² but we simply don’t have an infinite number of samples in NCA. A log-transformation of a discrete distribution is simply not allowed. Hence, what is stated in this slide is wrong. Many people opt for one of the variants of the Wilcoxon test to assess the difference. Not necessarily correct: the comparison of the shift in locations is only valid if the distributions are equal. If not, one has to opt for the Brunner-Munzel test³ (available in the R package nparcomp).

Does it mean that the Nyquist criterion is not satisfied in pharmacokinetics (in terms of the sampling interval)? Couldn't we just design our study’s sampling schedule to ensure that the Nyquist criterion is satisfied, so that we can perfectly reconstruct the original continuous-time function from the samples?

❝ Wait a minute. It is possible to see concentration-time profiles as distributions resulting from a stochastic process. […] I know only one approach trying to directly compare profiles based on moment theory (the Kullback-Leibler information criterion).⁵

Yes, I originally viewed the population distribution that is being analyzed as a stochastic process, but I didn't dive in too deep to check if the moment-generating function of a log-normal distribution "matches" the (concentration-time profile?) model used in pharmacokinetics (because I'm still learning about the models used in pharmacokinetics), so it was more of a "visually looks a lot like a log-normal distribution" on my part :p With that said, this specific fact you shared intrigues me…

❝ Les Benet once stated (too lazy to search where and when) that for a reliable estimate of AUC one has to sample in such a way that the extrapolated fraction is 5–20%. For AUMC one would need 1–5%. No idea about VRT but in my experience its variability is extreme.

I'm familiar with Taylor series approximations, Fourier series approximations, etc. but never really thought about how the error terms of a moment-generating function are controlled if I use, for example, a log-normal distribution to describe the concentration-time profile. It might be obvious (but I don't recall learning it in undergraduate studies, did I?), but I kinda prefer spending my time learning about IUT instead of thinking about this :p, so I was wondering if you have any good reference materials for learning about how to correctly use the moment-generating function to model a distribution in real life (e.g. a concentration-time profile)? Fun fact: as I was briefly reading about Kullback–Leibler divergence, I thought it looked familiar, then I realized I first encountered it when reading about IIT.

Thanks once again for all your amazing replies! ଘ(੭*ˊᵕˋ)੭* ̀ˋ |
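A small base-R sketch makes \(S_0\) and \(S_1\) from the previous post (and thus MRT and VRT) concrete; the concentration–time data below are hypothetical, the linear trapezoidal rule is used, and everything is truncated at the last sampling time (i.e., without the extrapolated fraction Benet's remark is about):

t <- c(0, 0.5, 1, 1.5, 2, 3, 4, 6, 8, 12, 16, 24)                      # sampling times in h (hypothetical)
C <- c(0, 1.8, 3.1, 3.6, 3.4, 2.9, 2.4, 1.6, 1.1, 0.5, 0.25, 0.06)     # concentrations in ng/mL (hypothetical)
trap <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)  # linear trapezoidal rule
S0  <- trap(t, C)                 # AUC(0-t)
S1  <- trap(t, t * C)             # AUMC(0-t)
S2  <- trap(t, t^2 * C)           # second moment
MRT <- S1 / S0                    # mean residence time (truncated)
VRT <- S2 / S0 - (S1 / S0)^2      # variance of residence times
round(c(AUC = S0, AUMC = S1, MRT = MRT, VRT = VRT), 3)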
Helmut ★★★ Vienna, Austria, 2019-11-18 16:09 (1785 d 22:10 ago) @ victor Posting: # 20821 Views: 7,844 |
|
Hi Victor,

❝ ❝ Edit: after a quick experiment (click here to see screenshot), it seems that the “\(\mathcal{A}\)” I used was a UTF-8 character after all? ⊙.☉

Correct, and the mystery is resolved. bin 11110000 10011101 10010011 10010000 = hex F0 9D 93 90. When I paste the character it is shown in the preview but not in the post. The field of the database table is of type utf8_general_ci, supporting only characters with a length of up to 3 bytes, whereas yours has 4. That’s it. In the development forum (in German, sorry) we realized that changing the type to utf8mb4_general_ci alone (whether of the field, the table, or the entire DB) cannot resolve the issue. It also requires rewriting all parts of the scripts handling the php/MySQL connection. Not easy and not my top priority.

❝ I'm blown away by your amazing answers (and reply speed)!

My pleasure.

❝ Since it will take me time to properly digest and reply to all the details you just shared (especially on IUT, because IUT is new to me, and I think IUT might be the missing link I was searching for to deal with dependencies in multiple hypothesis testing), …

It’s not that complicated. Let’s explore the plot: We have three tests. The areas give their type I errors. Since we perform all at the same level, the areas are identical. The Euclidean distance between the centers reflects the correlation of the PK metrics (here they are identical as well). The FWER is given by the area of the intersection, which in any case will be ≤ the nominal α. In reality the correlation of AUC0–∞ (green) with AUC0–t (blue) is higher than the correlation of both with Cmax (red). If we tested only the AUCs, the FWER would be given again by the intersection, which is clearly lower than the individual type I errors. If we add Cmax, the FWER decreases further. Unfortunately the correlations are unknown. See this post for an example with two metrics and a reference on how to deal with comparisons of multiple PK metrics. In PowerTOST two simultaneous tests are implemented. Say we have a CV of 0.25 for both, a sample size of 28, and the highest possible correlation of 1.
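A sketch of that example, assuming the current power.2TOST() interface (vector CV, scalar rho for the assumed correlation between the two metrics; a value close to 1 is used here to stand in for "perfect" correlation):

library(PowerTOST)
power.TOST(CV = 0.25, n = 28)                        # a single metric (theta0 defaults to 0.95)
power.2TOST(CV = c(0.25, 0.25), n = 28, rho = 0.99)  # two metrics, almost perfectly correlated
power.2TOST(CV = c(0.25, 0.25), n = 28, rho = 0.50)  # lower correlation: joint power drops

With near-perfect correlation the chance of passing both tests is close to that of a single test; the lower the correlation, the lower the joint power, which is the same mechanism that makes the IUT conservative under the null.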
❝ ❝ Tmax is poor terminology. “T” denotes the absolute temperature in Kelvin whereas “t” stands for time. Hence, tmax should be used.
❝ I initially chose to use Tmax because I'm used to writing random variables in uppercase, …

Yep, that’s fine. Unfortunately “Tmax” is used in some regulatory documents… When I review papers, I have a standard rant pointing that difference out, followed by “regulations ≠ science”.

❝ ❝ The distribution strongly depends on the study’s sampling schedule but is definitely discrete. […] A log-transformation of a discrete distribution is simply not allowed.
❝ Does it mean that the Nyquist criterion is not satisfied in pharmacokinetics (in terms of the sampling interval)?

Dunno. Sorry.

❝ Couldn't we just design our study’s sampling schedule to ensure that the Nyquist criterion is satisfied, so that we can perfectly reconstruct the original continuous-time function from the samples?

That’s wishful thinking. People tried a lot with D-optimal designs in PK. What I do is try to set up a model and explore different sampling schemes based on the variance inflation factor. Regrettably, it rarely works. Modeling absorption is an art rather than a science – especially with lag times (delayed onset of absorption due to gastric emptying, gastric-resistant coatings, …). What if the model suggests drawing fifty blood samples (to “catch” Cmax/tmax in the majority of subjects) and you are limited to twenty? It’s always a compromise.

❝ […] I didn't dive in too deep to check if the moment-generating function of a log-normal distribution "matches" the (concentration-time profile?) model used in pharmacokinetics (because I'm still learning about the models used in pharmacokinetics), so it was more of a "visually looks a lot like a log-normal distribution" on my part :p

Don’t fall into the trap of visual similarity. The fact that a concentration profile after an oral dose looks similar to a log-normal distribution is a mere coincidence. In a two-compartment model (right example: distribution is three times faster than elimination) the center of gravity is outside the profile; my “whizz wheel proof” would not work any more. Or, even more extreme, an intravenous dose… Of note, the mean of residence times \(MRT=\frac{AUMC}{AUC}\) is nice because we can compare different models (say, a one- with a two-compartment model). Independent of the model, after MRT ~⅔ of the drug is eliminated. For years I have tried to educate clinicians to abandon half lives (which are less informative) but old beliefs die hard (see there, slides 24–28). If you want to dive into the Kullback–Leibler divergence, note that any two distributions can be compared. The fact that we log-transform AUC and Cmax in BE has three reasons:
Not sure what the current state of affairs is, but in the past the Malaysian authority preferred the log10-transformation over loge. Not a particularly good idea, since PK is based on exponential functions. Furthermore, it makes our life miserable. The coefficient of variation based on the residual error of log-transformed data is given as \(CV=\sqrt{\textrm{e}^{MSE}-1}\). That’s the only one you find in textbooks and guidelines. If one used a log10-transformation, the appropriate formula is \(CV=\sqrt{10^{\textrm{log}_{e}(10)\cdot MSE}-1}\). I have seen wrong sample size estimations where the former was used instead of the latter.

❝ ❝ Les Benet once stated (too lazy to search where and when) that for a reliable estimate of AUC one has to sample in such a way that the extrapolated fraction is 5–20%. For AUMC one would need 1–5%.
❝ I'm familiar with Taylor series approximations, Fourier series approximations, etc. but never really thought about how the error terms of a moment-generating function are controlled if I use, for example, a log-normal distribution to describe the concentration-time profile…

Forget about that.

❝ […] I kinda prefer spending my time learning about IUT instead of thinking about this

Good luck. Not that difficult.

❝ … so I was wondering if you have any good reference materials for learning about how to correctly use the moment-generating function to model a distribution in real life? (e.g. a concentration-time profile)

(1) No idea and (2) the approach of seeing a PK profile as a distribution is pretty exotic.

— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
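The two CV formulas above can be cross-checked numerically: the residual MSE on the log10 scale equals the loge-scale MSE divided by \(\ln(10)^2\) (the MSE value below is made up):

MSE.e  <- 0.0625                          # residual MSE from an ANOVA of loge-transformed data
MSE.10 <- MSE.e / log(10)^2               # the same residual variance on the log10 scale
CV.e   <- sqrt(exp(MSE.e) - 1)            # correct formula for loge-transformed data
CV.10  <- sqrt(10^(log(10) * MSE.10) - 1) # correct formula for log10-transformed data
wrong  <- sqrt(exp(MSE.10) - 1)           # loge formula misapplied to a log10-scale MSE
round(100 * c(CV.e = CV.e, CV.10 = CV.10, wrong = wrong), 2)   # CV.e and CV.10 agree; "wrong" is far too small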
victor ☆ Malaysia, 2019-11-18 21:16 (1785 d 17:03 ago) @ Helmut Posting: # 20824 Views: 7,719 |
|
Wow! Yet another awesome answer (and a big round of applause for solving the UTF-8 mystery)! Honestly, I feel like a caveman who has been gifted a powerful handphone (by you), equipped with GPS, Google Maps, etc. to help me navigate the forest of pharmacokinetics, while I'm still looking at bushes (not even trees yet!) and using that handphone as a mirror. Thanks for all the great pointers, keywords, and properties to look for :) Especially the idea behind IUT, because it's my first time hearing that correlation has such a geometric interpretation! (。♥‿♥。)

❝ We have three tests. The areas give their type I errors. Since we perform all at the same level, the areas are identical. The Euclidean distance between the centers reflects the correlation of the PK metrics (here they are identical as well). The FWER is given by the area of the intersection, which in any case will be ≤ the nominal α.
❝ In reality the correlation of AUC0–∞ (green) with AUC0–t (blue) is higher than the correlation of both with Cmax (red). If we tested only the AUCs, the FWER would be given again by the intersection, which is clearly lower than the individual type I errors. If we add Cmax, the FWER decreases further.

Now I'm even MORE looking forward to studying IUT in detail, because it reminds me of when I first learned that I could use SVD to geometrically visualize the error ellipsoid of a covariance matrix. I found it beautiful :) Thanks Helmut, for all your help (*ˊᗜˋ*)/ᵗᑋᵃᐢᵏ ᵞᵒᵘ*

P.S. I'm spending time reading up on IUT now, hence the short reply; but I thought I'd end with a nice little picture for memory :p |
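That geometric picture can be reproduced in a few lines of R: the eigendecomposition (equivalently the SVD, since a covariance matrix is symmetric) gives the directions and half-axis lengths of the error ellipse (the covariance matrix below is made up):

Sigma <- matrix(c(1.0, 0.8,
                  0.8, 2.0), nrow = 2)    # hypothetical 2x2 covariance matrix
e <- eigen(Sigma)                         # eigenvectors = ellipse axes, eigenvalues = variances along them
r <- sqrt(qchisq(0.95, df = 2))           # scaling factor for a 95% error ellipse
e$vectors                                 # directions of the ellipse axes
r * sqrt(e$values)                        # half-axis lengths of the 95% ellipse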
Helmut ★★★ Vienna, Austria, 2019-11-19 13:01 (1785 d 01:18 ago) @ victor Posting: # 20827 Views: 7,583 |
|
Hi Victor,

❝ Honestly, I feel like a caveman […]

C’mon! Your skills in maths are impressive. If you want to dive deeper into the matter:
The proof of the result is almost trivial, at least if one is willing to adopt some piece of the basic formalism customary in expositions of the abstract theory of statistical hypothesis testing methods. […] The condition we have to verify, reads […] as follows: $$E_{(\eta_1,\ldots,\eta_q)}(\phi)\leq\alpha\;\textrm{for all}\;(\eta_1,\ldots,\eta_q)\in H\tag{7.3}$$ […]

❝ I thought I'd end with a nice little picture for memory :p

Nice picture! Do you know Anscombe’s quartet?

— Dif-tor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked. 🚮 Science Quotes |
victor ☆ Malaysia, 2019-11-22 02:28 (1782 d 11:51 ago) @ Helmut Posting: # 20857 Views: 7,533 |
|
❝ ❝ Honestly, I feel like a caveman […]
❝ C’mon! Your skills in maths are impressive.

Thanks Helmut, I appreciate it a lot. (*^‿^*) I'm also grateful that you shared that excerpt (more on this later, because Chapter 7 wasn't available on books.google.com). Thankfully, I found enough free time to analyze the following IUT theorem using my notes from studying Theorem 8.3.4 on page 122 of these lecture notes. (I don't have access to any of your recommended textbooks yet though.)

P.S. I intentionally used < instead of ≤ because I feel that in practice, αγ will always be less than one. Besides that, my Gaussian level sets didn't scale correctly against my hand-drawn background, so their shapes are compromised; they should actually have the same covariance matrix.

I'm now thinking about all the cases where H0's covariance matrix is different from the true covariance matrix (such as what β would look like) to see how IUT really deals with dependencies. But a preliminary glance seems to suggest that the above theorem is violated by the following cases, so I definitely need to think more deeply about what the theorem actually states (i.e. what it means for IUT to be a level-α test of H0 versus H1) to truly understand how the following cases affect the "global" α and β:

With that said, now that I am starting to get a feel for IUT, I feel that I am getting closer to truly understanding the following facts you shared: ❝ […]
As of now though, I haven't managed to see the "new" geometric interpretation of correlation (yet), so the following facts are still not within my grasp:

❝ The FWER gets more conservative the more the PK metrics differ.
❝ The Euclidean distance between the centers reflects the correlation of the PK metrics (here they are identical as well). The FWER is given by the area of the intersection, which in any case will be ≤ the nominal α.
❝ In reality the correlation of AUC0–∞ (green) with AUC0–t (blue) is higher than the correlation of both with Cmax (red). If we tested only the AUCs, the FWER would be given again by the intersection, which is clearly lower than the individual type I errors. If we add Cmax, the FWER decreases further.
❝ The proof of the result is almost trivial, at least if one is willing to adopt some piece of the basic formalism customary in expositions of the abstract theory of statistical hypothesis testing methods. […] The condition we have to verify, reads […] as follows:
❝ $$E_{(\eta_1,\ldots,\eta_q)}(\phi)\leq\alpha\;\textrm{for all}\;(\eta_1,\ldots,\eta_q)\in H\tag{7.3}$$ where \(E_{(\eta_1,\ldots,\eta_q)}(\cdot)\) denotes the expected value computed under the parameter constellation \((\eta_1,\ldots,\eta_q)\). […]
❝ In order to apply the result to multisample equivalence testing problems, let \(\theta_i\) be the parameter of interest (e.g., the expected value) for the ith distribution under comparison, and require of a pair \((i,j)\) of distributions equivalent to each other that the statement $$K_{(i,j)}:\,\rho(\theta_i,\theta_j)<\epsilon,\tag{7.4}$$ holds true, with \(\rho(\cdot,\cdot)\) denoting a suitable measure of distance between parameters. Suppose furthermore that for each \((i,j)\) a test \(\phi_{(i,j)}\) of \(H_{(i,j)}:\,\rho(\theta_i,\theta_j)\geq \epsilon\) versus \(K_{(i,j)}:\,\rho(\theta_i,\theta_j)< \epsilon\) is available whose rejection probability is \(\leq \alpha\) at any point \((\theta_1,\ldots,\theta_k)\) in the full parameter space such that \(\rho(\theta_i,\theta_j)\geq \epsilon\). Then, by the intersection-union principle, deciding in favour of "global equivalence" if and only if equivalence can be established for all \(\binom{k}{2}\) possible pairs yields a valid level-\(\alpha\) test for $$H:\,\underset{i<j}{\max}\{\rho(\theta_i,\theta_j)\}\geq \epsilon\;\textrm{vs.}\;K:\,\underset{i<j}{\max}\{\rho(\theta_i,\theta_j)\}<\epsilon\tag{7.5}$$

I'm thinking that the excerpt you shared contains the crucial info for me to see the "new" geometric interpretation of correlation, because of the following statement:

❝ […] with \(\rho(\cdot,\cdot)\) denoting a suitable measure of distance between parameters.

But I'm very confused by the excerpt's notation, because I couldn't find the corresponding notation in the aforementioned lecture notes (nor by googling); in particular: 【・ヘ・?】
Thanks in advance for the clarification.

❝ Do you know Anscombe’s quartet?

Nope! Thanks for sharing :) This is the first time I've heard of Anscombe’s quartet, and I found it pretty interesting as a possible example to introduce other statistics (e.g. skewness and kurtosis). For some reason though, my mind thought of the Raven Paradox when reading about Anscombe’s quartet. Maybe because they both raise the question of what actually constitutes evidence for a hypothesis? |
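The quartet ships with base R as the anscombe data set; the four x–y pairs have (nearly) identical means, variances, correlations and regression lines, yet look completely different once plotted:

data(anscombe)                                # built-in data set: columns x1-x4 and y1-y4
sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(mean = mean(y), var = var(y), cor = cor(x, y), slope = coef(lm(y ~ x))[2])
})                                            # near-identical summaries for all four sets
# plot(anscombe$x3, anscombe$y3), etc., shows how different the sets really are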
victor ☆ Malaysia, 2019-11-23 10:05 (1781 d 04:15 ago) @ victor Posting: # 20863 Views: 7,374 |
|
❝ P.S. I intentionally used < instead of ≤ because I feel that in practice, αγ will always be less than one.

Note to self: refer to the first image below to see why we need to use ≤.

❝ […] But a preliminary glance seems to suggest that the above theorem is violated by the following cases, […]

Update: More careful observation using a (correctly scaled) uniform sampling distribution for θ, instead of a bivariate normal sampling distribution for θ, reveals the aforementioned violation: