Still not sure what you are aiming at… [General Statistics]

posted by Helmut – Vienna, Austria, 2020-06-29 16:46 – Posting: # 21603

Hi ElMaestro,

» 1. Let us look at the wikipedia page for the t test:
» "Most test statistics have the form t = Z/s, where Z and s are functions of the data."

OK, so far.

» 2. For the t-distribution, here Z=sample mean - mean and s=sd/sqrt(n)

Wait a minute. You are referring to the one-sample t-test, right? In the Assumptions section we find$$t=\frac{Z}{s}=\frac{\bar{X}-\mu}{\hat{\sigma}/\sqrt{n}}$$That’s a little bit strange because WP continues with

“\(\hat{\sigma}\) is the estimate of the standard deviation of the population”

I beg your pardon? Most of my textbooks give the same formula but with \(s\), the sample standard deviation, in the denominator. Of course, \(s/\sqrt{n}\) is the standard error and sometimes we find \(t=\frac{\bar{X}-\mu}{\textrm{SE}}\) instead. Nevertheless, further down we find$$t=\frac{\bar{x}-\mu_0}{s/\sqrt{n}}$$THX a lot, soothing!
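For readers who prefer code to formulas, here is a minimal Python sketch of the last formula, \(t=(\bar{x}-\mu_0)/(s/\sqrt{n})\). The function name `one_sample_t` and the five data values are made up for illustration; nothing here comes from the thread itself.

```python
import math
import statistics

def one_sample_t(x, mu0):
    """Return (t, se) of the one-sample t statistic of x against mu0."""
    n = len(x)
    x_bar = statistics.fmean(x)   # sample mean
    s = statistics.stdev(x)       # sample SD (n - 1 in the denominator)
    se = s / math.sqrt(n)         # standard error of the mean
    return (x_bar - mu0) / se, se

# Hypothetical data, only to show the call
x = [4.8, 5.1, 5.4, 4.9, 5.3]
t, se = one_sample_t(x, mu0=5.0)
print(f"t = {t:.4f}, SE = {se:.4f}, df = {len(x) - 1}")
```

Note that `statistics.stdev` is the sample standard deviation \(s\), not the population version (`statistics.pstdev`), which is exactly the distinction complained about above.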

» 3. Why are Z and s independent in this case?

Here we know the population mean. Hence, the numerator depends on the data only through the sample mean and the denominator only through the sample’s standard error. For normally distributed data these two are independent indeed.
I added another plot to the code of this post.

A modified plot of 5,000 samples to the right.
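The simulation behind such a plot can be sketched in a few lines of Python. This is not the code of the linked post; the sample size `n=10`, the seed, and the helper names `simulate_mean_sd` and `pearson_r` are assumptions chosen for illustration. It draws 5,000 samples from a standard normal distribution and checks how strongly the sample means and sample SDs are correlated.

```python
import math
import random
import statistics

def simulate_mean_sd(n=10, nsim=5000, mu=0.0, sigma=1.0, seed=123456):
    """Draw nsim normal samples of size n; return lists of means and SDs."""
    rng = random.Random(seed)
    means, sds = [], []
    for _ in range(nsim):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(x))
        sds.append(statistics.stdev(x))
    return means, sds

def pearson_r(u, v):
    """Plain Pearson correlation coefficient of two equally long lists."""
    u_bar, v_bar = statistics.fmean(u), statistics.fmean(v)
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

means, sds = simulate_mean_sd()
r_normal = pearson_r(means, sds)
print(f"correlation of sample mean and sample SD: {r_normal:+.4f}")
```

A correlation close to zero is what independence predicts, but it is only a necessary condition and not a proof; the scatter plot of `means` against `sds` tells the same story visually.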

» Or more generally, and for me much more importantly, if we have two functions (f and g, or Z and s), then which properties of such functions or their input would render them independent??
» Wikipedia links to a page about independence, key here is: […]

Yep.

» I am fully aware that when we simulate a normal dist. with some mean and some variance, then that defines their expected estimates in a sample. I.e. if a sample has a mean that is higher than the simulated mean, then that does not necessarily mean the sampled sd is higher (or lower, for that matter, that was where I was going with "perturbation"). It sounds right to think of the two as independent, in that case.

Correct. Anything is possible.

» Now, how about the general case, for example if we know nothing about the nature of the sample, but just look at any two functions of the sample? What property would we look for in those two functions to think they are independent?
» A general understanding of the idea of independence of any two quantities derived from a sample, that is what I am looking for; point #3 above defines my question.

Still not sure whether I understand you correctly at all. Think about the general formulation of a test statistic from above $$t=\frac{Z}{s},$$where \(Z\) and \(s\) are functions of the data.
I think that this formulation is unfortunate because it has nothing to do with either the standard normal \(Z\) or the sample standard deviation \(s\). For continuous variables I would prefer sumfink like$$\textrm{test statistic}=\frac{\textrm{measure of location}}{\textrm{measure of dispersion}}$$for clarity. If a test were constructed in such a way that the independence is not correctly represented, it would be a piece of shit.
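To see what “independence not correctly represented” can look like, here is a hedged Python sketch that compares normal data with skewed data. The exponential distribution, `n=10`, the seed, and the helper name `mean_sd_correlation` are my assumptions and not part of the thread. For the normal distribution the sample mean and sample SD are independent; for the exponential they are clearly correlated, so the ratio of location to dispersion no longer follows the t-distribution exactly.

```python
import math
import random
import statistics

def mean_sd_correlation(draw, n=10, nsim=5000, seed=2020):
    """Correlation of sample mean and sample SD over nsim samples of size n.

    draw is a function taking a random.Random and returning one observation.
    """
    rng = random.Random(seed)
    means, sds = [], []
    for _ in range(nsim):
        x = [draw(rng) for _ in range(n)]
        means.append(statistics.fmean(x))
        sds.append(statistics.stdev(x))
    m_bar, s_bar = statistics.fmean(means), statistics.fmean(sds)
    cov = sum((m - m_bar) * (s - s_bar) for m, s in zip(means, sds))
    var_m = sum((m - m_bar) ** 2 for m in means)
    var_s = sum((s - s_bar) ** 2 for s in sds)
    return cov / math.sqrt(var_m * var_s)

r_norm = mean_sd_correlation(lambda rng: rng.gauss(0.0, 1.0))
r_exp = mean_sd_correlation(lambda rng: rng.expovariate(1.0))
print(f"normal:      r = {r_norm:+.3f}")
print(f"exponential: r = {r_exp:+.3f}")
```

In the exponential case a sample with a large mean tends to have a large SD as well, which is the kind of dependence between numerator and denominator the formulation above does not show.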

Dif-tor heh smusma 🖖
Helmut Schütz

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz