Significant ≠ relevant [Design Issues]
Hi Felipe,
❝ […] the biostat said that this effect could appear when you cannot give the treatments to whole sample size at the same time/day. (statement 1)
If by “appear” he/she meant “be statistically significant”: False. Dive into your database of studies (performed in one group), arbitrarily code the first half of subjects with group=1 and the second with group=2. Run a model including a group term. If you set the significance limit to 0.05, I bet that you will see a significant “group effect” in ~1/20 of studies – although we know that the data originate from one group. That’s called a “false positive” or, in this particular case, a statistical artifact.
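If you have no database of studies at hand, the thought experiment above can be mimicked by simulation. What follows is only a minimal sketch under simplifying assumptions which are mine: one normally distributed value per subject, a known standard deviation, and a two-sided z-test between the two halves instead of a full model with a group term. The function names and the numbers of studies and subjects are made up for illustration.

```python
import random
from statistics import NormalDist, fmean

def fake_group_p_value(n_subjects, rng, sigma=1.0):
    """p-value of a 'group effect' in a study that really had only one group."""
    # One value per subject, all drawn from the same distribution,
    # so there is no true group effect by construction.
    values = [rng.gauss(0.0, sigma) for _ in range(n_subjects)]
    half = n_subjects // 2
    g1, g2 = values[:half], values[half:]      # arbitrary coding: group=1, group=2
    se = sigma * (1 / len(g1) + 1 / len(g2)) ** 0.5
    z = (fmean(g1) - fmean(g2)) / se
    # Two-sided z-test; sigma is assumed to be known (a simplification).
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(n_studies=4000, n_subjects=24, alpha=0.05, seed=1):
    """Share of simulated one-group studies with a 'significant' group effect."""
    rng = random.Random(seed)
    hits = sum(fake_group_p_value(n_subjects, rng) <= alpha
               for _ in range(n_studies))
    return hits / n_studies

rate = false_positive_rate()
print(f"share of 'significant' group effects: {rate:.3f}")
```

The share should fluctuate around alpha, whatever number of subjects you plug in.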
I don’t like to test for effects which are either irrelevant or have no consequences. If a study was performed in two groups and we assess the p-values, there are three possible results:
- p ≤0.05: Groups differ. We should not pool them. Hopefully the groups were not split 1:1, but one of them was sized to the maximum capacity of the clinical site. Example: Based on a CV of 27% the sample size was estimated as 32. The capacity of the site is 24. If you split groups 16:16, power in each group drops from 80.4% to 41.4%. What if you are lucky and show BE in one group, but not in the other? Do you think that regulators would believe the results of the “nice group” and ignore the other (failed) one? On the other hand, no regulator would ask you questions if you have unequal group sizes and base the decision on the larger group. If the split is 24:8, power in the larger group will still be 66.7%. Not sooo bad. Like rolling a die and betting on even/odd.
- p ≤0.05: Groups don’t differ, but the result is a false positive. Bad luck. All the nasty stuff from above is applicable.
- p >0.05: Groups don’t differ. Happy pooling.
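The power figures in the example can be roughly checked by simulation. This is a sketch only and rests on assumptions which are not stated in the post: a balanced 2×2 crossover, acceptance limits of 0.80 to 1.25, an assumed T/R-ratio of 0.95, and TOST at alpha 0.05 (i.e., a 90% CI). The t-quantiles are hard-coded from standard tables and `simulated_power` is a made-up name.

```python
import math
import random

# Upper 5% points of Student's t from standard tables (df = n - 2).
T_CRIT = {14: 1.7613, 22: 1.7171, 30: 1.6973}

def simulated_power(n, cv=0.27, theta0=0.95, n_sims=20000, seed=42):
    """Simulated power of TOST in a balanced 2x2 crossover with n subjects."""
    rng = random.Random(seed)
    df = n - 2
    t_crit = T_CRIT[df]                           # only n = 16, 24, 32 are covered
    sigma_w = math.sqrt(math.log(1 + cv ** 2))    # within-subject SD (log scale)
    se_true = sigma_w * math.sqrt(2 / n)          # SE of the T-R difference
    lower, upper = math.log(0.80), math.log(1.25)
    passed = 0
    for _ in range(n_sims):
        # Point estimate on the log scale.
        diff = rng.gauss(math.log(theta0), se_true)
        # Estimated SE: the residual variance is sigma^2 * chi2(df) / df.
        chi2 = rng.gammavariate(df / 2, 2.0)
        se_hat = se_true * math.sqrt(chi2 / df)
        half_width = t_crit * se_hat              # half-width of the 90% CI
        if lower <= diff - half_width and diff + half_width <= upper:
            passed += 1
    return passed / n_sims

for n in (32, 16, 24):
    print(f"n = {n:2d}: simulated power = {simulated_power(n):.3f}")
```

Under these assumptions the simulated values should land close to the percentages quoted in the first bullet; exact values require the noncentral t-distribution, which the sketch avoids.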
Hence, we try to keep groups as similar as possible by design. I have stated above conditions where the FDA accepts no group-term in the model. I don’t like that “experts” browse the internet for SAS-code and apply it without thinking about the consequences. I got the impression from Smitha’s post that they use a group term routinely – which will be significant at the level of the test by pure chance. Therefore, I asked why…
❝ However group effect is not a big deal such as sequence, treatment or period effect. (statement 2)
- Sequence (actually unequal carryover): False. Since it cannot be properly handled in a 2×2 crossover (Freeman showed that 25 years ago) it should be avoided by design. Therefore, the EMA’s GL specifically states that this effect should not be tested and carryover avoided by a sufficiently long washout.
- Treatment: False. Even for a very small difference you will see a significant result if the sample size is large, because power increases (see this post). Is it relevant (a “big deal”)? Not at all. In most regulations the minimum sample size is 12. Therefore, you see a significant effect regularly for CVs ≤10%. In Brazil (min. 24) you will see it every other day. I guess you have a standard sentence to discuss that in the report.
- Period: False. In a crossover the model will take care of it. Both T and R will be ~equally affected (unless the study is extremely imbalanced). Try it: Take the data of any study and multiply all values of the second period by 10. Does the CI of the PE change?
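The “multiply the second period by 10” experiment from the last bullet can be sketched as follows. The data are randomly generated and purely hypothetical, the study is balanced (12 subjects per sequence), and the point estimate and its standard error are computed from the within-subject period differences of the log-transformed values, which is my shortcut for the usual model of a 2×2 crossover. The CI depends only on the PE, its SE, and the degrees of freedom, so if the first two are unchanged, the CI is unchanged as well.

```python
import math
import random
from statistics import fmean

def pe_and_se(seq_tr, seq_rt):
    """PE (T/R) and SE of log(T/R) in a 2x2 crossover.

    seq_tr: (period 1, period 2) values of subjects receiving T then R
    seq_rt: (period 1, period 2) values of subjects receiving R then T
    """
    d_tr = [math.log(p1) - math.log(p2) for p1, p2 in seq_tr]  # T - R + period
    d_rt = [math.log(p1) - math.log(p2) for p1, p2 in seq_rt]  # R - T + period
    n1, n2 = len(d_tr), len(d_rt)
    m1, m2 = fmean(d_tr), fmean(d_rt)
    log_pe = (m1 - m2) / 2                 # the period term cancels here
    ss = sum((d - m1) ** 2 for d in d_tr) + sum((d - m2) ** 2 for d in d_rt)
    s2 = ss / (n1 + n2 - 2)                # pooled variance of the differences
    se = math.sqrt(s2 * (1 / n1 + 1 / n2)) / 2
    return math.exp(log_pe), se

# Hypothetical data: 12 subjects per sequence, lognormal responses.
rng = random.Random(2015)
seq_tr = [(rng.lognormvariate(4.0, 0.3), rng.lognormvariate(4.0, 0.3))
          for _ in range(12)]
seq_rt = [(rng.lognormvariate(4.0, 0.3), rng.lognormvariate(4.0, 0.3))
          for _ in range(12)]

before = pe_and_se(seq_tr, seq_rt)
# Introduce a huge period effect: all values of the second period times 10.
after = pe_and_se([(p1, 10 * p2) for p1, p2 in seq_tr],
                  [(p1, 10 * p2) for p1, p2 in seq_rt])

print("PE, SE before:", before)
print("PE, SE after: ", after)
```

Both lines should agree up to floating-point noise, although the second period's values are ten times higher.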
Edit: Just saw ElMaestro’s post above. What really would worry me is a p >0.05 of the subject effect. Could give a hint that subjects were a bunch of monozygotic quadruplets. The model requires that subjects are independent… Don’t overdo standardization.
—
Dif-tor heh smusma 🖖🏼 Long live Ukraine!
Helmut Schütz
![[image]](https://static.bebac.at/img/CC by.png)
The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Complete thread:
- Study conduct in groups Smitha 2015-02-02 04:21 [Design Issues]
- Study conduct in groups ElMaestro 2015-02-02 08:21
- Study conduct in groups Helmut 2015-02-02 13:08
- Study conduct in groups Smitha 2015-02-04 04:30
- Study conduct in groups Helmut 2015-02-04 12:57
- Study conduct in groups felipeberlinski 2015-02-04 22:36
- Study conduct in groups ElMaestro 2015-02-04 23:58
- Significant ≠ relevant Helmut 2015-02-05 00:49
- Significant ≠ relevant Astea 2016-03-24 20:10
- Significant ≠ relevant ElMaestro 2016-03-24 23:12
- Significant ≠ relevant zizou 2016-03-25 21:41
- Loss of power etc. Helmut 2016-03-26 14:46
- Loss of power etc. Astea 2016-03-27 21:18
- Loss of power etc. zizou 2016-03-27 23:44
- Combined power? Helmut 2016-03-28 14:29
- Loss of power etc. Astea 2016-03-28 23:57
- Loss of power etc. ElMaestro 2016-03-29 00:16
- Mystery Helmut 2016-03-29 17:28
- Back to the Future Astea 2016-03-29 21:57
- Back to the Future ElMaestro 2016-03-29 23:11
- Using lectures != Reading them mittyri 2016-03-30 00:17