Helmut

Vienna, Austria,
2006-10-22 17:38

Posting: # 324

## Pilot Study once again... [Power / Sample Size]

Dear all,

since there was a debate at the Regulatory Workshop on BE/Dissolution in Budapest about the appropriate size of a pilot study (some participants still advocated a sample size of 6 subjects), I have prepared a little example.
You may use the data to 'play around' in your own piece of software. Although raw data are given, all calculations were obtained after log-transformation (multiplicative model).

Let's start with a balanced 2×2 cross-over in 24 subjects (the main study) and extract 4 subsets of size 6 and 2 subsets of size 12 (the artificial pilot studies). The CV was 20%, which is quite common in BE. Please note that such a procedure is for illustrative purposes only.

In the following some results are presented; PE is the point estimate Test/Reference, 90% CI its confidence interval, CVintra the intra- (or within-) subject coefficient of variation, CL of CV is the (upper) one-sided 95% confidence limit of CVintra (for details see this post), and N is the calculated sample size for a main study with 80% power and an expected PE of 95% (-5% deviation from the reference). Also note that the expected PE should never be set to something close to unity (even if you have found a PE of 1 in your pilot study). If you have found a larger deviation, use at least this deviation in sample size calculations.
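The upper confidence limit of the CV can be sketched as follows. This is only a minimal illustration, not necessarily the method of the linked post: it assumes the usual chi-square limit for a variance, `n - 2` degrees of freedom for a 2×2 cross-over, and the conversion σ² = ln(CV² + 1). The function names are made up for this example, and the chi-square quantile is found by simple bisection to avoid external libraries.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-15:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_ppf(p, df):
    """Chi-square quantile by bisection on the CDF (slow but dependency-free)."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def upper_cl_cv(cv, df, alpha=0.05):
    """Upper one-sided (1 - alpha) confidence limit of an intra-subject CV.

    Assumption: the residual variance on the log scale follows a scaled
    chi-square distribution with df degrees of freedom.
    """
    s2 = math.log(cv ** 2 + 1.0)                 # CV -> log-scale variance
    s2_upper = s2 * df / chi2_ppf(alpha, df)     # upper limit of the variance
    return math.sqrt(math.expm1(s2_upper))       # variance -> CV

# Main study: CV 20% from 24 subjects, hence 22 degrees of freedom (assumed n - 2)
print(f"{upper_cl_cv(0.20, 24 - 2):.2%}")
# Subset of 6 subjects with CV 13.15%, hence 4 degrees of freedom
print(f"{upper_cl_cv(0.1315, 6 - 2):.2%}")
```

Under these assumptions the sketch agrees with the limits quoted in this post to rounding (about 26.9% for the main study and about 31.8% for the first subset of 6), which illustrates how quickly the limit widens when only 4 degrees of freedom are available.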

The main study:

| Data set | PE | 90% CI | CVintra | N (CVintra) | CL of CV | N (CL of CV) |
|---|---|---|---|---|---|---|
| Main study (n = 24) | 96.4% | 87.5%-106.5% | 20.00% | 18 | 26.91% | 32 |

Bioequivalence was demonstrated in the main study (N: 24), but we may also notice two points:
• 18 subjects may have been sufficient,
• but if we want to repeat the study (e.g., with another formulation), we should be cautious, because 24 < 32. The variability of 20% is not a natural constant.
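To get a feeling for how strongly the planned N depends on the assumed CV, here is a rough sketch based on the large-sample normal approximation for a 2×2 cross-over. It is explicitly not an exact method and will not reproduce every N quoted in this post; exact methods tend to give the same or slightly larger values. The acceptance range of 80-125%, the function name, and its parameters are assumptions made for this illustration.

```python
import math
from statistics import NormalDist

def sample_size_approx(cv, theta0=0.95, power=0.80, alpha=0.05, upper=1.25):
    """Approximate total N for a 2x2 cross-over (normal approximation).

    Assumptions: log-scale (multiplicative) model, symmetric acceptance
    range 1/upper ... upper, and theta0 != 1 (for theta0 = 1 the power
    term would need beta/2 instead of beta).
    """
    sigma2 = math.log(cv ** 2 + 1.0)                   # CV -> log-scale variance
    z = NormalDist()
    z_sum = z.inv_cdf(1.0 - alpha) + z.inv_cdf(power)
    margin = math.log(upper) - abs(math.log(theta0))   # distance to nearer limit
    n = 2.0 * z_sum ** 2 * sigma2 / margin ** 2
    return 2 * math.ceil(n / 2.0)                      # round up to balanced sequences

for cv in (0.20, 0.2691):
    print(f"CV {cv:.2%}: N = {sample_size_approx(cv)}")
```

With these settings the approximation gives 18 subjects for a CV of 20% and 30 for the upper limit of 26.91% (the post quotes 18 and 32). Since N grows roughly in proportion to the log-scale variance, a modest increase in the CV translates into a substantially larger study.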

The subsets:

Subsets of size 6:

| Set | Subjects | PE | 90% CI | CVintra | N (CVintra) | CL of CV | N (CL of CV) |
|---|---|---|---|---|---|---|---|
| 1 | #1-#6 | 91.1% | 77.7%-107.3% | 13.15% | 10 | 31.82% | 44 |
| 2 | #7-#12 | 101.7% | 77.8%-135.2% | 22.74% | 24 | 57.28% | 140 |
| 3 | #13-#18 | 96.1% | 78.2%-119.4% | 17.32% | 14 | 42.53% | 78 |
| 4 | #19-#24 | 94.6% | 66.8%-137.7% | 30.02% | 40 | 79.07% | 264 |

Subsets of size 12:

| Set | Subjects | PE | 90% CI | CVintra | N (CVintra) | CL of CV | N (CL of CV) |
|---|---|---|---|---|---|---|---|
| 1 | #1-#12 | 96.5% | 83.9%-111.6% | 19.47% | 18 | 31.47% | 44 |
| 2 | #13-#24 | 95.6% | 81.9%-113.2% | 22.14% | 22 | 35.93% | 56 |

Now what can we learn from the examples?

Just consider subset 3 of size 6. The PE was 96.1% and the CV 17.32%; if we calculate the sample size with -5%, we would perform the main study in 14 subjects, which would have a very high probability of failure. Ignoring the uncertainty in the PE (and, to a much greater extent, in the CV) is simply not a good idea.

Or have a look at subset 4 of size 6: even using the calculated CV of 30.02%, we would plan the main study in 40 subjects (which very likely would be overpowered). Even worse, if we want to be cautious, the upper CL of 79.07% (264 subjects!) would lead to the wrong conclusion that we have to deal with a highly variable drug, with subsequent unnecessarily complicated design issues (e.g., a replicate design with scaled average BE).

Both subsets of size 12 lead to more consistent results. If you have stated such a procedure in your protocol, BE may even be claimed in both subsets, and no further study is necessary. If you want to use the upper CL in sample size estimation, you also get more consistent values.
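The difference between pilots of 6 and 12 subjects can also be explored by simulation. The following is a hedged sketch, not a re-analysis of the data above: it assumes a true CV of 20%, normally distributed residuals on the log scale (so that the estimated variance is a scaled chi-square variate with `n - 2` degrees of freedom), and uses made-up helper names.

```python
import math
import random

def simulate_pilot_cvs(n, true_cv=0.20, n_sim=20000, seed=1):
    """Simulate CV estimates from 2x2 cross-over pilots with n subjects.

    Assumption: estimated variance = sigma2 * chi2(df) / df with df = n - 2.
    """
    rng = random.Random(seed)
    sigma2 = math.log(true_cv ** 2 + 1.0)
    df = n - 2
    cvs = []
    for _ in range(n_sim):
        # chi-square(df) is a gamma variate with shape df/2 and scale 2
        s2 = sigma2 * rng.gammavariate(df / 2.0, 2.0) / df
        cvs.append(math.sqrt(math.expm1(s2)))
    cvs.sort()
    return cvs

def percentile(sorted_vals, p):
    """Simple empirical percentile of an already sorted list."""
    idx = min(len(sorted_vals) - 1, int(p * len(sorted_vals)))
    return sorted_vals[idx]

for n in (6, 12, 24):
    cvs = simulate_pilot_cvs(n)
    lo, hi = percentile(cvs, 0.05), percentile(cvs, 0.95)
    print(f"n = {n:2d}: 90% of estimated CVs fall between {lo:.1%} and {hi:.1%}")
```

The range of plausible CV estimates shrinks markedly when going from 6 to 12 subjects, which is the same pattern seen in the subsets above.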

If you have some hints of high intra-subject variability (>30%), I would recommend a pilot study size of at least 16 subjects.

Bottom line:
Small sample sizes (let's say <12) are useful in checking the sampling schedule and the appropriateness of the analytical method, but are not reasonable for the purpose of sample size planning. Q.E.D.

Some regulators tend to have a biased perspective, because they see only examples working with n=6. Unfortunately luck is not a statistical category.
On the other hand, I have seen many failed studies following small pilots; at least the failed study was large enough to get a better estimate for a repeated study.
According to the implemented European GCP-directive I have seen IECs rejecting approval of small pilot studies 'because the size of the study is too limited to get a reliable estimate of variability'.
I know one of the generic 'global players' setting the internal rule to a minimum size of 16.

Dif-tor heh smusma 🖖🏼 Long live Ukraine!
Helmut Schütz
