Simulations 101 [Power / Sample Size]

posted by Helmut – Vienna, Austria, 2020-05-27 01:47 – Posting: # 21483

Dear all,

off-list I was asked about some discrepancies between sample sizes estimated by PowerTOST’s functions sampleN.scABEL() and sampleN.RSABE() and the tables of the Two Lászlós [1] (waving to Charlie).

Excursion: An introduction to Monte Carlo Simulations.

With a reasonably large number of simulations the empiric value will converge to the true value. How fast a sufficiently accurate result is obtained depends on the problem. Hence, it is a good idea to assess that before you set the number of simulations. See items 4 and 5 below for why we opted for 100,000 as the default in PowerTOST.

  1. Rolling two dice
    Here it’s easy to get the average sum of pips. If you remember probability and combinatorics, you will get the answer 7. If you are a gambler, you simply know it. If you have two dice handy, roll them. Of course, you can get anything between the possible extremes (2 and 12) in a single roll, and you will need many rolls to come close to the average of 7. Since this is a boring exercise (unless you find an idiot betting against 7), let R do the job:
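A minimal sketch of this experiment in base R. It is not the code behind the original plot; the seed, the number of rolls, and all variable names are assumptions chosen for illustration.

```r
# Monte Carlo estimate of the average sum of pips of two dice.
set.seed(123456)                  # arbitrary seed, for reproducibility only
rolls   <- 1e5                    # number of simulated rolls (assumption)
pips    <- sample(1:6, rolls, replace = TRUE) +
           sample(1:6, rolls, replace = TRUE)
running <- cumsum(pips) / seq_len(rolls)   # average after 1, 2, ... rolls
shown   <- c(1, 10, 100, 1000, 1e4, 1e5)   # checkpoints to print
print(data.frame(rolls = shown, average = running[shown]))
```

The running average wanders a lot in the first few rolls and settles close to 7 only once many rolls have accumulated.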


  2. Estimating π
    We generate a set of uniformly distributed coordinates (x, y) in the range [–1, +1]. According to Mr Pythagoras, all points with \(x^2 + y^2 \leq 1\) are at or within the unit circle and coded in blue. Since the circle’s area is \(A = \frac{\pi \, d^2}{4} = \pi\) with \(d = 2\) and the area of the enclosing square is 4, the expected share of coded points is \(\pi / 4\). Hence, we get an estimate of π by 4 times the number of coded points (the “area”) divided by the number of simulations. Here for 50,000:


    This was a lucky punch since the convergence is lousy. To get a stable estimate with just four significant digits, we need at least 100 million (‼) simulations. Not very efficient.
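A minimal sketch of the π experiment in base R, again with an arbitrary seed and variable names of my own choosing (assumptions, not the code behind the original plot). It also prints the binomial standard error of the estimate, which is what makes the slow convergence visible.

```r
# Monte Carlo estimate of pi from points in the square [-1, +1] x [-1, +1].
set.seed(123456)                        # arbitrary seed (assumption)
nsims  <- 5e4                           # number of simulated points
x      <- runif(nsims, min = -1, max = 1)
y      <- runif(nsims, min = -1, max = 1)
inside <- x^2 + y^2 <= 1                # TRUE if at or within the unit circle
p.hat  <- mean(inside)                  # estimated share, expected pi / 4
pi.hat <- 4 * p.hat                     # estimate of pi
se     <- 4 * sqrt(p.hat * (1 - p.hat) / nsims)  # standard error of pi.hat
cat(sprintf("estimate %.5f, error %+.5f, standard error %.5f\n",
            pi.hat, pi.hat - pi, se))
```

The standard error shrinks only with the square root of the number of simulations, so each additional stable digit costs roughly 100 times more simulations.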

  3. Type I Error of TOST
    We know that α (the probability of the Type I Error) never exceeds the nominal level of the test (generally 0.05). For example, in a 2×2×2 crossover with a CV of 20% and 20 subjects it is 0.049999895… We can also simulate studies with a GMR at one of the borders of the acceptance range and assess them for ABE. We expect ≤5% to pass. The number of passing studies divided by the number of simulations gives the empiric Type I Error. I used the function power.TOST.sim().


    The filled circles are runs with a fixed seed of 123456 and open circles ones with random seeds.
    We see that convergence in this problem is poor. Therefore, in e.g., Two-Stage Designs \(10^6\) simulations are common.
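A hedged sketch of how such a comparison might be set up with PowerTOST. The set of simulation counts and the variable names are assumptions; the binomial standard error in the last column is added only to show how much noise to expect at each count.

```r
library(PowerTOST)
# Type I Error of TOST: 2x2x2 crossover, CV 20%, 20 subjects,
# true GMR at the upper limit of the acceptance range (1.25).
CV <- 0.20; n <- 20; alpha <- 0.05
exact <- power.TOST(alpha = alpha, CV = CV, n = n, theta0 = 1.25,
                    design = "2x2")
nsims <- c(1e4, 1e5, 1e6)               # simulation counts (assumption)
sim   <- sapply(nsims, function(k)
           power.TOST.sim(alpha = alpha, CV = CV, n = n, theta0 = 1.25,
                          design = "2x2", nsims = k, setseed = TRUE))
se    <- sqrt(alpha * (1 - alpha) / nsims)  # binomial standard error
print(data.frame(nsims, simulated = sim, exact, se), digits = 6)
```

With setseed = FALSE each call uses a random seed, which corresponds to the open circles described above.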

  4. Sample size for ABEL (EMA)
    Table A2 [1] gives for a 4-period full replicate study (EMA method), GMR 0.85, CV 50%, and ≥80% power a sample size of 54, whilst sampleN.scABEL() gives only 52.


    Five runs. As before filled circles are with a fixed seed (default) and open circles with random seeds.
    With sampleN.scABEL() and only 10,000 simulations I got* five times 52 and once 54 (like The Lászlós). With more runs, some of them will give more extreme values (in 1,000 runs of 10,000 simulations I got a median of 52 and a range of 48–56). This range shows that the result is not stable. Hence, we need more simulations to end up with 52 as a stable result. Although you are free to opt for more than 100,000, it’s a waste of time.
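A sketch of how the comparison between 10,000 and 100,000 simulations might be reproduced. The helper n.abel() is hypothetical (my own wrapper, not part of PowerTOST), and the results with random seeds will differ from run to run.

```r
library(PowerTOST)
# Hypothetical helper: estimated sample size for ABEL (EMA),
# 4-period full replicate, CV 50%, GMR 0.85, target power 80%.
n.abel <- function(nsims, setseed) {
  sampleN.scABEL(CV = 0.50, theta0 = 0.85, targetpower = 0.80,
                 design = "2x2x4", regulator = "EMA", nsims = nsims,
                 setseed = setseed, print = FALSE,
                 details = FALSE)[["Sample size"]]
}
runs <- 5                                 # runs with random seeds
res  <- data.frame(nsims = c(1e4, 1e5))
res$fixed.seed   <- sapply(res$nsims, n.abel, setseed = TRUE)
res$random.seeds <- sapply(res$nsims, function(k)
  paste(replicate(runs, n.abel(k, setseed = FALSE)), collapse = ", "))
print(res, row.names = FALSE)
```

If the estimates with random seeds still scatter at a given number of simulations, that number is too low for a stable sample size.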

  5. Sample size for RSABE (FDA)
    Table A4 [1] gives for a 4-period full replicate study (FDA method), GMR 0.90, CV 70%, and ≥90% power a sample size of only 46, whilst sampleN.RSABE() gives 48.


    Filled circles are with a fixed seed (default) and open circles with random seeds.
    With sampleN.RSABE() and 10,000 simulations I got* once 46 (like The Lászlós), once 48, and thrice 50. In 1,000 runs of 10,000 simulations I got a median of 48 and a range of 44–52. With more simulations we end up with 48 as a stable result.
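A sketch of how the spread of sample sizes at 10,000 simulations might be tabulated. The number of runs (25 instead of 1,000, to keep the run time short) and the variable names are assumptions; since the seeds are random, the counts will vary.

```r
library(PowerTOST)
# Spread of estimated sample sizes for RSABE (FDA), 4-period full
# replicate, CV 70%, GMR 0.90, target power 90%, random seeds.
runs    <- 25                             # number of runs (assumption)
n.rsabe <- replicate(runs,
  sampleN.RSABE(CV = 0.70, theta0 = 0.90, targetpower = 0.90,
                design = "2x2x4", regulator = "FDA", nsims = 1e4,
                setseed = FALSE, print = FALSE,
                details = FALSE)[["Sample size"]])
print(table(n.rsabe))                     # how often each n was obtained
cat("median:", median(n.rsabe),
    " range:", paste(range(n.rsabe), collapse = "-"), "\n")
```

Repeating this with nsims = 1e5 should show whether the spread narrows to a single value.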

  1. Tóthfalusi L, Endrényi L. Sample Sizes for Designing Bioequivalence Studies for Highly Variable Drugs. J Pharm Pharmaceut Sci. 2011;15(1):73–84. Open access.
  2. Zheng C, Wang J, Zhao L. Testing bioequivalence for multiple formulations with power and sample size calculations. Pharmaceut. Stat. 2012;11(4):334–341. doi:10.1002/pst.1522.

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz
