chrisk
☆

2022-08-09 20:55

Posting: # 23209

## Sample size for 3-way crossover [Power / Sample Size]

Dear all,

Thank you to those who created the wonderful PowerTOST package and who are sharing their knowledge in this forum.

My question relates to the sample size calculation for a 3-way crossover (e.g. 3x6x3). I want to demonstrate that both test drugs T1 and T2 are BE to the reference R. The alternative hypothesis is: T1 = R and T2 = R.

When performing the sample size calculation using PowerTOST for 3-way crossover, which alternative is considered? Since there are several ways to formulate the alternatives, I wasn't sure whether the current settings of PowerTOST correspond to the one that is of my interest. If it doesn't, what is your recommendation to deal with my scenario of interest? Any reference will also be very much appreciated.

Another question is related to the number of PK parameters included. Does the sample size calculation consider the number of PK parameters for the BE?

Thank you for your help on this.

Helmut
★★★

Vienna, Austria,
2022-08-10 13:30

@ chrisk
Posting: # 23210

## Sample size for 3-way crossover

Hi Chris,

❝ Thank you to those who created the wonderful PowerTOST package and who are sharing their knowledge in this forum.

Welcome.

❝ […] 3-way crossover (e.g. 3x6x3). I want to demonstrate that both test drugs T1 and T2 are BE to the reference R. The alternative hypothesis is: T1 = R and T2 = R.

❝ When performing the sample size calculation using PowerTOST for 3-way crossover, which alternative is considered? Since there are several ways to formulate the alternatives, I wasn't sure whether the current settings of PowerTOST correspond to the one that is of my interest. If it doesn't, what is your recommendation to deal with my scenario of interest?

Use the script mentioned in the previous post.

❝ Another question is related to the number of PK parameters included. Does the sample size calculation consider the number of PK parameters for the BE?

You have to demonstrate BE for all PK metrics. Therefore, it is a little bit more difficult to show BE for the FDA and China’s CDE (Cmax, AUC0–t, AUC0–∞) than in other jurisdictions, where AUC0–∞ is not required. Since the CV of Cmax is generally larger than that of AUC, this is usually no problem: if you pass BE for Cmax, you will likely pass AUC as well.
You don’t have to be worried about multiplicity issues because the Type I Error is controlled by the Intersection-Union Tests (IUT).* Therefore, you don’t need to adjust the α-level of the tests, i.e., the common 90% CI is just fine. Since you want to demonstrate equivalence of both test treatments to the reference, the same logic applies here. Hence, base the sample size estimation on the worst case, i.e., the PK metric of T1 or T2 where you expect the largest deviation from R and/or which shows the largest CV.
For the ‘Two at a Time’ approach use the argument bal = TRUE:

    library(PowerTOST)
    target <- 0.80 # target power
    alpha  <- 0.05 # no adjustment for IUT
    x      <- data.frame(treatment = rep(c("A", "B"), each = 2),
                         metric    = c("Cmax", "AUC"),
                         theta0    = c(0.94, 0.95,
                                       0.96, 0.97),
                         CV        = c(0.25, 0.20,
                                       0.23, 0.18),
                         n         = NA_integer_, power = NA_real_)
    for (j in 1:nrow(x)) { # preliminary sample sizes for both treatments and metrics
      x$n[j] <- sampleN.TOST(alpha = alpha, CV = x$CV[j], theta0 = x$theta0[j],
                             targetpower = target, print = FALSE)[["Sample size"]]
    }
    CV     <- x$CV[x$n == max(x$n)]     # extract the
    theta0 <- x$theta0[x$n == max(x$n)] # worst case
    # make.ibds() is defined in the script mentioned above
    y      <- make.ibds(alpha = alpha, CV = CV, theta0 = theta0, ntmt = 3,
                        ref = "C", sep = "–", bal = TRUE,
                        details = FALSE, print = FALSE)
    x$n    <- y$n # replace preliminary sample sizes with final one
    for (j in 1:nrow(x)) {
      x$power[j] <- signif(power.TOST(alpha = alpha, CV = x$CV[j],
                                      theta0 = x$theta0[j], n = y$n), 4)
    }
    print(y$rand); cat(y$txt, "\n\n"); print(x, row.names = FALSE, right = FALSE)

       subject seqno sequence IBD 1 IBD 2
    1        1     4      BCA   –CA   BC–
    2        2     5      CAB   CA–   C–B
    3        3     6      CBA   C–A   CB–
    4        4     2      ACB   AC–   –CB
    5        5     2      ACB   AC–   –CB
    6        6     5      CAB   CA–   C–B
    7        7     3      BAC   –AC   B–C
    8        8     3      BAC   –AC   B–C
    9        9     6      CBA   C–A   CB–
    10      10     1      ABC   A–C   –BC
    11      11     4      BCA   –CA   BC–
    12      12     1      ABC   A–C   –BC
    13      13     1      ABC   A–C   –BC
    14      14     6      CBA   C–A   CB–
    15      15     4      BCA   –CA   BC–
    16      16     1      ABC   A–C   –BC
    17      17     5      CAB   CA–   C–B
    18      18     3      BAC   –AC   B–C
    19      19     4      BCA   –CA   BC–
    20      20     3      BAC   –AC   B–C
    21      21     6      CBA   C–A   CB–
    22      22     5      CAB   CA–   C–B
    23      23     2      ACB   AC–   –CB
    24      24     2      ACB   AC–   –CB
    25      25     4      BCA   –CA   BC–
    26      26     3      BAC   –AC   B–C
    27      27     3      BAC   –AC   B–C
    28      28     1      ABC   A–C   –BC
    29      29     1      ABC   A–C   –BC
    30      30     6      CBA   C–A   CB–
    31      31     2      ACB   AC–   –CB
    32      32     2      ACB   AC–   –CB
    33      33     5      CAB   CA–   C–B
    34      34     5      CAB   CA–   C–B
    35      35     4      BCA   –CA   BC–
    36      36     6      CBA   C–A   CB–

    Reference                : C
    Tests                    : A, B
    Sequences                : ABC, ACB, BAC, BCA, CAB, CBA
    Subjects per sequence    : 6 | 6 | 6 | 6 | 6 | 6 (balanced)
    Estimated sample size    : 32
    Achieved power           : 0.8180
    Adjustment to obtain period-balance of IBDs
     Adjusted sample size    : 36
     Achieved power          : 0.8587
    Randomized               : 2022-08-10 11:29:39 CEST
    Seed                     : 1823948

     treatment metric theta0 CV   n  power
     A         Cmax   0.94   0.25 36 0.8587
     A         AUC    0.95   0.20 36 0.9751
     B         Cmax   0.96   0.23 36 0.9541
     B         AUC    0.97   0.18 36 0.9977

Sorry, you have to live with A, B, and C for your T1, T2, and R. Maybe I will modify the script later. No promises. An ugly quick-shot at the end.

For period-balance of the IBDs we have to round 32 up to the next multiple of 6. Hence, we gain power for the worst case metric. Of course, power for the other metrics is pretty high. At least for those you have some room to navigate and are protected against surprises.

Only if you would have an OR-conjunction (you are happy that either of the tests passes), you would have to use alpha <- 0.025 and assess the study with 95% CIs because you get two chances (see there). In the example you would need 42 subjects.

• Berger RL, Hsu JC. Bioequivalence Trials, Intersection-Union Tests and Equivalence Confidence Sets. Stat Sci. 1996; 11(4): 283–302. JSTOR:2246021. Open access.

    library(PowerTOST)
    library(randomizeBE)
    make.equal <- function(n, ns) {
      return(as.integer(ns * (n %/% ns + as.logical(n %% ns))))
    }
    T      <- c("T1", "T2")
    R      <- c("R")
    alpha  <- 0.05
    theta0 <- 0.94
    CV     <- 0.25
    target <- 0.80
    sep    <- "–"
    n      <- sampleN.TOST(alpha = alpha, CV = CV, theta0 = theta0,
                           targetpower = target, print = FALSE)[["Sample size"]]
    seqs   <- williams(ntmt = 3)
    n      <- make.equal(n, length(seqs))
    seqs   <- gsub("", "\\1 \\2", seqs)
    repeat {
      rand  <- RL4(nsubj = n, seqs = seqs, randctrl = FALSE)$rl
      trts  <- sub("[[:blank:]]+$", "",
                   sort(unique(unlist(strsplit(
                     Reduce(function(x, y) paste0(x, y), seqs), "")))))
      trts  <- trts[nzchar(trts)]
      if (sum(c(nchar(T), nchar(R))) > length(trts) * 2)
        stop("None of the treatments must be coded with more than two characters.")
      s     <- Reduce(function(x, y) paste0(x, y), trts)
      refs  <- substr(s, nchar(s), nchar(s))
      tests <- trts[!trts %in% refs]
      n.ibd <- length(tests) * length(refs)
      for (j in 1:n.ibd) {
        rand[[paste0("IBD.", j)]] <- NA
      }
      for (j in 1:nrow(rand)) {
        c <- 3
        for (k in seq_along(refs)) {
          for (m in seq_along(tests)) {
            c          <- c + 1
            excl       <- tests[!tests == tests[m]]
            excl       <- c(excl, refs[!refs == refs[k]])
            excl       <- paste0("[", paste(excl, collapse = ", "), "]")
            rand[j, c] <- gsub(excl, sep, rand$sequence[j])
            rand[j, c] <- gsub(tests[m], T[m], rand[j, c])
            rand[j, c] <- gsub(refs[k], R[k], rand[j, c])
            rand[j, c] <- gsub("   ", "  ", rand[j, c])
          }
        }
      }
      for (j in seq_along(refs)) {
        rand$sequence <- gsub(refs[j], R[j], rand$sequence)
      }
      for (j in seq_along(tests)) {
        rand$sequence <- gsub(tests[j], T[j], rand$sequence)
      }
      checks  <- NA
      ibd.seq <- as.data.frame(
                   matrix(data = NA,
                          nrow = n.ibd,
                          ncol = 4, byrow = TRUE,
                          dimnames = list(names(rand)[4:ncol(rand)],
                                          paste0(rep(c("seq.", "n."), 2),
                                                 rep(1:2, each = 2)))))
      for (j in 1:n.ibd) {
        ibd                 <- gsub("[^[:alnum:], ]", "", rand[3 + j])
        ibd                 <- gsub("c", "", ibd)
        ibd                 <- gsub(" ", "", unlist(strsplit(ibd, ",")))
        ibd.seq[j, c(1, 3)] <- sort(unique(ibd))
        ibd.seq[j, c(2, 4)] <- c(length(ibd[ibd == sort(unique(ibd))[1]]),
                                 length(ibd[ibd == sort(unique(ibd))[2]]))
        checks[j]           <- ibd.seq[j, 2] == ibd.seq[j, 4]
      }
      if (sum(checks) == length(trts) - 1) break
    }
    print(rand, row.names = FALSE)

     subject seqno sequence   IBD.1   IBD.2
           1     2  T1 R T2  T1 R –  – R T2
           2     6  R T2 T1  R – T1  R T2 –
           3     1  T1 T2 R  T1 – R  – T2 R
           4     2  T1 R T2  T1 R –  – R T2
           5     4  T2 R T1  – R T1  T2 R –
           6     5  R T1 T2  R T1 –  R – T2
           7     5  R T1 T2  R T1 –  R – T2
           8     1  T1 T2 R  T1 – R  – T2 R
           9     3  T2 T1 R  – T1 R  T2 – R
          10     6  R T2 T1  R – T1  R T2 –
          11     3  T2 T1 R  – T1 R  T2 – R
          12     4  T2 R T1  – R T1  T2 R –
          13     2  T1 R T2  T1 R –  – R T2
          14     4  T2 R T1  – R T1  T2 R –
          15     1  T1 T2 R  T1 – R  – T2 R
          16     6  R T2 T1  R – T1  R T2 –
          17     1  T1 T2 R  T1 – R  – T2 R
          18     2  T1 R T2  T1 R –  – R T2
          19     5  R T1 T2  R T1 –  R – T2
          20     4  T2 R T1  – R T1  T2 R –
          21     3  T2 T1 R  – T1 R  T2 – R
          22     3  T2 T1 R  – T1 R  T2 – R
          23     5  R T1 T2  R T1 –  R – T2
          24     6  R T2 T1  R – T1  R T2 –
          25     6  R T2 T1  R – T1  R T2 –
          26     2  T1 R T2  T1 R –  – R T2
          27     4  T2 R T1  – R T1  T2 R –
          28     1  T1 T2 R  T1 – R  – T2 R
          29     1  T1 T2 R  T1 – R  – T2 R
          30     5  R T1 T2  R T1 –  R – T2
          31     2  T1 R T2  T1 R –  – R T2
          32     5  R T1 T2  R T1 –  R – T2
          33     3  T2 T1 R  – T1 R  T2 – R
          34     4  T2 R T1  – R T1  T2 R –
          35     3  T2 T1 R  – T1 R  T2 – R
          36     6  R T2 T1  R – T1  R T2 –

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
chrisk
☆

2022-08-10 17:24

@ Helmut
Posting: # 23211

## Sample size for 3-way crossover

Thanks again, Helmut, for sharing your wisdom so quickly and in detail.

Since the alternative is based on T1 = R OR T2 = R, I am tempted to think that a larger sample size is required for the alternative T1 = R AND T2 = R. Is that indeed the case? If so, would you have any recommendation on how to approach the sample size estimation?

Thanks again, I very much appreciate your help!
Helmut
★★★

Vienna, Austria,
2022-08-10 17:58

@ chrisk
Posting: # 23212

## AND or OR, that’s the question

Hi Chris,

❝ Thanks again, Helmut, for sharing your wisdom so quickly and in detail.

When you know something, say what you know.
When you don’t know something, say that you don’t know.
That is knowledge.
Confucius

I know much less about anything than I know about something. Wisdom is not my thing.

❝ Since the alternative is based on T1 = R OR T2 = R, …

Are you trying to confuse me? In your OP you stated:

❝ ❝ ❝ I want to demonstrate that both test drugs T1 and T2 are BE to the reference R.

That’s an AND-conjunction, right?

❝ … I am tempted to think that a larger sample size is required for the alternative T1 = R AND T2 = R. Is that indeed the case?

Correct! In an OR-conjunction in your case you get at least two chances (either T1 or T2 passes or both). In the simplest case* you have to use Bonferroni’s adjustment
$$\alpha_\text{adj}=\alpha/k\small{,}\tag{1}$$ where $$\alpha$$ is the nominal level of the test and $$\small{k}$$ the number of tests. Then the family-wise error rate is controlled with
$$FWER=1-\left(1-\alpha_\text{adj}\right)^k\tag{2}$$
You have to adjust only for T1 | T2 = R (two tests).
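As a quick numerical check of (1) and (2), here is a minimal base-R sketch. It assumes k = 2 tests at the nominal α = 0.05; the variable names are only illustrative and not part of PowerTOST.

```r
alpha     <- 0.05                   # nominal level of the test
k         <- 2                      # number of tests (T1 vs R, T2 vs R)
alpha.adj <- alpha / k              # Bonferroni's adjustment, eq. (1)
fwer.adj  <- 1 - (1 - alpha.adj)^k  # eq. (2) with the adjusted level
fwer.raw  <- 1 - (1 - alpha)^k      # eq. (2) if we would not adjust
print(round(c(alpha.adj       = alpha.adj,
              FWER.adjusted   = fwer.adj,
              FWER.unadjusted = fwer.raw), 6))
```

With the adjustment the FWER is 0.049375, i.e., just below 0.05. Without it the two chances would inflate it to 0.0975.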

❝ If so, would you have any recommendation on how to approach the sample size estimation?

See the last paragraph of my previous post. Call the script with alpha = 0.025.

    Reference                : C
    Tests                    : A, B
    Sequences                : ABC, ACB, BAC, BCA, CAB, CBA
    Subjects per sequence    : 7 | 7 | 7 | 7 | 7 | 7 (balanced)
    Estimated sample size    : 40
    Achieved power           : 0.8159
    Adjustment to obtain period-balance of IBDs
     Adjusted sample size    : 42
     Achieved power          : 0.8353
    Randomized               : 2022-08-10 15:56:23 CEST
    Seed                     : 9874408

     treatment metric theta0 CV   n  power
     A         Cmax   0.94   0.25 42 0.8353
     A         AUC    0.95   0.20 42 0.9731
     B         Cmax   0.96   0.23 42 0.9489
     B         AUC    0.97   0.18 42 0.9980

• There are more powerful approaches (e.g., Bonferroni-Holm, hierarchical testing,…) but the chances are close to nil that assessors dealing with BE know/understand them.
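For readers without the full script, the sample sizes of the AND- and OR-conjunction can be approximated with PowerTOST alone. This is only a sketch under two assumptions: each comparison is powered like a 2×2 crossover (the default design of sampleN.TOST() and power.TOST()), and the estimate is rounded up to the next multiple of the six Williams’ sequences. The helper next.multiple() is hypothetical and not taken from the script.

```r
library(PowerTOST)
# hypothetical helper: round n up to the next multiple of ns
next.multiple <- function(n, ns) as.integer(ns * ceiling(n / ns))
CV     <- 0.25 # worst case of the example (Cmax of treatment A)
theta0 <- 0.94
res    <- data.frame(conjunction = c("AND", "OR"),
                     alpha       = c(0.05, 0.025), # OR: Bonferroni for two tests
                     n = NA_integer_, n.adj = NA_integer_, power = NA_real_)
for (j in 1:nrow(res)) {
  res$n[j]     <- sampleN.TOST(alpha = res$alpha[j], CV = CV, theta0 = theta0,
                               targetpower = 0.80,
                               print = FALSE)[["Sample size"]]
  res$n.adj[j] <- next.multiple(res$n[j], 6)
  res$power[j] <- signif(power.TOST(alpha = res$alpha[j], CV = CV,
                                    theta0 = theta0, n = res$n.adj[j]), 4)
}
print(res, row.names = FALSE)
```

If the assumptions hold, the rows should agree with the outputs reported in this thread (32 rounded up to 36 for AND, 40 rounded up to 42 for OR).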

chrisk
☆

2022-08-10 19:46

@ Helmut
Posting: # 23213

## AND or OR, that’s the question

Thank you Helmut. My latest question was due to my imprudence. My alternative of interest is the one with "AND", indeed.

But this exercise was still helpful since I naively expected the "OR" alternative to result in a smaller sample size compared with the "AND" alternative. I failed to realise that a type I error comes into play for the "OR" alternative, which explains a larger sample size (n=42 vs n=36).

Thank you also for modifying the R program. I sincerely appreciate your time and efforts.

And lastly, thank you for correcting me on the difference between wisdom and knowledge. Thank you for sharing your knowledge!

Have a wonderful day!
Helmut
★★★

Vienna, Austria,
2022-08-10 22:59

@ chrisk
Posting: # 23214

## AND or OR, that’s the question

Hi Chris,

❝ But this exercise was still helpful since I naively expected the "OR" alternative to result in a smaller sample size compared with the "AND" alternative. I failed to realise that a type I error comes into play for the "OR" alternative, which explains a larger sample size (n=42 vs n=36).

In my workshops I usually give this example:
1. You toss a single ideal coin once and place a bet on heads (or tails, doesn’t matter). What’s the chance to win? Obviously ½.
2. You toss it a second time and place the same bet as before. The chance to win this bet is again ½ (coins have a short memory span).
What is the chance to win at least one of the bets?

❝ Thank you also for modifying the R program. I sincerely appreciate your time and efforts.

Welcome. I suggest using the original one because it is more flexible and shows the seed for reproducibility / documentation.

Answers based on the cumulative probability of the binomial distribution (hint: Bernoulli process). In R:
1. pbinom(0, 1, 1/2) returns [1] 0.5
2. pbinom(1, 2, 1/2) returns [1] 0.75
Trivial: The more often you try, the higher the chance to win at least one of the bets. Therefore, coming back to BE: In the OR-conjunction you have to adjust the level of the test.
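The coin example extends to any number of tosses. A minimal base-R sketch, written as the complement of ‘no win at all’, which also works for chances other than ½ (for p = ½ the pbinom() calls above give the same numbers by symmetry):

```r
p <- 1/2 # chance to win a single bet
k <- 1:5 # number of tosses
# P(at least one win in k tosses) = 1 - P(no win in k tosses)
at.least.one <- 1 - dbinom(0, size = k, prob = p)
print(data.frame(tosses = k, at.least.one = at.least.one), row.names = FALSE)
```

For one to three tosses this gives 0.5, 0.75, and 0.875.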

chrisk
☆

2022-08-11 00:00

@ Helmut
Posting: # 23215

## AND or OR, that’s the question

Hi Helmut,
Thanks again for this explanation.

After some thought, I am wondering if the sample size estimation only considers one comparison. I found the following old post regarding this topic.

If we are considering both T1=R and T2=R, perhaps, we need to look at the bivariate distribution of the two test statistics when looking at the power. And these should be correlated as well, which makes things complicated. I am maybe missing something and was wondering if you could enlighten me on this subject.

Thank you sincerely!
Helmut
★★★

Vienna, Austria,
2022-08-11 17:25

@ chrisk
Posting: # 23216

## AND or OR, that’s the question

Hi Chris,

❝ After some thought, I am wondering if the sample size estimation only considers one comparison. I found the following old post regarding this topic.

Oh dear, that one! Opening Pandora’s box. However, only relevant if one would assess the study with the ‘All at Once’ approach (i.e., an ANOVA of pooled data). The jury left the courtroom 2½ years ago and hasn’t returned ever since…

❝ If we are considering both T1=R and T2=R, perhaps, we need to look at the bivariate distribution of the two test statistics when looking at the power. And these should be correlated as well, which makes things complicated. I am maybe missing something and was wondering if you could enlighten me on this subject.

In the ‘Two at a Time’ approach T1 and T2 are treated independently. Hence, you have two separate distributions. You see that also in the evaluation, where you obtain two residual variance estimates.
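What ‘two separate distributions’ means in the evaluation can be illustrated with simulated data. The sketch below is not the script used in this thread. All numbers (T/R-ratios, standard deviations, seed) are made up, and the model is simplified to fixed effects of subject, period, and treatment (with subjects as fixed effects the sequence term is aliased and is therefore omitted).

```r
set.seed(123456) # arbitrary seed, only for reproducibility
seqs <- c("ABC", "ACB", "BAC", "BCA", "CAB", "CBA") # C is the reference
n    <- 36
dat  <- expand.grid(period = 1:3, subject = 1:n)
dat$sequence  <- seqs[(dat$subject - 1) %% 6 + 1]
dat$treatment <- substr(dat$sequence, dat$period, dat$period)
ratio     <- c(A = 0.94, B = 0.96, C = 1) # made-up true T/R-ratios
dat$logPK <- log(100) + rnorm(n, sd = 0.40)[dat$subject] +
             unname(log(ratio[dat$treatment])) + rnorm(nrow(dat), sd = 0.25)

two.at.a.time <- function(test, ref = "C", data = dat) {
  ibd <- data[data$treatment %in% c(test, ref), ] # exclude the other test
  ibd$treatment <- factor(ibd$treatment, levels = c(ref, test))
  fit  <- lm(logPK ~ factor(subject) + factor(period) + treatment, data = ibd)
  parm <- paste0("treatment", test)
  ci   <- confint(fit, parm = parm, level = 0.90)
  c(PE    = unname(exp(coef(fit)[parm])),
    lower = exp(ci[1]), upper = exp(ci[2]),
    df    = df.residual(fit),
    MSE   = sigma(fit)^2) # residual variance of this IBD only
}
res <- rbind(A = two.at.a.time("A"), B = two.at.a.time("B"))
print(round(res, 4))
```

Each comparison gets its own model and hence its own residual variance (column MSE), unlike a pooled analysis where one common variance would be used for both.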

chrisk
☆

2022-08-12 18:02

@ Helmut
Posting: # 23221

## AND or OR, that’s the question

Thanks again for your help on this, Helmut.

❝ Oh dear, that one! Opening Pandora’s box. However, only relevant if one would assess the study with the ‘All at Once’ approach (i.e., an ANOVA of pooled data). The jury left the courtroom 2½ years ago and hasn’t returned ever since…

And my apologies for opening Pandora's box! I will pretend that I never wrote this.

❝ In the ‘Two at a Time’ approach T1 and T2 are treated independently. Hence, you have two separate distributions. You see that also in the evaluation, where you obtain two residual variance estimates.

Here, I still am a bit confused. Since my alternative is T1=R AND T2=R, I need to carry out 2 tests. Both have to be rejected at a given significance (without adjustment, say 0.05) and jointly, these 2 tests have to achieve a given power (say 0.8). Here is where I am confused:

1) The 2 tests must jointly have a power > 0.8. For illustration, if we assume independence of these 2 tests (which is clearly not the case but for the sake of illustration), we want to achieve the following:
power(test1, test2) = power(test1) x power(test2) = 0.8

If we further assume equal power for test 1 and test 2, then we need sqrt(0.8) ≈ 0.89 for each test. When computing the sample size in PowerTOST, I am wondering whether the power argument is for the joint tests, or for an individual test. If it's specified at an individual level, the joint power is 0.8 x 0.8 = 0.64.

2) I understand that T1 and T2 are treated independently. But are they assumed to be independent from R, as well?
Even if T1, T2 and R are pairwise independent, R appears in both test statistics so the two tests will show correlation. Is this correlation taken into account when computing the sample size?

In my naive opinion, I feel that, under the alternative which stipulates BE, the PK parameters should be correlated between treatments and this correlation should be taken into account.

Helmut
★★★

Vienna, Austria,
2022-08-17 17:04

@ chrisk
Posting: # 23224

## The unknown ρ

Hi Chris,

❝ ❝ Oh dear, that one! Opening Pandora’s box.

❝ And my apologies for opening Pandora's box! I will pretend that I never wrote this.

❝ Here, I still am a bit confused. Since my alternative is T1=R AND T2=R, I need to carry out 2 tests. Both have to be rejected at a given significance (without adjustment, say 0.05) and jointly, these 2 tests have to achieve a given power (say 0.8). Here is where I am confused:

❝ 1) The 2 tests must jointly have a power > 0.8. For illustration, if we assume independence of these 2 tests (which is clearly not the case but for the sake of illustration), we want to achieve the following:

❝ power(test1, test2) = power(test1) x power(test2) = 0.8

❝ If we further assume equal power for test 1 and test 2, then we need sqrt(0.8) ≈ 0.89 for each test. When computing the sample size in PowerTOST, I am wondering whether the power argument is for the joint tests, or for an individual test. If it's specified at an individual level, the joint power is 0.8 x 0.8 = 0.64.

That’s a fallacy. I fell into this trap myself. See the post dealing with overall power of multiple studies.

❝ 2) I understand that T1 and T2 are treated independently. But are they assumed to be independent from R, as well?

❝ Even if T1, T2 and R are pairwise independent, R appears in both test statistics so the two tests will show correlation.

Some, yes. To which degree is unknown. Consider this: AUC is the PK metric of extent of absorption and Cmax the one of rate of absorption. Whereas the former in comparative BA depends only on the fractions absorbed ƒ (we assume that the doses are identical and clearances are constant), the latter is a composite metric, i.e., depends on ƒ, ka, and ke. Hence, it is clear that they are correlated, but how much?
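To make the dependence on ƒ, ka, and ke concrete, here is a small base-R sketch. It assumes a one-compartment model with first-order absorption and elimination, which is an assumption for illustration only. The function pk() and all parameter values are made up.

```r
# one-compartment model, first-order absorption and elimination (assumed)
pk <- function(f, ka, ke, D = 100, V = 10) {
  tmax <- log(ka / ke) / (ka - ke)
  Cmax <- f * D / V * ka / (ka - ke) * (exp(-ke * tmax) - exp(-ka * tmax))
  AUC  <- f * D / (V * ke) # = f * D / CL
  c(AUC = AUC, Cmax = Cmax, tmax = tmax)
}
ref <- pk(f = 0.80, ka = 1.0, ke = 0.1)
t1  <- pk(f = 0.80, ka = 0.5, ke = 0.1) # slower absorption, same extent
t2  <- pk(f = 0.76, ka = 1.0, ke = 0.1) # lower extent, same rate
print(round(rbind("T1/R" = t1 / ref,
                  "T2/R" = t2 / ref)[, c("AUC", "Cmax")], 4))
```

A change in ka alone leaves the AUC-ratio at 1 but lowers the Cmax-ratio, whereas a change in ƒ alone shifts both ratios by the same factor (here 0.95). That is the source of the correlation, and also the reason why it is not perfect.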

(Almost) everything is possible in PowerTOST.

    library(PowerTOST)
    # CV and theta0 must be two-element vectors, values can be different
    # below identical ones for simplicity
    CV     <- c(0.20, 0.20) # assumed CVs
    theta0 <- c(0.95, 0.95) # assumed T/R-ratios
    target <- 0.80
    design <- "2x2"
    rho    <- seq(0.5, 1, 0.1)
    res    <- data.frame(rho = rho, n = NA_integer_, power = NA_real_)
    for (j in seq_along(rho)) {
      res[j, 2:3] <- sampleN.2TOST(CV = CV, theta0 = theta0, design = design,
                                   targetpower = target, rho = rho[j],
                                   print = FALSE)[7:8]
    }
    print(res, row.names = FALSE)

     rho  n   power
     0.5 24 0.82667
     0.6 24 0.83335
     0.7 22 0.80145
     0.8 22 0.81385
     0.9 22 0.82943
     1.0 20 0.83606

Looks great on paper. The higher the correlation, the higher the power or the other way ’round, the lower the sample size for similar power.

❝ Is this correlation taken into account when computing the sample size?

❝ In my naive opinion, I feel that, under the alternative which stipulates BE, the PK parameters should be correlated between treatments and this correlation should be taken into account.

As you rightly wrote, since R occurs in both tests, likely there is some correlation indeed. But (but!) from where and how will you get it?
In the exercise of two metrics we failed to establish a consistent value (see this lengthy thread for our desperate attempts)… With T1=R AND T2=R we are fishing in the dark. You would have to perform a couple of studies (at least three) to come up with an estimate of $$\rho$$ which you could use in the sample size estimation of the fourth.
