Rocco_M
☆

Mexico,
2019-09-03 18:45

Posting: # 20537
Views: 3,087

## Input for CV when small first-in-man has been run [Power / Sample Size]

Hi all,

I have a question about the initial CV input for a sample size calculation comparing two formulations when all that has been run is a single study, say, a first-in-man (FIM). It is not correct to use sd/mean of the first formulation as your CV, but suppose it is all you have: you measured log(AUC) in ten patients for a single formulation (i.e., only for the reference). What would you input as the CV in order to estimate the sample size of a subsequent (say, parallel) study comparing this formulation with another one not yet available?

I see people will typically enter sample sd / sample mean of the single formulation, but I feel this is incorrect.
Helmut
★★★

Vienna, Austria,
2019-09-05 14:05

@ Rocco_M
Posting: # 20543
Views: 2,795

## Geometric mean and CV

Hi Rocco,

» I see people will typically enter sample sd / sample mean of the single formulation, but I feel this is incorrect.

You are right. PK metrics like AUC and Cmax follow a lognormal distribution and hence, arithmetic means and their SDs / CVs are wrong (i.e., are positively biased).

If you plan for a parallel design you should use the geometric CV.
$$\overline{x}_{log}=\frac{\sum \log(x_i)}{n}$$ $$\overline{x}_{geo}=\sqrt[n]{x_1x_2\ldots x_n}=e^{\overline{x}_{log}}$$ $$s_{log}^{2}=\frac{\sum (\log(x_i)-\overline{x}_{log})^2}{n-1}$$ $$CV_{log}=\sqrt{e^{s_{log}^{2}}-1}$$ Only if you don’t have access to the raw data would you need simulations.
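These formulas translate directly into a few lines of base R. A minimal sketch (the AUC values below are made up purely for illustration):

```r
# Hypothetical AUC values from a single arm (n = 10); any positive PK metric works
x        <- c(95, 110, 78, 130, 102, 88, 115, 99, 120, 85)
x.log    <- log(x)
mean.log <- mean(x.log)            # mean of the logs
mean.geo <- exp(mean.log)          # geometric mean = exp(mean of logs)
s2.log   <- var(x.log)             # log-scale variance (denominator n - 1)
CV.geo   <- sqrt(exp(s2.log) - 1)  # geometric CV
# Note: sd(x) / mean(x) (the arithmetic CV) will generally differ from CV.geo
```

The geometric mean computed this way agrees with the n-th root of the product of the values, which is a handy sanity check.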

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Rocco_M
☆

Mexico,
2019-09-05 23:49

@ Helmut
Posting: # 20545
Views: 2,738

## Geometric mean and CV

Thanks. But what I do not understand is: is it even reasonable to use the geometric CV for the reference? Isn’t the CV you want to input into a sample size calculation the CV corresponding to the difference of Test and Reference? Since you do not have the Test group here, I am a bit confused as to what using the geometric CV of only the reference tells you. In your slides, you have a line that says “if you have only mean and sd of the reference, a pilot study is unavoidable.”

What am I missing?

Thanks.

Helmut
★★★

Vienna, Austria,
2019-09-06 00:15

@ Rocco_M
Posting: # 20546
Views: 2,763

## Geometric mean and CV

Hi Rocco,

» […] is it even reasonable to use the geometric CV for the reference? Isn’t the CV you want to input into a sample size calculation the CV corresponding to the difference of Test and Reference? Since you do not have the Test group here, I am a bit confused as to what using the geometric CV of only the reference tells you. In your slides, you have a line that says “if you have only mean and sd of the reference, a pilot study is unavoidable.”
»
» What am I missing?

Oh dear, my slides always give only half of the picture…
Let’s start from a 2×2×2 crossover. We have the within-subject variabilities of T and R (CVwT and CVwR). Since this is not a replicate design, they are not accessible and are pooled into the common CVw.1 One of the assumptions of the ANOVA is identical variances. If they are not truly equal (say, CVwT < CVwR) the CI is inflated: the “good” T is punished by the “bad” R.
It is similar in a parallel design. You can assume that CVwT = CVwR and CVbT = CVbR and therefore use the (pooled, total) CVp of your FIM study. Here it is the other way ’round (you have the CVp of R). If the CV of T is higher: bad luck, power compromised. If both are ~equal: fine. If it is lower, you gain power. There is no free lunch.
If you are cautious: Pilot study or a Two-Stage-Design.2 For the latter I recommend the function power.tsd.p() of package Power2Stage for R. A reasonable stage 1 sample size is ~80% of what you estimate with sampleN.TOST(alpha=0.05...) of package PowerTOST and the CVp of your FIM.
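If PowerTOST is not at hand, the planning step can be sketched in base R with the noncentral-t approximation of TOST power (an approximation, not the exact Owen’s Q method the package uses; the CVp of 0.40 and theta0 of 0.95 below are made-up planning values, not from the thread):

```r
# Approximate power of TOST for a 2-group parallel design (nct approximation)
power.par <- function(CV, theta0, n, alpha = 0.05,
                      theta1 = 0.80, theta2 = 1.25) {
  s2    <- log(CV^2 + 1)     # total (pooled) variance on the log scale
  se    <- sqrt(s2 * 4 / n)  # SE of the difference; n = total, two equal groups
  df    <- n - 2
  tcrit <- qt(1 - alpha, df)
  ncp1  <- (log(theta0) - log(theta1)) / se
  ncp2  <- (log(theta0) - log(theta2)) / se
  max(0, pt(-tcrit, df, ncp = ncp2) - pt(tcrit, df, ncp = ncp1))
}
# Smallest even total n reaching >= 80% power, assuming CVp = 0.40, theta0 = 0.95
n <- 4
while (power.par(CV = 0.40, theta0 = 0.95, n = n) < 0.80) n <- n + 2
n.stage1 <- 2 * ceiling(0.80 * n / 2)  # ~80% of the fixed-design n as stage 1
```

sampleN.TOST(alpha = 0.05, CV = 0.40, theta0 = 0.95, targetpower = 0.80, design = "parallel") should give essentially the same total n; power.tsd.p() of Power2Stage then handles the interim analysis.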

1. Of course, the same holds for between-subject variabilities: CVbT and CVbR pooled to CVb.
2. Fuglsang A. Sequential Bioequivalence Approaches for Parallel Design. AAPS J. 2014; 16(3): 373–8. doi:10.1208/s12248-014-9571-1. free resource.

Cheers,
Helmut Schütz

Rocco_M
☆

Mexico,
2019-09-06 17:49

@ Helmut
Posting: # 20548
Views: 2,675

## Geometric mean and CV

Thanks. So basically your analysis follows from the fact that the variance of the difference of T and R equals the sum of the variance of T and the variance of R, correct? And you are using the geometric CV as the estimate of CVp for R?

Helmut
★★★

Vienna, Austria,
2019-09-06 18:15

@ Rocco_M
Posting: # 20549
Views: 2,689

## Geometric mean and CV

Hi Rocco,

» So basically your analysis follows from the fact that the variance of the difference of T and R equal the sum of the variance of T and the variance of R, correct?

Well, you have four variance components (s²wR, s²wT, s²bT, s²bR). Then
1. Full replicate designs
All are identifiable.
2. 2×2×2 crossover (balanced and complete for simplicity – otherwise, weighting is required)
s²w = (s²wR + s²wT)/2 and s²b = (s²bT + s²bR)/2.
3. 2 group parallel
Only the pooled (total) s²p. With a tricky mixed-effects model you could get s²pT and s²pR.
4. One treatment (FIM)
Only s²p.
Hence, if you want to plan #3 based on #4 you have to assume that the variances (within, between) of T and R are at least similar.
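To make the bookkeeping in cases #2–#4 concrete, here is a small sketch with made-up variance components (all on the log scale; the numbers are illustrative only):

```r
# Hypothetical variance components (log scale)
s2.wR <- 0.04; s2.wT <- 0.06   # within-subject
s2.bR <- 0.10; s2.bT <- 0.12   # between-subject
# Case 2 (balanced 2x2x2): only the averages are identifiable
s2.w <- (s2.wR + s2.wT) / 2
s2.b <- (s2.bR + s2.bT) / 2
# Cases 3 and 4 (parallel / FIM): only the total (pooled) variance
s2.p <- s2.w + s2.b
CV.p <- sqrt(exp(s2.p) - 1)    # the CV actually estimable from a FIM
```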

» And you are using the geometric CV as the estimate of CVp for R?

Yes.

Cheers,
Helmut Schütz

ElMaestro
★★★

Belgium?,
2019-09-06 19:03

@ Helmut
Posting: # 20550
Views: 2,677

## Geometric mean and CV

Hi Hötzi,

» 3. 2 group parallel
»    Only the pooled (total) s²p. With a tricky mixed-effects model you could
»    get s²pT and s²pR.

2 group parallel: This tricky model may not be so tricky after all, but may be overkill. I believe you will get the same result as you would obtain from doing a plain sample standard deviation on T and R subsets, respectively.

I could be wrong, but...

Best regards,
ElMaestro

"Pass or fail" (D. Potvin et al., 2008)
Helmut
★★★

Vienna, Austria,
2019-09-07 09:57

@ ElMaestro
Posting: # 20551
Views: 2,636

## Geometric mean and CV

Hi ElMaestro,

» » 3. 2 group parallel
» »    Only the pooled (total) s²p. With a tricky mixed-effects model you
» »    could get s²pT and s²pR.
»
» This tricky model […] may be overkill. I believe you will get the same result as you would obtain from doing a plain sample standard deviation on T and R subsets, respectively.

KISS. You are absolutely right.

Cheers,
Helmut Schütz

Rocco_M
☆

Mexico,
2019-09-09 13:31
(edited by Rocco_M on 2019-09-09 13:51)

@ Helmut
Posting: # 20557
Views: 2,571

## Geometric mean and CV

My apologies. I am not exactly sure I understand 100% what the other poster is saying here. If you plan a parallel study based on a FIM: if you have one pooled variance from the FIM, is it enough to enter into a sample size calculation? Or does one also need an assumption for the pooled variance of the other arm, and then take the calculated variance of the difference based on the FIM variance and the assumed pooled variance of the other arm? This is what I do not understand.
-RoccoM.
Mexico

Helmut
★★★

Vienna, Austria,
2019-09-09 14:09

@ Rocco_M
Posting: # 20558
Views: 2,544

## Assumptions…

Hi Rocco,

see what I wrote above.
Since you have no idea about the new formulation, you indeed have to assume that the variances of T and R are at least similar. If you don’t like this assumption, there are just two ways out:
1. Pilot study or
2. Two-stage design.
Even if you are wary and walk this road, you have to assume things. The CV and the T/R-ratio are estimates and therefore uncertain. Relying on assumptions is part of the game. Only death is certain.

Cheers,
Helmut Schütz

Rocco_M
☆

Mexico,
2019-09-09 16:35

@ Helmut
Posting: # 20560
Views: 2,534

## Assumptions…

Gracias, and sorry for the confusion. Here is what I think I am not getting. When you run a FIM, you do not have two populations. You have a treatment and (perhaps) a control. Say you have a geometric CV, CV1, calculated from that one treatment sample in the FIM. [It is not a pooled variance, is it? There is only one sample.] If you then use it to design a parallel follow-up, do you need to assume that the CV of the other arm, call it CV2, is equal to CV1, and then enter CVp = pooled(CV1, CV2) into the sample size formula, e.g., sampleN.TOST()?

Many apologies if I am missing the point. I do not seem to understand which components of the pooled variance go into the sample size computation.
Helmut
★★★

Vienna, Austria,
2019-09-09 17:19

@ Rocco_M
Posting: # 20562
Views: 2,517

## Assumptions…

¡Hola Rocco!

» When you run a FIM, you do not have two populations. You have a treatment and (perhaps) a control. So if you have a geometric CV, say, CV1 calculated from that one treatment sample in the FIM.

Correct, so far.

» It is not pooled variance, is it? There is only one sample…

Here you err. Although we have just one sample, we have two variances: between-subjects (that’s obvious) and within-subjects (not so obvious). The fact that you administered the drug on one occasion does not mean the within-subject variance disappears. Administer the same drug the next day and you will get different concentrations. Hence, within-subject variance is always there; we simply cannot estimate it. We get only the total (or pooled) variance. Granted, generally CVb > CVw, but there are cases where it is the other way ’round. We simply don’t know.
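A toy simulation makes this tangible (the variance components are made up; only their sum is recoverable from a single occasion):

```r
# Within-subject variance does not vanish with one administration;
# it is merely confounded with the between-subject variance in the total
set.seed(123)
n    <- 1e4
s2.b <- 0.10; s2.w <- 0.04
subj <- rnorm(n, 0, sqrt(s2.b))          # subject-specific log-levels
day1 <- subj + rnorm(n, 0, sqrt(s2.w))   # single administration per subject
v.tot <- var(day1)                       # estimates s2.b + s2.w, nothing finer
```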

» … there and then use it to design a parallel follow-up, do you need to assume that the CV of the other arm, call it CV2, is equal to CV1,…

Yes.

» … and then CVp = pooled(CV1, CV2) as input into the sample size formula, e.g., sampleN.TOST()?

Wait a minute. CV2 is unknown until we performed the parallel study. Therefore, simply plug in the one you found in the FIM study.

» […] I do not seem to understand what are the components of the pooled variance that go into the sample size computation.

See my previous post, esp. case #4. We have one variance, which is pooled from s²w and s²b. Maybe the terminology is confusing. Pooling does not mean that we have the individual components. We know only the result, and there is an infinite number of combinations which give the same result. However, that’s not important. In planning the parallel design you need only CV1 and have to assume that CV2 = CV1.
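The “infinite number of combinations” is easy to see numerically: every split of a fixed total variance into between- and within-components gives the same CVp (the total of 0.10 is an arbitrary example):

```r
s2.p <- 0.10                          # fixed total (pooled) log-scale variance
s2.b <- seq(0, s2.p, length.out = 6)  # some possible between-subject components
s2.w <- s2.p - s2.b                   # the within-subject component is the rest
CV.p <- sqrt(exp(s2.b + s2.w) - 1)    # identical for every split
```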

Cheers,
Helmut Schütz

Rocco_M
☆

Mexico,
2019-09-13 21:28

@ Helmut
Posting: # 20595
Views: 2,412

## Assumptions…

Thank you so much, I think it makes sense. Just one last question:
Where does the formula on slide 10.83 of bebac.at/lectures/Leuven2013WS2.pdf for the CI of a parallel design come from? I cannot seem to find a reference anywhere.
Gracias!
Helmut
★★★

Vienna, Austria,
2019-09-14 00:16

@ Rocco_M
Posting: # 20596
Views: 2,429

## t-test & Welch-test

Hi Rocco,

» Where does the formula on slide 10.83 in bebac.at/lectures/Leuven2013WS2.pdf for CI for parallel design come from? I cannot seem to find reference anywhere.

Honestly, I don’t remember why I simplified the commonly used formula.
Algebra:$$s\sqrt{\tfrac{n_1+n_2}{n_1n_2}}=\sqrt{s^2(1/n_1+1/n_2)}\;\tiny{\square}$$ Comparison with the data of the example.
• Like in the presentation:

```r
mean.log <- function(x) mean(log(x), na.rm = TRUE)
T      <- c(100, 103, 80, 110,  78, 87, 116, 99, 122, 82, 68,  NA)
R      <- c(110, 113, 96,  90, 111, 68, 111, 93,  93, 82, 96, 137)
n1     <- sum(!is.na(T))
n2     <- sum(!is.na(R))
s1.2   <- var(log(T), na.rm = TRUE)
s2.2   <- var(log(R), na.rm = TRUE)
s0.2   <- ((n1 - 1) * s1.2 + (n2 - 1) * s2.2) / (n1 + n2 - 2)
s0     <- sqrt(s0.2)
nu.1   <- n1 + n2 - 2
nu.2   <- (s1.2 / n1 + s2.2 / n2)^2 /
          (s1.2^2 / (n1^2 * (n1 - 1)) + s2.2^2 / (n2^2 * (n2 - 1)))
t.1    <- qt(p = 1 - 0.05, df = nu.1)
t.2    <- qt(p = 1 - 0.05, df = nu.2)
PE.log <- mean.log(T) - mean.log(R)
CI.1   <- PE.log + c(-1, +1) * t.1 * s0 * sqrt((n1 + n2) / (n1 * n2))
CI.2   <- PE.log + c(-1, +1) * t.2 * sqrt(s1.2 / n1 + s2.2 / n2)
CI.t   <- 100 * exp(CI.1)
CI.w   <- 100 * exp(CI.2)
fmt    <- "%.3f %.3f %.3f %.2f    %.2f   %.2f"
cat("    method     df mean.T mean.R    PE CL.lower CL.upper",
    "\n    t-test",
    sprintf(fmt, nu.1, exp(mean.log(T)), exp(mean.log(R)),
            100 * exp(PE.log), CI.t[1], CI.t[2]),
    "\nWelch-test",
    sprintf(fmt, nu.2, exp(mean.log(T)), exp(mean.log(R)),
            100 * exp(PE.log), CI.w[1], CI.w[2]), "\n")
```

```
    method     df mean.T mean.R    PE CL.lower CL.upper
    t-test 21.000 93.554 98.551 94.93    83.28   108.20
Welch-test 20.705 93.554 98.551 94.93    83.26   108.23
```

• More comfortable with the t.test() function, where var.equal = TRUE gives the t-test and var.equal = FALSE the Welch-test:
```r
res <- data.frame(method = c("t-test", "Welch-test"), df = NA,
                  mean.T = NA, mean.R = NA, PE = NA,
                  CL.lower = NA, CL.upper = NA)
var.equal <- c(TRUE, FALSE)
for (j in 1:2) {
  x <- t.test(x = log(T), y = log(R), conf.level = 0.90,
              var.equal = var.equal[j])
  res[j, 2]   <- signif(x[[2]], 5)
  res[j, 3:4] <- signif(exp(x[[5]][1:2]), 5)
  res[j, 5]   <- round(100 * exp(diff(x[[5]][2:1])), 2)
  res[j, 6:7] <- round(100 * exp(x[[4]]), 2)
}
print(res, row.names = FALSE)
```

```
    method     df mean.T mean.R    PE CL.lower CL.upper
    t-test 21.000 93.554 98.551 94.93    83.28   108.20
Welch-test 20.705 93.554 98.551 94.93    83.26   108.23
```
The formula for Satterthwaite’s approximation of the degrees of freedom given on slide 11 contained typos (corrected in the meantime). Of course, $$\nu\approx\frac{\left(\frac{{s_{1}}^{2}}{n_1}+\frac{{s_{2}}^{2}}{n_2}\right)^2}{\frac{{s_{1}}^{4}}{n{_{1}}^{2}(n_1-1)}+\frac{{s_{2}}^{4}}{n{_{2}}^{2}(n_2-1)}}$$ Satterthwaite’s approximation adjusts for both unequal variances and unequal group sizes. The conventional t-test is fairly robust against the former but less so against the latter.
In the R function t.test() var.equal = FALSE is the default because:
• Any pre-test is bad practice (might inflate the type I error) and has low power esp. for small sample sizes.
• If $${s_{1}}^{2}={s_{2}}^{2}\;\wedge\;n_1=n_2$$, the formula given above reduces to the simple $$\nu=n_1+n_2-2$$ anyhow.
• In all other cases the Welch-test is conservative, which is a desirable property.
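The reduction claimed in the second bullet is quick to verify numerically (the variance and group size below are arbitrary):

```r
# With s1^2 = s2^2 and n1 = n2, Satterthwaite's df collapse to n1 + n2 - 2
n1 <- n2 <- 12
s1.2 <- s2.2 <- 0.09
nu <- (s1.2 / n1 + s2.2 / n2)^2 /
      (s1.2^2 / (n1^2 * (n1 - 1)) + s2.2^2 / (n2^2 * (n2 - 1)))
```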

Cheers,
Helmut Schütz

Rocco_M
☆

Mexico,
2019-09-17 22:02
(edited by Rocco_M on 2019-09-17 23:43)

@ Helmut
Posting: # 20608
Views: 2,225

## sampleN.TOST and CI.BE

Okay gracias. That was a brain fart on my end. I now see that it is equivalent to 1/n1 + 1/n2.

One last question. If I use sampleN.TOST() and input

```r
sampleN.TOST(CV = 0.3, theta0 = 1.0, theta1 = 0.8, theta2 = 1.25,
             logscale = TRUE, alpha = 0.05, targetpower = 0.9,
             design = "parallel")
```

I get a minimum total sample size of 78.

But then, if I run CI.BE(pe = 1.0, CV = 0.3, design = "parallel", n = 24),
I get CI ≈ [0.81, 1.23].

This confuses me. Shouldn’t I need to enter an n of at least 78 in order to get a CI within [0.8, 1.25]?

Maybe I am confusing concepts. In other words, what are the implications if a study *meets* bioequivalence but is underpowered?
Helmut
★★★

Vienna, Austria,
2019-09-18 11:04

@ Rocco_M
Posting: # 20609
Views: 2,195

## assumptions vs. realizations

Hi Rocco,

» sampleN.TOST(CV=.3, theta0=1.0, theta1= 0.8, theta2=1.25, logscale=TRUE, alpha=0.05, targetpower=0.9, design="parallel")
» I get a minimum total sample size of 78.

Correct.1

» But then if I run CI.BE(pe=1.0, CV=.3, design="parallel", n=24)
» I get CI = approx [0.81, 1.23].

Correct again.

» This confuses me. Shouldn’t I need to enter an n of at least 78 in order to get a CI within [0.8, 1.25]?

Nope. CV and theta0 in sampleN.TOST() are assumptions (before the study), whereas CV and pe in CI.BE() are realizations (observations in the study). I’m not a friend of post hoc (a posteriori, retrospective) power but let’s check that:

```r
library(PowerTOST)
power.TOST(CV = 0.3, theta0 = 1, n = 24, design = "parallel")
# [1] 0.1597451
```

That’s much lower than the 0.9 you targeted and what you would have got with 78 subjects. Try this one:

```r
n      <- c(24, 78)
theta0 <- seq(0.80, 1.25, length.out = 201)
res    <- data.frame(n = rep(n, each = length(theta0)),
                     theta0 = rep(theta0, length(n)),
                     power = NA)
j      <- 0
for (k in seq_along(n)) {
  for (l in seq_along(theta0)) {
    j <- j + 1
    res$power[j] <- power.TOST(CV = 0.3, n = res$n[j],
                               theta0 = res$theta0[j], design = "parallel")
  }
}
plot(theta0, rep(1, length(theta0)), type = "n", ylim = c(0, 1), log = "x",
     yaxs = "i", xlab = "theta0", ylab = "power", las = 1)
grid()
abline(h = c(0.05, 0.9), col = "lightgrey", lty = 2)
box()
col <- c("red", "blue")
for (k in seq_along(n)) {
  lines(x = theta0, y = res$power[res$n == n[k]], lwd = 3, col = col[k])
  text(x = 1, y = max(res$power[res$n == n[k]]), pos = 3,
       labels = signif(max(res$power[res$n == n[k]]), 4))
}
legend("topright", legend = rev(paste("n =", n)),
       bg = "white", col = rev(col), lwd = 3)
```

» In other words, what are the implications if a study *meets* bioequivalence but is underpowered?

None. Sample size estimation is always based on assumptions. If they turn out to be wrong (higher CV, PE worse than theta0, more dropouts than anticipated), you might still meet BE by luck. As ElMaestro once wrote:
Being lucky is not a crime.

But any confirmatory study (like BE) requires an appropriate sample size estimation. There are two problems which should already lead to rejection of the protocol by the IEC.
1. Suppose you assume a CV of 0.3 and a theta0 of 1 (I would not be that optimistic) and suggest a sample size of 24 in the protocol. You can also try that in FARTSSIE, only to be told that
“90% power not attainable even if the means are equal. The highest attainable power is 12.7%.”2
Not a good idea.
2. Not in your case of a new drug, but relevant if the IEC knows CVs from other studies: don’t try to cheat (“assume” a much lower one, i.e., 0.16 instead of 0.30, in order to end up with just 24 subjects).

1. See the man-pages of the functions in PowerTOST. In many cases there is no need to specify the defaults (alpha = 0.05, theta1 = 0.8, theta2 = 1.25, logscale = TRUE). Makes your life easier.
2. Calculated by the non-central t-distribution. You can confirm that when you run power.TOST(..., method = "nct").

Cheers,
Helmut Schütz
