Dear all!

Elena777, I guess I misunderstood something. Did you mean a posteriori power or interim power? If the interim power is 30%, proceed to the next step of the decision tree.

Helmut, what were the final conclusions on post --> 17971?

Are there any suggestions on how to deal with two metrics in adaptive trials?

For example: a) Let us consider a Type II design: first step – estimated power is less than the target (80%) for C_{max} …

1) We calculate the 100(1−2α)% …

2) Suppose further we go to the 2^{nd} …

Another example: b) First step – estimated power is less than the target (80%) for both metrics and the adjusted-level CI is outside the range. Should we use the largest observed CV to calculate the total sample size? Would the study be overpowered for the second PK metric? Would it affect the TIE?

To conclude: what is the best strategy to follow in this situation in order to avoid inflation of the TIE and loss of power?

(Some mad idea: is it possible to make some hybrid monster to combine both C …

What ANOVA model should be used for the second stage? By the way, what about R code for the full decision tree? ;-)

Hi Rocco,

Honestly, I don’t remember …

Algebra:$$s\sqrt{\tfrac{n_1+n_2}{n_1n_2}}=\sqrt{s^2(1/n_1+1/n_2)}\;\tiny{\square}$$ Comparison with the data of the example.
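The identity is easy to check numerically as well; a quick sketch (in Python rather than R, with arbitrary illustrative values, not the data of the example):

```python
import math

# Arbitrary illustrative values (not taken from the example below)
s, n1, n2 = 0.25, 11, 12

lhs = s * math.sqrt((n1 + n2) / (n1 * n2))   # left-hand side
rhs = math.sqrt(s**2 * (1 / n1 + 1 / n2))    # right-hand side

assert math.isclose(lhs, rhs)                # both forms of the SE agree
```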

- Like in the presentation:

```r
mean.log <- function(x) mean(log(x), na.rm = TRUE)
T  <- c(100, 103, 80, 110, 78, 87, 116, 99, 122, 82, 68, NA)
R  <- c(110, 113, 96, 90, 111, 68, 111, 93, 93, 82, 96, 137)
n1 <- sum(!is.na(T))
n2 <- sum(!is.na(R))
s1.2 <- var(log(T), na.rm = TRUE)
s2.2 <- var(log(R), na.rm = TRUE)
s0.2 <- ((n1 - 1) * s1.2 + (n2 - 1) * s2.2) / (n1 + n2 - 2)
s0   <- sqrt(s0.2)
nu.1 <- n1 + n2 - 2
nu.2 <- (s1.2 / n1 + s2.2 / n2)^2 /
        (s1.2^2 / (n1^2 * (n1 - 1)) + s2.2^2 / (n2^2 * (n2 - 1)))
t.1  <- qt(p = 1 - 0.05, df = nu.1)
t.2  <- qt(p = 1 - 0.05, df = nu.2)
PE.log <- mean.log(T) - mean.log(R)
CI.1 <- PE.log + c(-1, +1) * t.1 * s0 * sqrt((n1 + n2) / (n1 * n2))
CI.2 <- PE.log + c(-1, +1) * t.2 * sqrt(s1.2 / n1 + s2.2 / n2)
CI.t <- 100 * exp(CI.1)
CI.w <- 100 * exp(CI.2)
fmt  <- "%.3f %.3f %.3f %.2f %.2f %.2f"
cat("    method     df mean.T mean.R    PE CL.lower CL.upper",
    "\n    t-test",
    sprintf(fmt, nu.1, exp(mean.log(T)), exp(mean.log(R)),
            100 * exp(PE.log), CI.t[1], CI.t[2]),
    "\nWelch-test",
    sprintf(fmt, nu.2, exp(mean.log(T)), exp(mean.log(R)),
            100 * exp(PE.log), CI.w[1], CI.w[2]), "\n")
```

```
    method     df mean.T mean.R    PE CL.lower CL.upper
    t-test 21.000 93.554 98.551 94.93    83.28   108.20
Welch-test 20.705 93.554 98.551 94.93    83.26   108.23
```

- More comfortable with the `t.test()` function, where `var.equal = TRUE` gives the *t*-test and `var.equal = FALSE` the Welch-test:

```r
res <- data.frame(method = c("t-test", "Welch-test"), df = NA,
                  mean.T = NA, mean.R = NA, PE = NA,
                  CL.lower = NA, CL.upper = NA)
var.equal <- c(TRUE, FALSE)
for (j in 1:2) {
  x <- t.test(x = log(T), y = log(R), conf.level = 0.90,
              var.equal = var.equal[j])
  res[j, 2]   <- signif(x[[2]], 5)
  res[j, 3:4] <- signif(exp(x[[5]][1:2]), 5)
  res[j, 5]   <- round(100 * exp(diff(x[[5]][2:1])), 2)
  res[j, 6:7] <- round(100 * exp(x[[4]]), 2)
}
print(res, row.names = FALSE)
```

```
    method     df mean.T mean.R    PE CL.lower CL.upper
    t-test 21.000 93.554 98.551 94.93    83.28   108.20
Welch-test 20.705 93.554 98.551 94.93    83.26   108.23
```

In `t.test()`, `var.equal = FALSE` is the default because:

- Any pre-test is bad practice (it might inflate the type I error) and has low power, especially for small sample sizes.

- If \({s_{1}}^{2}={s_{2}}^{2}\;\wedge\;n_1=n_2\), the formula given above reduces to the simple \(\nu=n_1+n_2-2\) anyhow.

- In all other cases the Welch-test is conservative, which is a desirable property.
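For illustration, the Welch–Satterthwaite degrees of freedom from the formula above can be coded directly; with equal variances and equal group sizes it indeed collapses to \(n_1+n_2-2\). A sketch in Python (`welch_df` is a hypothetical helper name, not part of any library):

```python
def welch_df(s1_2: float, s2_2: float, n1: int, n2: int) -> float:
    """Welch–Satterthwaite approximation of the degrees of freedom."""
    num = (s1_2 / n1 + s2_2 / n2) ** 2
    den = (s1_2**2 / (n1**2 * (n1 - 1)) +
           s2_2**2 / (n2**2 * (n2 - 1)))
    return num / den

# Equal variances and equal group sizes: reduces to n1 + n2 - 2
print(welch_df(0.04, 0.04, 12, 12))   # 22.0
# Unequal variances/sizes: lies between min(n1, n2) - 1 and n1 + n2 - 2
print(welch_df(0.06, 0.03, 12, 11))
```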

Thank you so much, I think it makes sense. Just one last question:

Where does the formula on slide 10.83 of bebac.at/lectures/Leuven2013WS2.pdf for the CI in a parallel design come from? I cannot seem to find a reference anywhere.

Gracias!

Oller

For the ones who don’t know Latin (mine is extremely rusty): “Who will guard the guards themselves?”

Quis custodiet ipsos custodes?

…this problem is not really new. :-)

Hi Yung-jin,

You know what …

See my example of “almost error-free software” at the end of this post --> 1928. Out of reach for us.

Hi amer,

Right, it’s really a great idea. Software validation is indeed a must and absolutely a good thing. However, I was wondering: shouldn’t you validate the RIVIVC package first? Suppose RIVIVC passes the validation – should we then use RIVIVC to validate Wagner–Nelson (implemented in ivivc for R)? To me, it sounds like using an apple to validate an orange. This report may provide a good start or some ideas. Finally, since software validation (either black box or white box) largely relies on computer and human calculation processes, should we also consider how to validate the software validation procedures themselves in the future?

It has been on my to-do list for a long time. I apologize for the delay. Anyway, this is a great idea too.

I may be able to validate Wagner–Nelson in ivivc for R (assumes a 1-compartment model) against numerical deconvolution/convolution using the RIVIVC package (no assumption about the model structure), but there aren’t enough hours in the day. Will do at some stage.

Suggestion: ivivc for R could be updated with a numerical deconvolution/convolution option, because if the disposition is not 1-compartment, Wagner–Nelson is limited.

Hi Elena,

THX for the information. Sections 97/98 of the EEU regulations are a 1:1 translation of the corresponding section about TSDs in the EMA’s BE-GL.

It is difficult to predict how regulators of the EEU …

My ranking (based on `Power2Stage` with 1 mio simulations at `theta0=1.25`). When I give locations of the maximum TIE, it is based on a much narrower grid (n_{1}, CV) than in the publications.

- Potvin *et al.* “Method B”^{1}

According to the wording of the GL: *“… both analyses [should be] conducted at adjusted significance levels …”*

Maximum inflation of the TIE: 0.0490 (at n_{1} = 12 and CV = 24%). Hence, the adjusted α = 0.0294 is conservative.

```r
power.tsd(method="B", alpha=c(0.0294, 0.0294), CV=0.24,
          n1=12, theta0=1.25, nsims=1e6)[["pBE"]]
# [1] 0.048762
```

However, there is no inflation of the TIE with a slightly more liberal α = 0.0302.

```r
power.tsd(method="B", alpha=c(0.0302, 0.0302), CV=0.24,
          n1=12, theta0=1.25, nsims=1e6)[["pBE"]]
# [1] 0.049987
```

- Karalis “TSD-2”^{2}

Futility rule for the total sample size of 150. No inflation of the TIE. Compare with the Potvin B above (0.048762):

```r
power.tsd.KM(method="B", alpha=c(0.0294, 0.0294), CV=0.24,
             n1=12, theta0=1.25, Nmax=150, nsims=1e6)[["pBE"]]
# [1] 0.041874
```

However, power may be negatively affected^{3,4} and total sample sizes are sometimes even larger. Comparison:

```r
CV     <- 0.25
n1     <- 14
alpha  <- c(0.0294, 0.0294)
theta0 <- 0.95
res <- data.frame(method=c("Potvin B", "KM TSD-2"), power=NA,
                  N.min=NA, perc.5=NA, N.med=NA, perc.95=NA,
                  N.max=NA, stringsAsFactors=FALSE)
for (j in 1:2) {
  if (j == 1) {
    x <- power.tsd(method="B", alpha=alpha, CV=CV, n1=n1, theta0=theta0,
                   Nmax=Inf)
  } else {
    x <- power.tsd.KM(method="B", alpha=alpha, CV=CV, n1=n1, theta0=theta0,
                      Nmax=150)
  }
  res[j, "power"] <- x[["pBE"]]
  res[j, "N.min"] <- x[["nrange"]][1]
  res[j, 4:6]     <- x[["nperc"]]
  res[j, "N.max"] <- x[["nrange"]][2]
}
names(res)[c(3:7)] <- c("N min", "N 5%", "N med", "N 95%", "N max")
print(res, row.names=FALSE)
```

```
   method   power N min N 5% N med N 95% N max
 Potvin B 0.82372    14   14    30    58   110
 KM TSD-2 0.79893    14   14    32   106   150
```

- Karalis “TSD-1”^{2}

As above, but with a decision scheme similar to Potvin C and α = 0.0280.

```r
power.tsd.KM(method="C", alpha=c(0.0280, 0.0280), CV=0.22,
             n1=12, theta0=1.25, Nmax=150, nsims=1e6)[["pBE"]]
# [1] 0.041893
```

Compare to the TIE below.

- Potvin *et al.* “Method C”^{1}

Ignoring the sentence of the GL mentioned at #1 above and concentrating on *“… there are many acceptable alternatives and the choice of how much alpha to spend at the interim analysis is at the company’s discretion.”*

With the adjusted α = 0.0294 there is a maximum inflation of the TIE of 0.0514 (at n_{1} = 12 and CV = 22%).

```r
power.tsd(method="C", alpha=c(0.0294, 0.0294), CV=0.22,
          n1=12, theta0=1.25, nsims=1e6)[["pBE"]]
# [1] 0.051426
```

However, there is no inflation of the TIE for any CV with n_{1} ≥ 18.

If you want to go with Method C, I suggest a more conservative adjusted α 0.0280.

```r
power.tsd(method="C", alpha=c(0.0280, 0.0280), CV=0.22,
          n1=12, theta0=1.25, nsims=1e6)[["pBE"]]
# [1] 0.049669
```

- Xu *et al.* “Method E”, “Method F”^{5}

More powerful than the original methods of the same group of authors, since two CV-ranges are considered. “Method E” is an extension of “Method B”, and “Method F” of “Method C”. Both have different alphas in the stages, a futility rule based on the 90% CI, and a maximum sample size (though *not* as a futility criterion). A slight mis-specification of the CV (say, you assumed a CV of 25% and it turns out to be 35%) still controls the TIE.

- “Method E”
  CV 10–30%: adjusted α 0.0249 / 0.0363, min. n_{1} 18, max. n 42, CI within {0.9374, 1.0667}
  CV 30–55%: adjusted α 0.0254 / 0.0357, min. n_{1} 48, max. n 180, CI within {0.9305, 1.0747}
- “Method F”
  CV 10–30%: adjusted α 0.0248 / 0.0364, min. n_{1} 18, max. n 42, CI within {0.9492, 1.0535}
  CV 30–55%: adjusted α 0.0259 / 0.0349, min. n_{1} 48, max. n 180, CI within {0.9350, 1.0695}

```r
power.tsd.fC(method="B", alpha=c(0.0249, 0.0363), CV=0.30, n1=18,
             fCrit="CI", fClower=0.9374, max.n=42, theta0=1.25,
             nsims=1e6)[["pBE"]] # Method E (low CV)
# [1] 0.048916
power.tsd.fC(method="B", alpha=c(0.0254, 0.0357), CV=0.55, n1=48,
             fCrit="CI", fClower=0.9305, max.n=180, theta0=1.25,
             nsims=1e6)[["pBE"]] # Method E (high CV)
# [1] 0.045969
power.tsd.fC(method="C", alpha=c(0.0248, 0.0364), CV=0.30, n1=18,
             fCrit="CI", fClower=0.9492, max.n=42, theta0=1.25,
             nsims=1e6)[["pBE"]] # Method F (low CV)
# [1] 0.049194
power.tsd.fC(method="C", alpha=c(0.0259, 0.0349), CV=0.55, n1=48,
             fCrit="CI", fClower=0.9350, max.n=180, theta0=1.25,
             nsims=1e6)[["pBE"]] # Method F (high CV)
# [1] 0.045471
```

- Maurer *et al.*^{6}

The only approach *not* based on simulations, and seemingly preferred by the EMA.

It is the most flexible method because you can specify futility rules on the CI, achievable total power, and maximum total sample size. Furthermore, you can base the decision to proceed to the second stage on the PE observed in the first stage (OK, this is supported by the functions of `Power2Stage` as well, but not in the published methods – you would have to perform your own simulations). Example:

```r
power.tsd.in(CV=0.24, n1=12, theta0=1.25, fCrit="No",
             ssr.conditional="no", nsims=1e6)[["pBE"]]
# [1] 0.04642
```

Let us compare the method with the data of Example 2 given by Potvin *et al.* Note that in this method you perform separate ANOVAs, one in the interim and one in the final analysis. In Example 2 we had 12 subjects in stage 1 and, with both methods, a second stage with 8 subjects. The final PE was 101.45% with a 94.12% CI of 88.45–116.38%. I switched off the futility criteria and kept all other defaults.

```r
interim.tsd.in(GMR1=1.0876, CV1=0.18213, n1=12,
               fCrit="No", ssr.conditional="no")
```

```
TSD with 2x2 crossover
Inverse Normal approach
 - Maximum combination test with weights for stage 1 = 0.5 0.25
 - Significance levels (s1/s2) = 0.02635 0.02635
 - Critical values (s1/s2) = 1.9374 1.9374
 - BE acceptance range = 0.8 ... 1.25
 - Observed point estimate from stage 1 is not used for SSR
 - Without conditional error rates and conditional (estimated target) power

Interim analysis after first stage
- Derived key statistics:
  z1 = 3.10000, z2 = 1.70344,
  Repeated CI = (0.92491, 1.27891)
- No futility criterion met
- Test for BE not positive (not considering any futility rule)
- Calculated n2 = 8
- Decision: Continue to stage 2 with 8 subjects
```

Similar outcome: not BE, and a second stage with 8 subjects.

```r
final.tsd.in(GMR1=1.0876, CV1=0.18213, n1=12,
             GMR2=0.9141, CV2=0.25618, n2=8)
```

```
TSD with 2x2 crossover
Inverse Normal approach
 - Maximum combination test with weights for stage 1 = 0.5 0.25
 - Significance levels (s1/s2) = 0.02635 0.02635
 - Critical values (s1/s2) = 1.93741 1.93741
 - BE acceptance range = 0.8 ... 1.25

Final analysis after second stage
- Derived key statistics:
  z1 = 2.87952, z2 = 2.60501,
  Repeated CI = (0.87690, 1.17356)
  Median unbiased estimate = 1.0135
- Decision: BE achieved
```

Passed BE as well: PE 101.35% with a 94.73% CI of 87.69–117.36%.

Acceptance in Belarus & Russia – no idea. Might well be that their experts never have seen such a study before.

- Maurer *et al.*
- Xu *et al.* “Method F”
- Xu *et al.* “Method E”
- Potvin *et al.* “Method C” (modified α 0.0280)
- Potvin *et al.* “Method B” (modified α 0.0302)
- Potvin *et al.* “Method C” (original α 0.0294)
- Potvin *et al.* “Method B” (original α 0.0294)
- Karalis “TSD-1”
- Karalis “TSD-2”

If you performed the second stage, it’s mandatory to pool the data. None of the methods contains …

Nothing in the guidelines, but it is mentioned in the EMA’s Q&A document. However, that’s superfluous: if you perform a sample size estimation, in all software the minimum stage 2 sample size will be 2 anyhow (if odd, rounded up to the next even number to obtain balanced sequences). In the functions of `Power2Stage` you can use the argument `min.n2=2`. Only if you are a nerd, read the next paragraph. :-D
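The “if odd, round up to the next even number” rule is trivial to express; a sketch in Python (`stage2_n` is a hypothetical helper name, not part of `Power2Stage`):

```python
def stage2_n(n2_est: int, min_n2: int = 2) -> int:
    """Enforce the minimum stage 2 sample size and round an odd
    estimate up to the next even number (balanced sequences)."""
    n2 = max(n2_est, min_n2)
    return n2 + (n2 % 2)  # add 1 if odd

print(stage2_n(7))  # 8
print(stage2_n(1))  # 2
print(stage2_n(8))  # 8
```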

The conventional sample size estimation does not take the stage term in the final analysis into account. If you prefer belt plus suspenders, use the function `sampleN2.TOST()`.

```r
CV <- 0.25
n1 <- 12
res <- data.frame(method=c("PowerTOST::sampleN.TOST()",
                           "Power2Stage::sampleN2.TOST()"),
                  n1=n1, n2=NA, power=NA, stringsAsFactors=FALSE)
for (j in 1:2) {
  if (j == 1) {
    x <- PowerTOST::sampleN.TOST(alpha=0.0294, CV=CV, print=FALSE)[7:8]
    res[j, 3] <- x[1] - n1
  } else {
    x <- Power2Stage::sampleN2.TOST(alpha=0.0294, CV=CV, n1=n1)[8:9]
    res[j, 3] <- x[1]
  }
  res[j, "power"] <- x[["Achieved power"]]
}
print(res, row.names=FALSE)
```

```
                       method n1 n2     power
    PowerTOST::sampleN.TOST() 12 22 0.8127230
 Power2Stage::sampleN2.TOST() 12 22 0.8141106
```

In practice, it is unlikely to get a difference in sample sizes …

In #5 the minimum is 4, because you perform a separate ANOVA in the second stage. One word of caution: if you have a nasty drug (dropouts due to AEs), take care that you don’t end up with <3 subjects – otherwise the ANOVA would not be possible.

In designing a study I recommend calling the functions with the arguments `theta0` and `CV`, which are your best guesses. Don’t confuse that with the argument `GMR`, which is used in the sample size estimation …

```r
CV <- 0.25
n1 <- 0.8*PowerTOST::sampleN.TOST(CV=CV, print=FALSE)[["Sample size"]]
n1 <- ceiling(n1 + ceiling(n1) %% 2)
lo <- ceiling(1.5*n1 + ceiling(1.5*n1) %% 2)
hi <- ceiling(3*n1 + ceiling(3*n1) %% 2)
Nmax <- c(seq(lo, hi, 4), Inf)
res  <- data.frame(Nmax=Nmax, power=NA)
for (j in seq_along(Nmax)) {
  res$power[j] <- Power2Stage::power.tsd(CV=CV, n1=n1,
                                         Nmax=Nmax[j])[["pBE"]]
}
print(res, row.names=FALSE)
```

```
 Nmax   power
   36 0.70564
   40 0.74360
   44 0.77688
   48 0.80214
   52 0.81854
   56 0.82976
   60 0.83596
   64 0.83957
   68 0.84156
   72 0.84153
  Inf 0.84244
```

A futility rule of 48 looks good. Let’s explore the details:

```r
Power2Stage::power.tsd(CV=CV, n1=n1, Nmax=48)
```

```
TSD with 2x2 crossover
Method B: alpha (s1/s2) = 0.0294 0.0294
Target power in power monitoring and sample size est. = 0.8
Power calculation via non-central t approx.
CV1 and GMR = 0.95 in sample size est. used
Futility criterion Nmax = 48
BE acceptance range = 0.8 ... 1.25

CV = 0.25; n(stage 1) = 24; GMR = 0.95

1e+05 sims at theta0 = 0.95 (p(BE) = 'power').

p(BE)    = 0.80214
p(BE) s1 = 0.63203
Studies in stage 2 = 29.06%

Distribution of n(total)
- mean (range) = 27.6 (24 ... 48)
- percentiles
 5% 50% 95%
 24  24  44
```

Not that bad.

However, futility rules can be counterproductive because you have to come up with a “best guess” CV – which is actually against the “spirit” of TSDs. Homework:

```r
Power2Stage::power.tsd(CV=0.30, n1=24, Nmax=48)
```

:thumb down: As ElMaestro wrote above --> 20565, you have to perform your own simulations if you are outside the published methods (GMR, target power, n_{1} …).

A final reminder: in the sample size estimation use the fixed `GMR` (not the observed one), unless the method allows that.

1. Potvin D, DiLiberti CE, Hauck WW, Parr AF, Schuirmann DJ, Smith RA. *Sequential design approaches for bioequivalence studies with crossover designs.* Pharm Stat. 2008; 7(4): 245–62. doi:10.1002/pst.294.

2. Karalis V. *The role of the upper sample size limit in two-stage bioequivalence designs.* Int J Pharm. 2013; 456: 87–94. doi:10.1016/j.ijpharm.2013.08.013.

3. Fuglsang A. *Futility Rules in Bioequivalence Trials with Sequential Designs.* AAPS J. 2014; 16(1): 79–82. doi:10.1208/s12248-013-9540-0.

4. Schütz H. *Two-stage designs in bioequivalence trials.* Eur J Clin Pharmacol. 2015; 71(3): 271–81. doi:10.1007/s00228-015-1806-2.

5. Xu J, Audet C, DiLiberti CE, Hauck WW, Montague TH, Parr TH, Potvin D, Schuirmann DJ. *Optimal adaptive sequential designs for crossover bioequivalence studies.* Pharm Stat. 2016; 15(1): 15–27. doi:10.1002/pst.1721.

6. Maurer W, Jones B, Chen Y. *Controlling the type 1 error rate in two-stage sequential designs when testing for average bioequivalence.* Stat Med. 2018; 37(10): 1587–607. doi:10.1002/sim.7614.

7. Molins E, Cobo E, Ocaña J. *Two-stage designs versus European scaled average designs in bioequivalence studies for highly variable drugs: Which to choose?* Stat Med. 2017; 36(30): 4777–88. doi:10.1002/sim.7452.

Hi all,

If in doubt use Heparin in the validation and in the study vacutainers. Problem solved.

Dear Helmut and ElMaestro,

Thank you very much for responding so quickly. We are planning to submit applications in Belarus (the country I’m from) and in Russia. Hope Method C will be OK for them.

Another 2 questions:

1. Is it a good idea to add the statement that, in case of conducting stage 2, data from both stages will be pooled by default, without evaluation of differences between the stages?

2. Is there any established minimum number of subjects that should be included in stage 2 (e.g. at least 2, or at least 1 per sequence (TR/RT))?

Heparin is a polyanion, so: calculate the amount instilled, the volume of fresh blood used to “flush” this amount into the blood-collecting device, and the resulting concentration in the blood/plasma sample. Use common sense to assess these numbers.

If in doubt (due to the physico-chemical properties of the analyte), do some in vitro studies with fresh blood/heparin.
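That back-of-the-envelope calculation could look like this (Python; every number below is a made-up assumption for illustration, not a recommendation):

```python
# All values are assumptions for illustration – replace with your own
lock_conc     = 100.0  # IU/mL heparin in the catheter-lock solution
lock_volume   = 0.2    # mL instilled into the catheter
flush_volume  = 5.0    # mL of fresh blood drawn to flush before sampling
sample_volume = 4.0    # mL of blood per collection tube

amount_instilled = lock_conc * lock_volume   # IU sitting in the lock
# Worst case: the entire lock volume ends up in the flush + sample
worst_case_conc = amount_instilled / (flush_volume + sample_volume)
print(f"Worst-case heparin in the sample: {worst_case_conc:.2f} IU/mL")
```

Compare the resulting worst-case concentration with whatever level could plausibly interfere with your assay.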

Hi Monic,

This is actually a pretty good question, in my opinion. I don’t have a good answer, unfortunately. I can only offer to ramble a bit.

You give a tiny amount of Hep in the catheter with the intent to keep it exactly there. No Hep is intended to be injected/infused into the blood stream. Yet, there will of course be some diffusion.

Although the level of circulatory Hep will be low, who knows if they could interfere?

I have seen on several occasions how catheters may get clogged. In those cases the phlebotomist may do multiple push/pulls with Hep solution to clear the passage (and the subject will flinch in pain, but that is another story).

In those cases, surely you will get much more Hep into the blood stream. Will you get enough to interfere? My gut feeling says “no”, but who am I to actually qualify that? It may even depend on the type of Hep, who knows?!

There is other stuff in the Hep solution, most often glucose, but sometimes there is a gelling agent, because this solution has to be much like a marmalade to serve its purpose. So what about that gelling agent?

Technicality: Hep is not infused during sampling. It is only instilled in the catheter. This, I assume, is also what you meant.

Hi everyone,

Is it required to evaluate the interaction of heparin with your analyte of interest (as part of the concomitant-medication effect) during bioanalytical method validation?

Since heparin is infused during blood collection to prevent the formation of blood clots in the clinical phase.

Note: The anticoagulant used for the blood collection in vacutainer is K2EDTA.

Thanks

Edit: Category changed; see also this post #1 --> 16205. [Helmut]

Hi Yung-jin,

:lol3: I’m always a little bit skeptical when one tries to validate one’s own code (including f**ing expensive “validation packs” sold by vendors of commercial software). Hence, we have to find …

Dear Helmut,

Indeed. So I don’t think I am qualified to do that… Thank God. :-D

Hi amer,

Nice question. The oral bioavailability term ( …

Hi Lee,

Thanks very much. Yes, it is; however, I don’t see the oral bioavailability term in the [PCp] equation in the R code provided. Shouldn’t that include bioavailability when doing convolution? Does it assume, in its current form, that the bioavailability is 100%?

Unfortunately, I don’t have access to WLN.

Cheers,

Hi Helmut,

The probability is close to nil due to a nightmare called deconvolution …

Hi Yung-jin and amer,

Since the source code of `ivivc for R` is accessible, one can perform a white-box validation. It requires an expert R coder who is also knowledgeable about the scientific background …

@Yung-jin: Though Certara granted me a ‘named license’ for Phoenix’s ‘IVIVC Toolkit’, I never used it so far. Furthermore, my experience with `ivivc for R` is zero.