Bioequivalence and Bioavailability Forum • Estimation within-subject CV

Mikkabel
☆

Belgium,
2017-07-07 12:13
(2478 d 13:33 ago)

Posting: # 17515
Views: 7,952

Estimation within-subject CV [Power / Sample Size]

Dear all,

I need your expertise regarding the estimation of the CVintra. I am aware that the within-subject CV should be calculated from the sw² obtained in a fully (or partially) replicated design study, but my question is:

- Is it relevant to estimate the within-subject CV based on the MSE obtained from a classical crossover study (analysed according to the EMA requirements, e.g., period, formulation, and subject within sequence as fixed effects)? To be precise, it is just an estimate to get an idea whether it is relevant to perform a replicate design study or not.

I heard in a symposium that the CV obtained from the MSE should be a good estimator of the real within-subject CV. Could you confirm? (or not ;-) )

Thank you,
Kind regards,

Edit: Category changed; see also this post #1. [Helmut]

ElMaestro
★★★

Denmark,
2017-07-07 12:17
(2478 d 13:29 ago)

@ Mikkabel
Posting: # 17516
Views: 7,078

Estimation within-subject CV

Hi Mikkabel,

❝ I heard in a symposium that the CV obtained from the MSE should be a good estimator of the real within subject CV, could you confirm ? (or not )

In my experience the within-subject CV from a 222BE trial often gives you a pretty decent idea about the within-subject CV for Ref.

—
Pass or fail!
ElMaestro
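As a rough illustration of how a CV is obtained from the MSE of a 2×2×2 crossover and how uncertain that estimate is, here is a minimal R sketch. The helper names `mse2cv()` and `cv_upper()` and all input numbers are hypothetical and not taken from any study in this thread; the upper limit relies on the usual assumption that df·MSE/σ² follows a χ² distribution.

```r
# Point estimate and one-sided upper confidence limit of the within-subject CV
# from the residual mean square (MSE, ln-scale) of a 2x2x2 crossover.
# All inputs are hypothetical illustration values.
mse2cv <- function(mse) sqrt(exp(mse) - 1)

cv_upper <- function(mse, df, alpha = 0.2) {
  # df * MSE / sigma^2 is assumed to follow a chi-squared distribution (df)
  mse2cv(df * mse / qchisq(alpha, df))
}

n   <- 24        # subjects completing the 2x2x2 study (assumed)
df  <- n - 2     # residual degrees of freedom of the fixed-effects model
mse <- 0.0625    # residual mean square from the ANOVA (assumed)

c(CV = mse2cv(mse), CV.upper = cv_upper(mse, df))
```

With small studies the upper limit can be considerably larger than the point estimate, which is worth keeping in mind when the CV is close to 30%.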

Helmut
★★★

Vienna, Austria,
2017-07-07 18:09
(2478 d 07:36 ago)

@ Mikkabel
Posting: # 17519
Views: 7,207

CVwR ~ CVwT ~ CVw?

Hi Mikkabel,

❝ is it relevant to estimate the within subject CV based on the MSE obtained following a classical CO study […]? To be precise, it is just an estimate to get an idea whether it is relevant to perform a replicate design study or not.

❝

❝ I heard in a symposium that the CV obtained from the MSE should be a good estimator of the real within subject CV, could you confirm ? (or not ;-) )

I agree with ElMaestro. If you design a replicate study based on the CV_w of a 2×2×2 crossover you have to assume that CV_wT = CV_wR. In many cases the variabilities of T and R are “similar”. Essentially there are three possibilities:

CV_wT < CV_wR: Since with the EMA’s methods the 90% CI is derived from the pooled CV_w (the MSE from the same model as a 2×2×2) you will gain power.
CV_wT = CV_wR: Fine, as assumed. Desired power.
CV_wT > CV_wR: Here you will lose power, primarily because in the sample size estimation (based on CV_w which is biased upwards by CV_wT) you expected wider BE-limits.
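To make the three cases tangible, the following base-R sketch pools CV_wT and CV_wR on the variance scale, which is how the script further down in this post derives CV_w. The helper names `cv2mse()`, `mse2cv()`, and `pooled_cv()` are illustrative stand-ins (PowerTOST offers `CV2mse()` and `mse2CV()`), and CV_wR = 30% is just an example value.

```r
# Pooled within-subject CV from CVwT and CVwR (average of the ln-scale variances).
cv2mse    <- function(cv)  log(cv^2 + 1)
mse2cv    <- function(mse) sqrt(exp(mse) - 1)
pooled_cv <- function(CVwT, CVwR) mse2cv((cv2mse(CVwT) + cv2mse(CVwR)) / 2)

CVwR  <- 0.30                                    # example value
ratio <- c(2/3, 1, 3/2)                          # s_wT / s_wR
CVwT  <- mse2cv((ratio * sqrt(cv2mse(CVwR)))^2)  # CVwT implied by each ratio
res   <- data.frame(ratio, CVwR, CVwT, CVw = pooled_cv(CVwT, CVwR))
print(round(res, 4), row.names = FALSE)
```

A 2×2×2 crossover only gives the last column; the same CV_w is compatible with quite different combinations of CV_wT and CV_wR.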

It is difficult to judge whether variabilities are “different”. In the long gone ages of PBE/IBE they were considered “similar” if s_wT/s_wR was within ²∕₃ – ³∕₂. PBE/IBE was never implemented because high sample sizes are required for a meaningful (i.e., based on a CI) comparison of variabilities. Let’s have a look into the FDA’s reference-scaling for NTIDs which contains a comparison of variabilities (in addition to BE). At the upper cap for scaling at a CV_wR ~21.42% the sample size is essentially driven by the F-test (if CV_wT > CV_wR). Below a comparison with conventional ABE and ABEL for s_wT/s_wR-ratios of ²∕₃, 1, and ³∕₂. In case you speak R:

library(PowerTOST)
x        <- as.data.frame(matrix(NA, ncol=12, nrow=3))
names(x) <- c("CVwR", "CVwT", "CVw", "CV.r", "s.r", "n.RSABE",
              "n.ABE", "pw.ABE", "pw.ABE.n2",
              "n.ABEL", "pw.ABEL", "pw.ABEL.n2")
# work on column vectors (x$...) so that all arithmetic is done on plain numerics
x$s.r  <- c(2/3, 1, 3/2)
x$CVwR <- rep(se2CV(0.21179), 3)
x$CVwT <- se2CV(x$s.r*CV2se(x$CVwR))
x$CV.r <- x$CVwT/x$CVwR
x$CVw  <- mse2CV((CV2mse(x$CVwT)+CV2mse(x$CVwR))/2)
for (j in 1:3) { # loops for functions which don't vectorize
  # note: newer PowerTOST releases call this function sampleN.NTID()
  x[j, "n.RSABE"] <- sampleN.NTIDFDA(CV=c(x[j, "CVwT"], x[j, "CVwR"]),
                                     theta0=0.95, design="2x2x4",
                                     print=FALSE, details=FALSE)[["Sample size"]]
  y <- sampleN.TOST(CV=x[j, "CVw"], theta0=0.95, design="2x2x4",
                    print=FALSE, details=FALSE)
  x[j, "n.ABE"]   <- y[["Sample size"]]
  x[j, "pw.ABE"]  <- y[["Achieved power"]]
  y <- sampleN.scABEL(CV=c(x[j, "CVwT"], x[j, "CVwR"]), theta0=0.95,
                      design="2x2x4", print=FALSE, details=FALSE)
  x[j, "n.ABEL"]  <- y[["Sample size"]]
  x[j, "pw.ABEL"] <- y[["Achieved power"]]
}
for (j in 1:3) { # use the sample size for CVwT = CVwR (2nd row)
  x[j, "pw.ABE.n2"]  <- power.TOST(CV=x[j, "CVw"], theta0=0.95,
                                   design="2x2x4", n=x[2, "n.ABE"])
  x[j, "pw.ABEL.n2"] <- power.scABEL(CV=c(x[j, "CVwT"], x[j, "CVwR"]),
                                     theta0=0.95, design="2x2x4",
                                     n=x[2, "n.ABEL"])
}
print(round(x, 4), row.names=FALSE)

We get:

   CVwR   CVwT    CVw   CV.r    s.r n.RSABE n.ABE pw.ABE pw.ABE.n2 n.ABEL pw.ABEL pw.ABEL.n2
 0.2142 0.1419 0.1815 0.6625 0.6667      14     8 0.8262    0.9440      8  0.8355     0.9460
 0.2142 0.2142 0.2142 1.0000 1.0000      18    12 0.8626    0.8626     12  0.8663     0.8663
 0.2142 0.3259 0.2750 1.5214 1.5000      32    18 0.8413    0.6602     18  0.8424     0.6649

Higher sample sizes due to the comparison of variabilities. Equal sample sizes for ABE and the EMA’s ABEL since no scaling is allowed (CV_wR <30%).
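For readers wondering where the s_wR of 0.21179 in the script comes from: a short base-R sketch, assuming the FDA's NTID scaling with the regulatory constant σ_w0 = 0.10 and implied limits of exp(∓ln(1/0.9)/σ_w0 · s_wR), capped at 80.00–125.00%. The function name `ntid_limits()` is made up for this illustration.

```r
# Where the scaled NTID limits hit the conventional 80.00-125.00% limits.
s0       <- 0.10                  # regulatory constant of the NTID approach
theta    <- log(1 / 0.9) / s0     # scaling factor, about 1.0536
swR.cap  <- log(1.25) / theta     # s_wR at which the cap is reached
CVwR.cap <- sqrt(exp(swR.cap^2) - 1)

ntid_limits <- function(CVwR) {
  swR <- sqrt(log(CVwR^2 + 1))
  lim <- exp(c(-1, 1) * theta * swR)
  # the scaled limits may not be wider than the conventional ones
  c(lower = max(lim[1], 0.80), upper = min(lim[2], 1.25))
}

round(c(swR.cap = swR.cap, CVwR.cap = CVwR.cap), 5)
round(rbind("CVwR 10%" = ntid_limits(0.10), "CVwR 30%" = ntid_limits(0.30)), 4)
```

Below the cap the limits shrink with decreasing CV_wR, which together with the comparison of variabilities explains the larger sample sizes in the n.RSABE column.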

Now for CV_wR 30%:

 CVwR   CVwT    CVw   CV.r    s.r n.RSABE n.ABE pw.ABE pw.ABE.n2 n.ABEL pw.ABEL pw.ABEL.n2
  0.3 0.1976 0.2534 0.6587 0.6667      16    14 0.8040    0.9183     12  0.8035     0.9227
  0.3 0.3000 0.3000 1.0000 1.0000      22    20 0.8202    0.8202     18  0.8276     0.8276
  0.3 0.4626 0.3877 1.5419 1.5000      40    32 0.8181    0.5941     28  0.8052     0.6040

Same behavior of sample sizes RSABE vs. ABE. Now – due to scaling – ABEL performs better than ABE. Note also the influence of the additional knowledge of CV_wT. If you assume that CV_wT = CV_wR (only information from a 2×2×2 crossover) you would plan with a sample size 18 for ABEL.
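The scaling that makes ABEL “perform better” can be sketched in a few lines of base R, assuming the EMA's rules: regulatory constant k = 0.760, expansion only if CV_wR > 30%, and an upper cap at CV_wR = 50%. The function name `abel_limits()` is hypothetical; PowerTOST's `scABEL()` serves the same purpose.

```r
# Expanded acceptance limits according to the EMA's ABEL (illustrative sketch).
abel_limits <- function(CVwR, k = 0.760, CVcap = 0.50) {
  if (CVwR <= 0.30) return(c(lower = 0.80, upper = 1.25))  # no scaling allowed
  swR <- sqrt(log(min(CVwR, CVcap)^2 + 1))                 # cap at CVwR 50%
  c(lower = exp(-k * swR), upper = exp(k * swR))
}

CVs <- c(0.2142, 0.30, 0.356, 0.50, 0.60)
lim <- t(sapply(CVs, abel_limits))
print(round(cbind(CVwR = CVs, lim), 4))
```

Since the expansion depends on CV_wR only, a CV_w from a 2×2×2 crossover tells us how wide the limits will be only if we assume CV_wT = CV_wR.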

We could also vary both CV_wR and CV_wT by looking at extreme s_wT/s_wR-ratios which give still the same pooled CV_w of 30%:

   CVwR   CVwT CVw   CV.r    s.r n.RSABE n.ABE pw.ABE pw.ABE.n2 n.ABEL pw.ABEL pw.ABEL.n2
 0.3560 0.2333 0.3 0.6553 0.6667      22    20 0.8202    0.8202     14  0.8172     0.8951
 0.3000 0.3000 0.3 1.0000 1.0000      22    20 0.8202    0.8202     18  0.8276     0.8276
 0.2334 0.3560 0.3 1.5253 1.5000      32    20 0.8202    0.8202     20  0.8228     0.7815

Same pattern. This is a bad case scenario when fishing in the dark (i.e., knowing only CV_w). The last column is related to the list at the beginning of the post.

In my experience only PPIs (the *prazole-family) in many cases show a higher variability of the reference due to bad gastroresistant coating. Then you will see subjects with low concentrations (which can be removed under certain conditions stated in the MR-GL, Section 6.2.3). We also have to demonstrate that the high CV_wR is not caused by outliers. See the infamous (likely fabricated) reference data set I of the Q&A-document. s_wT∕s_wR 0.7647 (still within 0.667–1.500). After removal of two outliers (subjects 45, 52) we get s_wT∕s_wR 1.0881. Interesting.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

zizou
★

Plzeň, Czech Republic,
2017-07-08 17:29
(2477 d 08:16 ago)

@ Helmut
Posting: # 17523
Views: 6,935

CVwR ~ CVwT ~ CVw?

Hi everybody and nobody!

❝ I agree with ElMaestro. If you design a replicate study based on the CV_w of a 2×2×2 crossover you have to assume that CV_wT = CV_wR.

Practically I agree with everyone of you.
Nevertheless I have one nitpicker's question.
We have to assume CV_wT = CV_wR = CV_w? (If we design a replicate study for EMA based on the CV_w of a 2×2×2 crossover.)
Or CV_w can't differ when CV_wT = CV_wR?
Theoretically I can imagine (no real) data of 2x2x4 replicate study - see following idea:
After Test treatment we will have almost the same values for each individual subject (e.g. subject No. 1 value 0.9501 after the first T and 0.9502 after the second T (before ln-transformation)) - for simplicity geometric mean of T values close to 0.95.
In the same way after Reference treatment we will have almost the same values for each individual subject (e.g. subject No. 1 value (1/0.9501) after the first R and (1/0.9502) after the second R (before ln-transformation) - If all the values of R are reciprocal values of T we should get the same intra-subject CV for T and R.) - for simplicity geometric mean of R values equal to (1/0.95).
So GMR T/R of all "pooled" data will be close to 0.95/(1/0.95)=0.95^2=0.9025 and CV_w used for CI calculation will be higher than CV_wT and CV_wR?
I know, in this kind of example the CV_wT and CV_wR are extremely low, but the point is that theoretically the CV_w observed in 2x2 study can be higher than 30% (so we can try replicate design with possible widening of BE acceptance criteria) but the CV_wT and CV_wR (which we don't know from 2x2) can be lower.
Fact or Fiction?

Best regards,
zizou

Helmut
★★★

Vienna, Austria,
2017-07-08 18:45
(2477 d 07:00 ago)

@ zizou
Posting: # 17524
Views: 7,015

CVwR = CVwT < CVw‽

Hi Zizou,

❝ Practically I agree with everyone of you.

❝ Nevertheless I have one nitpicker's question.

Being a nitpicker myself I love it.

❝ We have to assume CV_wT = CV_wR = CV_w? (If we design a replicate study for EMA based on the CV_w of a 2×2×2 crossover.)

❝ Or CV_w can't differ when CV_wT = CV_wR?

❝ Theoretically I can imagine (no real) data of 2x2x4 replicate study - see following idea: […]

See my simulated data. I hope I got it right.

subject sequence period treatment    data
      1     TRTR      1         T 0.90180
      1     TRTR      2         R 1.10889
      1     TRTR      3         T 0.94490
      1     TRTR      4         R 1.05831
      2     TRTR      1         T 0.96150
      2     TRTR      2         R 1.04004
      2     TRTR      3         T 0.95560
      2     TRTR      4         R 1.04646
      3     TRTR      1         T 0.99610
      3     TRTR      2         R 1.00392
      3     TRTR      3         T 0.97080
      3     TRTR      4         R 1.03008
      4     TRTR      1         T 0.96760
      4     TRTR      2         R 1.03348
      4     TRTR      3         T 0.99630
      4     TRTR      4         R 1.00371
      5     TRTR      1         T 0.96270
      5     TRTR      2         R 1.03875
      5     TRTR      3         T 0.91270
      5     TRTR      4         R 1.09565
      6     TRTR      1         T 0.91130
      6     TRTR      2         R 1.09733
      6     TRTR      3         T 0.94770
      6     TRTR      4         R 1.05519
      7     RTRT      1         R 1.03530
      7     RTRT      2         T 0.96590
      7     RTRT      3         R 1.05250
      7     RTRT      4         T 0.95012
      8     RTRT      1         R 1.07220
      8     RTRT      2         T 0.93266
      8     RTRT      3         R 1.02920
      8     RTRT      4         T 0.97163
      9     RTRT      1         R 1.04180
      9     RTRT      2         T 0.95988
      9     RTRT      3         R 1.07010
      9     RTRT      4         T 0.93449
     10     RTRT      1         R 1.00270
     10     RTRT      2         T 0.99731
     10     RTRT      3         R 1.09080
     10     RTRT      4         T 0.91676
     11     RTRT      1         R 1.00190
     11     RTRT      2         T 0.99810
     11     RTRT      3         R 1.08120
     11     RTRT      4         T 0.92490
     12     RTRT      1         R 1.01090
     12     RTRT      2         T 0.98922
     12     RTRT      3         R 1.06340
     12     RTRT      4         T 0.94038
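For anyone who wants to play with these numbers, the listing can be entered in R roughly as follows. This is only a sketch: the object name `Data` and the helper `cv_from_lm()` are assumptions, and the three models are my reading of the EMA's fixed-effects approach (all data for the pooled CV_w, one treatment at a time without the treatment term for CV_wT and CV_wR).

```r
# Simulated 2x2x4 data from the listing above (four values per subject, periods 1-4)
Data <- data.frame(
  subject  = rep(1:12, each = 4),
  sequence = rep(c("TRTR", "RTRT"), each = 24),
  period   = rep(1:4, times = 12),
  data     = c(0.90180, 1.10889, 0.94490, 1.05831,   # subject  1
               0.96150, 1.04004, 0.95560, 1.04646,   # subject  2
               0.99610, 1.00392, 0.97080, 1.03008,   # subject  3
               0.96760, 1.03348, 0.99630, 1.00371,   # subject  4
               0.96270, 1.03875, 0.91270, 1.09565,   # subject  5
               0.91130, 1.09733, 0.94770, 1.05519,   # subject  6
               1.03530, 0.96590, 1.05250, 0.95012,   # subject  7
               1.07220, 0.93266, 1.02920, 0.97163,   # subject  8
               1.04180, 0.95988, 1.07010, 0.93449,   # subject  9
               1.00270, 0.99731, 1.09080, 0.91676,   # subject 10
               1.00190, 0.99810, 1.08120, 0.92490,   # subject 11
               1.01090, 0.98922, 1.06340, 0.94038))  # subject 12
# the treatment is the letter of the sequence at the position of the period
Data$treatment <- substr(Data$sequence, Data$period, Data$period)

# CV from the residual variance of a fixed-effects model on the ln-scale
cv_from_lm <- function(formula, d) {
  d <- transform(d, subject = factor(subject), sequence = factor(sequence),
                 period = factor(period), treatment = factor(treatment))
  sqrt(exp(summary(lm(formula, data = d))$sigma^2) - 1)
}
f.one <- log(data) ~ sequence + subject %in% sequence + period
f.all <- log(data) ~ sequence + subject %in% sequence + period + treatment
CVwT  <- cv_from_lm(f.one, subset(Data, treatment == "T"))
CVwR  <- cv_from_lm(f.one, subset(Data, treatment == "R"))
CVw   <- cv_from_lm(f.all, Data)
round(100 * c(CVwT = CVwT, CVwR = CVwR, CVw = CVw), 2)
```

If the models match the ones used for this post, the output should agree with the CVs quoted below.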

❝ So GMR T/R of all "pooled" data will be close to 0.95/(1/0.95)=0.95^2=0.9025 …

GLSM 0.9542 (T), 1.0480 (R), PE 0.9104.

❝ … and CV_w used for CI calculation will be higher than CV_wT and CV_wR?

CVs (calculated by the EMA’s fixed effects “Method A”): CV_wT = CV_wR = 3.04%, CV_w = 3.42%. The pooled CV is larger than the ones of T and R. Simply amazing!

❝ I know, in this kind of example the CV_wT and CV_wR are extremely low, but the point is that theoretically the CV_w observed in 2x2 study can be higher than 30% (so we can try replicate design with possible widening of BE acceptance criteria) but the CV_wT and CV_wR (which we don't know from 2x2) can be lower.

❝ Fact or Fiction?

The former?
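One mechanism which could produce such a result in a 2×2×2 crossover is a subject-by-formulation interaction, i.e., subjects differing in their individual T–R response. Below is a small simulation sketch in base R; all variances, the sample size, and the seed are hypothetical. Under the simple additive model in the code, the residual variance is expected to be sw² + sD²/2 although both treatments share the same within-subject SD.

```r
# 2x2x2 crossover with equal within-subject variability of T and R,
# plus a subject-by-formulation interaction (all values hypothetical).
set.seed(1)
n   <- 200                      # subjects, n/2 per sequence
sw  <- 0.03                     # within-subject SD (ln-scale), same for T and R
sD  <- 0.05                     # SD of the individual T-R deviations
s.i <- rnorm(n, 0, 0.30)        # between-subject effects
d.i <- rnorm(n, 0, sD)          # subject-specific deviation of T-R from its mean

d <- data.frame(subject  = factor(rep(seq_len(n), each = 2)),
                sequence = factor(rep(c("TR", "RT"), each = n)),
                period   = factor(rep(1:2, times = n)))
d$treatment <- factor(ifelse((d$sequence == "TR") == (d$period == 1), "T", "R"))
i      <- as.integer(d$subject)
half   <- ifelse(d$treatment == "T", 0.5, -0.5)
d$logY <- s.i[i] + half * d.i[i] + rnorm(2 * n, 0, sw)

mod <- lm(logY ~ sequence + subject %in% sequence + period + treatment, data = d)
mse <- summary(mod)$sigma^2
cv  <- function(v) sqrt(exp(v) - 1)
round(100 * c(CVwT.CVwR = cv(sw^2),       # true CV of both treatments
              CVw.2x2   = cv(mse),        # what the 2x2x2 study shows
              expected  = cv(sw^2 + sD^2 / 2)), 2)
```

The interaction is invisible in the 2×2×2 design; it simply ends up in the residual and inflates the CV.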

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


mittyri
★★

Russia,
2017-07-09 01:47
(2476 d 23:58 ago)

@ Helmut
Posting: # 17526
Views: 6,919

'whatif' plotting from 4X2 to 2X2

Dear Helmut, Dear Zizou,

thank you for pointing out this interesting case!
I tried to write some code to see what we could expect with given variabilities from a full replicate dataset.
Maybe it is not quite fair since I'm sampling from the given data, not from the CVs directly.
Anyway, here are some clues to the 'what if' question:

library(dplyr)
library(ggplot2)
alpha <- 0.05
nsims <- 1000

Sample2from4 <- function(Data, datacol = "data", nsims = 1000, alpha = 0.05) {
  # columns should be: subject, sequence, period, treatment; data column could be changed in function args
  ow    <- options()
  options(contrasts = c("contr.treatment", "contr.poly"), digits = 12)
  Data <- as.data.frame(Data)
  Data$subject <- as.factor(Data$subject)
  Data$sequence <- as.factor(Data$sequence)
  Data$period <- as.numeric(Data$period)
  Data$treatment <- as.factor(Data$treatment)
  Data$data <- as.numeric(Data[, match(datacol, colnames(Data))])
  SamplingResults <-
    data.frame(
      PE = numeric(0),
      lowerCI = numeric(0),
      upperCI = numeric(0),
      CV = numeric(0))
  for (i in 1:nsims) {
    # sampling N          1   2   3   4
    # periods to include 12  34  14  23
    Data2periods <-
      Data %>%
      group_by(subject) %>%
      mutate(periodsample = sample(1:4, 1)) %>%
      filter((periodsample == 1 & (period == 1 | period == 2)) |
               (periodsample == 2 & (period == 3 | period == 4)) |
               (periodsample == 3 & (period == 1 | period == 4)) |
               (periodsample == 4 & (period == 2 | period == 3))
      ) %>%
      mutate(periodrank = rank(period))
    Data2periods$period <- as.factor(Data2periods$periodrank)
    mod <- lm(log(data) ~ sequence + treatment + period + subject %in% sequence,
         data = Data2periods)
    PE <- as.numeric(exp(coef(mod)["treatmentT"]))
    CI <- exp(confint(mod, "treatmentT", level = 1 - 2 * alpha))
    mse <- summary(mod)$sigma ^ 2
    CV <- sqrt(exp(mse) - 1)
    SamplingResults <-
      rbind(SamplingResults,
            data.frame(
              PE = PE,
              lowerCI = CI[1],
              upperCI = CI[2],
              CV = CV))
  }
  options(ow) # restore options
  return(SamplingResults)
}



plotSample2from4 <- function(SamplingResults, Data, datacol = "data"){
  ow    <- options()
  on.exit(options(ow)) # restore options when the function exits
  options(contrasts = c("contr.treatment", "contr.poly"), digits = 12)
  Data <- as.data.frame(Data)
  Data$subject <- as.factor(Data$subject)
  Data$sequence <- as.factor(Data$sequence)
  Data$period <- as.factor(Data$period)
  Data$treatment <- as.factor(Data$treatment)
  Data$data <- as.numeric(Data[, match(datacol, colnames(Data))])
  # within-subject CVs of R and T from the full replicate data
  Data.modR <- lm(log(data) ~ sequence + subject%in%sequence + period,
                  data=Data[Data$treatment=="R",])
  Data.CVwR  <- sqrt(exp(summary(Data.modR)$sigma ^ 2) - 1)
  Data.modT <- lm(log(data) ~ sequence + subject%in%sequence + period,
                  data=Data[Data$treatment=="T",])
  Data.CVwT  <- sqrt(exp(summary(Data.modT)$sigma ^ 2) - 1)
  Data.CV <- cbind.data.frame(CV = c(Data.CVwR, Data.CVwT), CVw = c("CVwR", "CVwT") )
  # fractions of sampled 2X2 CVs *below* CVwR / CVwT (the title reports 100% minus these)
  MoreThanCVwR <- length(SamplingResults$CV[SamplingResults$CV<Data.CVwR]) / length(SamplingResults$CV)
  MoreThanCVwT <- length(SamplingResults$CV[SamplingResults$CV<Data.CVwT]) / length(SamplingResults$CV)
  Data.CV <- cbind.data.frame(Data.CV, MoreThanCVw = c(MoreThanCVwR, MoreThanCVwT))
  # density of the sampled CVs; the colour column is re-used as label for the ribbon fill
  ribbon <- ggplot_build(ggplot() + geom_density(data = SamplingResults, aes(x = CV), colour = "<CVw>"))$data[[1]]
  ribbon$colour[ribbon$x<Data.CVwR & ribbon$x < Data.CVwT ] <- "<CVwR & <CVwT"
  ribbon$colour[ribbon$x>Data.CVwR & ribbon$x > Data.CVwT ] <- ">CVwR & >CVwT"
  title <- paste0("CVwR = ", sprintf("%.2f%%", Data.CVwR*100),
                  ", CVwT = ", sprintf("%.2f%%", Data.CVwT*100),
                  "; \n2X2 Study CV > CVwR in ", sprintf("%.2f%%", 100-MoreThanCVwR*100),
                  "; 2X2 Study CV > CVwT in ", sprintf("%.2f%%", 100-MoreThanCVwT*100))
  densityplot <-
    ggplot() +
    geom_ribbon(data = ribbon, aes(x = x, ymin = 0, ymax = y, fill = colour), alpha = .4) +
    geom_vline(data = Data.CV,  aes(xintercept = CV, color = CVw, linetype = CVw)) +
    theme_bw() +
    ggtitle(title)+
    labs(x="CV", y = "density")
  print(densityplot)
  return(Data.CV)
}

Now using the dataset from Helmut's post above:

SamplingResults <- Sample2from4(Data = Data, datacol = "data", nsims = 1000)

plotSample2from4(Data = Data, SamplingResults = SamplingResults, datacol = "data")

Amazing! CVwR and CVwT are far away from mean/median!

What about other examples? EMA full replicate:

SamplingResults <- Sample2from4(Data = ds01, datacol = "PK", nsims = 1000)

plotSample2from4(Data = ds01, SamplingResults = SamplingResults, datacol = "PK")

Looks reasonable. Somewhere between CVwR and CVwT. By the way I would not go with pooled CV for 2X2 knowing this...

—
Kind regards,
Mittyri

ElMaestro
★★★

Denmark,
2017-07-09 02:16
(2476 d 23:30 ago)

@ zizou
Posting: # 17527
Views: 7,046

CVwR ~ CVwT ~ CVw?

Hello zizou,

You lost me.
I am sure your point may be good but I don't understand it. Could you reformulate these parts:

❝ Theoretically I can imagine (no real) data of 2x2x4 replicate study - see following idea:

❝ After Test treatment we will have almost the same values for each individual subject (e.g. subject No. 1 value 0.9501 after the first T and 0.9502 after the second T (before ln-transformation)) - for simplicity geometric mean of T values close to 0.95.

❝ In the same way after Reference treatment we will have almost the same values for each individual subject (e.g. subject No. 1 value (1/0.9501) after the first R and (1/0.9502) after the second R (before ln-transformation) - If all the values of R are reciprocal values of T we should get the same intra-subject CV for T and R.) - for simplicity geometric mean of R values equal to (1/0.95).

I am lost here.

❝ So GMR T/R of all "pooled" data will be close to 0.95/(1/0.95)=0.95^2=0.9025 and CV_w used for CI calculation will be higher than CV_wT and CV_wR?

Does that follow automatically? How? Why? CVw used for CI calculation is a dangerous term when we talk EMA?
I tried to look at Helmut's simulated data and could still not figure it out. :confused:

Imagine an old production process for R and a new process for T. This is real life.
Unit-to-unit variability for T is low, for R it is high. When you do a replicated study you see the effect directly on sWR, sWT. It is completely obscured for a 222BE trial. Hence it is hardly surprising that you'll be seeing little difference here and there between the RMSE from a 222BE trial and then the ones derived from the EMA method where replication is involved.

—
Pass or fail!
ElMaestro

zizou
★

Plzeň, Czech Republic,
2017-07-09 04:13
(2476 d 21:33 ago)

(edited by zizou on 2017-07-09 12:50)
@ ElMaestro
Posting: # 17528
Views: 13,098

CVwR ~ CVwT ~ CVw?

Dear Captain,
I had only an idea with no connection to reality.

❝ ❝ Theoretically I can imagine (no real) data of 2x2x4 replicate study - see following idea:

The idea was to prepare data of T with an intra-subject CV close to zero, for example with a mean equal to 0.95 and almost zero standard deviation.
Data of R would have the same properties but be shifted to a different mean, for example 1.05. So I expected an intra-subject CV close to zero for the T data and also for the R data. But when we pool the T and R data together, there will be larger differences in the data, so I expected a higher intra-subject CV for all (pooled) data than for the T or R data separately.
So the point was purely theoretical, with the description of an example which provides the proof of a higher CV_w.
Since Helmut simulated better data (thanks!), the description is not needed now.

❝ ❝ So GMR T/R of all "pooled" data will be close to 0.95/(1/0.95)=0.95^2=0.9025 and CV_w used for CI calculation will be higher than CV_wT and CV_wR?

❝ Does that follow automatically? How? Why? CVw used for CI calculation is a dangerous term when we talk EMA?

❝ I tried to look at Helmut's simulated data and could still not figure it out. :confused:

The statements are valid only for the crazy example which I tried to describe swiftly by guesswork. So that doesn't follow automatically. Or maybe when CV_wT = CV_wR and the means of T and R differ, the CV_w will probably be slightly higher? So the equality CV_wT = CV_wR = CV_w holds with infinite precision only in a really special case of data. (added by editing)
Helmut's simulated data are much more normal than the data in my mind. So I suggest looking only at Helmut's data as an example of data where CV_w is higher than CV_wT and CV_wR.
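A short derivation may help to sort out which part of this conjecture can hold. It is a sketch for the 2×2×2 crossover only, under the usual additive model on the ln-scale; the symbols $s_{iT}$, $s_{iR}$ (subject-specific effects) and $\sigma_D^2$ (subject-by-formulation interaction) are introduced here for illustration and ignore period effects.

```latex
\begin{aligned}
Y_{iT} &= \mu_T + s_{iT} + e_{iT}, & \operatorname{Var}(e_{iT}) &= \sigma_{wT}^2,\\
Y_{iR} &= \mu_R + s_{iR} + e_{iR}, & \operatorname{Var}(e_{iR}) &= \sigma_{wR}^2,\\
\sigma_D^2 &= \operatorname{Var}(s_{iT} - s_{iR}),\\
\operatorname{Var}(Y_{iT} - Y_{iR}) &= \sigma_D^2 + \sigma_{wT}^2 + \sigma_{wR}^2,\\
\operatorname{E}[\mathit{MSE}_{2\times2\times2}] &= \tfrac{1}{2}\operatorname{Var}(Y_{iT} - Y_{iR})
  = \frac{\sigma_{wT}^2 + \sigma_{wR}^2}{2} + \frac{\sigma_D^2}{2}.
\end{aligned}
```

The difference of means $\mu_T - \mu_R$ does not appear, since it is absorbed by the treatment effect. With $\sigma_{wT} = \sigma_{wR} = \sigma_w$ the expected MSE is $\sigma_w^2 + \sigma_D^2/2 \geq \sigma_w^2$, so under these assumptions the CV from a 2×2×2 crossover exceeds the common within-subject CV only if subjects differ in their individual T–R response.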

Best regards,
zizou

Note (to be deleted): typo error in Helmut's simulated data summary

❝ GLSM 0.9542 (T), 1.0480 (R), PE 0.9104.

Edit: THX, done. [Helmut]