Mikkabel
☆    

Belgium,
2017-07-07 12:13
(2478 d 13:33 ago)

Posting: # 17515
Views: 7,952
 

 Estimation within-subject CV [Power / Sample Size]

Dear all,

I need your expertise regarding the estimation of the CVintra. I am aware that the within-subject CV should be calculated based on the sw² obtained from a fully (or partially) replicated design study, but my question is:

- is it relevant to estimate the within-subject CV based on the MSE obtained from a classical crossover (CO) study (analysed according to the EMA requirements, e.g. period, formulation and subject within sequence as fixed effects)? To be clear, this is just an estimate to get an idea of whether it is relevant to perform a replicate design study or not.

I heard at a symposium that the CV obtained from the MSE should be a good estimator of the real within-subject CV. Could you confirm (or not ;-))?

Thank you,
Kind regards,


Edit: Category changed; see also this post #1. [Helmut]
ElMaestro
★★★

Denmark,
2017-07-07 12:17
(2478 d 13:29 ago)

@ Mikkabel
Posting: # 17516
Views: 7,078
 

 Estimation within-subject CV

Hi Mikkabel,

❝ I heard in a symposium that the CV obtained from the MSE should be a good estimator of the real within subject CV, could you confirm ?(or not ;-))


In my experience the within subject CV from a 222BE trial often gives you a pretty decent idea about within-subject CV for Ref.
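
For reference, the step from the residual mean square (MSE) of the ln-transformed analysis to a CV can be sketched in R as follows. This is only an illustration on simulated 2×2×2 data: the helper names cv_from_mse() and mse_from_cv(), the seed, the assumed true CV of 25% and the between-subject SD are all made up for the example and do not come from any post in this thread.

```r
# Convert the residual variance (MSE) on the log scale to a CV and back.
cv_from_mse <- function(mse) sqrt(exp(mse) - 1)
mse_from_cv <- function(cv)  log(cv^2 + 1)

# Toy 2x2x2 crossover (hypothetical data): 12 subjects, sequences TR and RT.
set.seed(42)                                   # arbitrary seed
n   <- 12
toy <- data.frame(
  subject  = factor(rep(seq_len(n), each = 2)),
  sequence = factor(rep(c("TR", "RT"), each = n)),
  period   = factor(rep(1:2, times = n))
)
# T is given in period 1 of sequence TR and in period 2 of sequence RT.
toy$treatment <- factor(ifelse(xor(toy$sequence == "TR", toy$period == 2), "T", "R"))

s_w      <- sqrt(mse_from_cv(0.25))            # assumed true within-subject CV of 25%
subj_eff <- rnorm(n, mean = 0, sd = 0.4)       # assumed between-subject variability
toy$logy <- subj_eff[as.integer(toy$subject)] + rnorm(2 * n, mean = 0, sd = s_w)

# All effects fixed: sequence, subject within sequence, period, treatment.
fit    <- lm(logy ~ sequence + subject %in% sequence + period + treatment, data = toy)
mse    <- summary(fit)$sigma^2                 # residual mean square
cv_hat <- cv_from_mse(mse)
cat(sprintf("estimated CVw: %.1f%% (residual df = %d)\n", 100 * cv_hat, fit$df.residual))
```

With only 10 residual degrees of freedom in such a small study the estimate is quite uncertain, so it gives an idea of the CV rather than a precise value.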

Pass or fail!
ElMaestro
Helmut
★★★
Vienna, Austria,
2017-07-07 18:09
(2478 d 07:36 ago)

@ Mikkabel
Posting: # 17519
Views: 7,207
 

 CVwR ~ CVwT ~ CVw?

Hi Mikkabel,

❝ is it relevant to estimate the within subject CV based on the MSE obtained following a classical CO study […]? To be clear, this is just an estimate to get an idea of whether it is relevant to perform a replicate design study or not.


❝ I heard in a symposium that the CV obtained from the MSE should be a good estimator of the real within subject CV, could you confirm ? (or not ;-))


I agree with ElMaestro. If you design a replicate study based on the CVw of a 2×2×2 crossover you have to assume that CVwT = CVwR. In many cases the variabilities of T and R are “similar”. Essentially there are three possibilities:
  1. CVwT < CVwR: Since with the EMA’s methods the 90% CI is derived from the pooled CVw (the MSE from the same model as a 2×2×2) you will gain power.
  2. CVwT = CVwR: Fine, as assumed. Desired power.
  3. CVwT > CVwR: Here you will lose power, primarily because in the sample size estimation (based on CVw which is biased upwards by CVwT) you expected wider BE-limits.
It is difficult to judge whether variabilities are “different”. In the long gone ages of PBE/IBE they were considered “similar” if swT/swR was within 2/3–3/2. PBE/IBE was never implemented because high sample sizes are required for a meaningful (i.e., based on a CI) comparison of variabilities. Let’s have a look into the FDA’s reference-scaling for NTIDs which contains a comparison of variabilities (in addition to BE). At the upper cap for scaling at a CVwR ~21.42% the sample size is essentially driven by the F-test (if CVwT > CVwR). Below is a comparison with conventional ABE and ABEL for swT/swR-ratios of 2/3, 1, and 3/2. In case you speak R:

library(PowerTOST)
x <- as.data.frame(matrix(NA, ncol=12, nrow=3))
names(x)  <- c("CVwR", "CVwT", "CVw", "CV.r", "s.r", "n.RSABE", "n.ABE",
               "pw.ABE", "pw.ABE.n2", "n.ABEL", "pw.ABEL", "pw.ABEL.n2")
x["s.r"]  <- c(2/3, 1, 3/2)
x["CVwR"] <- rep(se2CV(0.21179), 3)
x["CVwT"] <- se2CV(x["s.r"]*CV2se(x["CVwR"]))
x["CV.r"] <- x["CVwT"]/x["CVwR"]
x["CVw"]  <- as.numeric(mse2CV((CV2mse(x["CVwT"])+CV2mse(x["CVwR"]))/2))
for (j in 1:3) { # loops for functions which don't vectorize
  x[j, "n.RSABE"] <- sampleN.NTIDFDA(CV=c(x[j, "CVwT"], x[j, "CVwR"]),
                                     theta0=0.95, design="2x2x4",
                                     print=FALSE, details=FALSE)[["Sample size"]]
  y <- sampleN.TOST(CV=x[j, "CVw"], theta0=0.95, design="2x2x4",
                    print=FALSE, details=FALSE)
  x[j, "n.ABE"]  <- y[["Sample size"]]
  x[j, "pw.ABE"] <- y[["Achieved power"]]
  y <- sampleN.scABEL(CV=c(x[j, "CVwT"], x[j, "CVwR"]), theta0=0.95,
                      design="2x2x4", print=FALSE, details=FALSE)
  x[j, "n.ABEL"]  <- y[["Sample size"]]
  x[j, "pw.ABEL"] <- y[["Achieved power"]]
}
for (j in 1:3) { # use the sample size for CVwT = CVwR (2nd row)
  x[j, "pw.ABE.n2"]  <- power.TOST(CV=x[j, "CVw"], theta0=0.95,
                                   design="2x2x4", n=x[2, "n.ABE"])
  x[j, "pw.ABEL.n2"] <- power.scABEL(CV=c(x[j, "CVwT"], x[j, "CVwR"]),
                                     theta0=0.95, design="2x2x4",
                                     n=x[2, "n.ABEL"])
}
print(round(x, 4), row.names=FALSE)

We get:
   CVwR   CVwT    CVw   CV.r    s.r n.RSABE n.ABE pw.ABE pw.ABE.n2 n.ABEL pw.ABEL pw.ABEL.n2
 0.2142 0.1419 0.1815 0.6625 0.6667      14     8 0.8262    0.9440      8  0.8355     0.9460
 0.2142 0.2142 0.2142 1.0000 1.0000      18    12 0.8626    0.8626     12  0.8663     0.8663
 0.2142 0.3259 0.2750 1.5214 1.5000      32    18 0.8413    0.6602     18  0.8424     0.6649

Higher sample sizes due to the comparison of variabilities. Equal sample sizes for ABE and the EMA’s ABEL since no scaling is allowed (CVwR <30%).
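
To make the "no scaling below 30%" remark concrete, here is a small sketch of the EMA's expanded limits as a function of CVwR. It assumes the usual ABEL rules (regulatory constant 0.760, conventional limits up to a CVwR of 30%, expansion capped at a CVwR of 50%); the function name abel_limits() and its argument names are made up for this example and are not part of PowerTOST.

```r
# Expanded acceptance limits of the EMA's ABEL for a given CVwR (sketch).
abel_limits <- function(CVwR, k = 0.760, cv_switch = 0.30, cv_cap = 0.50) {
  # no scaling allowed up to the switching CV: conventional 80.00-125.00%
  if (CVwR <= cv_switch) return(c(lower = 0.80, upper = 1.25))
  # within-subject SD of R on the log scale, capped at the upper CV
  swR <- sqrt(log(min(CVwR, cv_cap)^2 + 1))
  c(lower = exp(-k * swR), upper = exp(k * swR))
}

for (cv in c(0.2142, 0.30, 0.40, 0.50, 0.60)) {
  lim <- abel_limits(cv)
  cat(sprintf("CVwR %5.2f%%: %6.2f%% to %6.2f%%\n",
              100 * cv, 100 * lim[["lower"]], 100 * lim[["upper"]]))
}
```

For the CVwR of ~21.42% used above the limits stay at 80.00–125.00%, which is why ABE and ABEL give the same sample sizes in that table.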

Now for CVwR 30%:
 CVwR   CVwT    CVw   CV.r    s.r n.RSABE n.ABE pw.ABE pw.ABE.n2 n.ABEL pw.ABEL pw.ABEL.n2
  0.3 0.1976 0.2534 0.6587 0.6667      16    14 0.8040    0.9183     12  0.8035     0.9227
  0.3 0.3000 0.3000 1.0000 1.0000      22    20 0.8202    0.8202     18  0.8276     0.8276
  0.3 0.4626 0.3877 1.5419 1.5000      40    32 0.8181    0.5941     28  0.8052     0.6040

Same behavior of sample sizes RSABE vs. ABE. Now – due to scaling – ABEL performs better than ABE. Note also the influence of the additional knowledge of CVwT. If you assume that CVwT = CVwR (only information from a 2×2×2 crossover) you would plan with a sample size 18 for ABEL.

We could also vary both CVwR and CVwT by looking at extreme swT/swR-ratios which give still the same pooled CVw of 30%:
   CVwR   CVwT CVw   CV.r    s.r n.RSABE n.ABE pw.ABE pw.ABE.n2 n.ABEL pw.ABEL pw.ABEL.n2
 0.3560 0.2333 0.3 0.6553 0.6667      22    20 0.8202    0.8202     14  0.8172     0.8951
 0.3000 0.3000 0.3 1.0000 1.0000      22    20 0.8202    0.8202     18  0.8276     0.8276
 0.2334 0.3560 0.3 1.5253 1.5000      32    20 0.8202    0.8202     20  0.8228     0.7815

Same pattern. This is a bad case scenario when fishing in the dark (i.e., knowing only CVw). The last column is related to the list at the beginning of the post.
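
The CVwR/CVwT pairs in the last table can be reproduced in base R. The sketch below assumes that the pooled variance is the simple average of the two within-subject variances on the log scale, as in the R code above; the helper names cv2var(), var2cv() and split_pooled_cv() are made up for this example.

```r
cv2var <- function(cv) log(cv^2 + 1)    # CV -> variance on the log scale
var2cv <- function(v)  sqrt(exp(v) - 1) # variance on the log scale -> CV

# Split a pooled CVw into CVwR and CVwT for a given ratio = swT/swR,
# assuming s2w = (s2wT + s2wR) / 2 and swT = ratio * swR.
split_pooled_cv <- function(CVw, ratio) {
  s2w  <- cv2var(CVw)
  s2wR <- 2 * s2w / (1 + ratio^2)
  s2wT <- ratio^2 * s2wR
  c(CVwR = var2cv(s2wR), CVwT = var2cv(s2wT))
}

ratios <- c(2/3, 1, 3/2)
res    <- sapply(ratios, function(r) split_pooled_cv(0.30, r))
print(round(cbind(s.r = ratios, t(res)), 4))

# Pooling the components again should return the CVw of 30%.
pooled <- var2cv(colMeans(cv2var(res)))
print(round(pooled, 4))
```

This only shows how far CVwT and CVwR can be apart while a 2×2×2 study would still report the same pooled CVw.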

In my experience only PPIs (the *prazole-family) in many cases show a higher variability of the reference due to bad gastroresistant coating. Then you will see subjects with low concentrations (which can be removed under certain conditions stated in the MR-GL, Section 6.2.3). We also have to demonstrate that the high CVwR is not caused by outliers. See the infamous (likely fabricated) reference data set I of the Q&A-document. swT∕swR 0.7647 (still within 0.667–1.500). After removal of two outliers (subjects 45, 52) we get swT∕swR 1.0881. Interesting.

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
zizou
★    

Plzeň, Czech Republic,
2017-07-08 17:29
(2477 d 08:16 ago)

@ Helmut
Posting: # 17523
Views: 6,935
 

 CVwR ~ CVwT ~ CVw?

Hi everybody and nobody!

❝ I agree with ElMaestro. If you design a replicate study based on the CVw of a 2×2×2 crossover you have to assume that CVwT = CVwR.


Practically I agree with every one of you.
Nevertheless, I have one nitpicker's question.
Do we have to assume CVwT = CVwR = CVw (if we design a replicate study for the EMA based on the CVw of a 2×2×2 crossover)?
Or can CVw not differ when CVwT = CVwR?
Theoretically I can imagine (not real) data of a 2x2x4 replicate study. The idea is as follows:
After the Test treatment we have almost the same values for each individual subject (e.g. subject No. 1 has 0.9501 after the first T and 0.9502 after the second T, before ln-transformation); for simplicity, the geometric mean of the T values is close to 0.95.
In the same way, after the Reference treatment we have almost the same values for each individual subject (e.g. subject No. 1 has 1/0.9501 after the first R and 1/0.9502 after the second R, before ln-transformation). If all R values are reciprocals of the T values, we should get the same intra-subject CV for T and R; for simplicity, the geometric mean of the R values equals 1/0.95.
So the GMR T/R of all "pooled" data will be close to 0.95/(1/0.95) = 0.95^2 = 0.9025, and the CVw used for the CI calculation will be higher than CVwT and CVwR?
I know that in this kind of example CVwT and CVwR are extremely low, but the point is that theoretically the CVw observed in a 2x2 study can be higher than 30% (so we could try a replicate design with possible widening of the BE acceptance criteria) while CVwT and CVwR (which we don't know from the 2x2 study) can be lower.
Fact or Fiction?

Best regards,
zizou
Helmut
★★★
Vienna, Austria,
2017-07-08 18:45
(2477 d 07:00 ago)

@ zizou
Posting: # 17524
Views: 7,015
 

 CVwR = CVwT < CVw‽

Hi Zizou,

❝ Practically I agree with everyone of you.

❝ Nevertheless I have one nitpicker's question.


Being a nitpicker myself I love it.

❝ We have to assume CVwT = CVwR = CVw? (If we design a replicate study for EMA based on the CVw of a 2×2×2 crossover.)

❝ Or CVw can't differ when CVwT = CVwR?

❝ Theoretically I can imagine (no real) data of 2x2x4 replicate study - see following idea: […]



See my simulated data. I hope I got it right.

subject sequence period treatment  data
   1      TRTR     1        T     0.90180
   1      TRTR     2        R     1.10889
   1      TRTR     3        T     0.94490
   1      TRTR     4        R     1.05831
   2      TRTR     1        T     0.96150
   2      TRTR     2        R     1.04004
   2      TRTR     3        T     0.95560
   2      TRTR     4        R     1.04646
   3      TRTR     1        T     0.99610
   3      TRTR     2        R     1.00392
   3      TRTR     3        T     0.97080
   3      TRTR     4        R     1.03008
   4      TRTR     1        T     0.96760
   4      TRTR     2        R     1.03348
   4      TRTR     3        T     0.99630
   4      TRTR     4        R     1.00371
   5      TRTR     1        T     0.96270
   5      TRTR     2        R     1.03875
   5      TRTR     3        T     0.91270
   5      TRTR     4        R     1.09565
   6      TRTR     1        T     0.91130
   6      TRTR     2        R     1.09733
   6      TRTR     3        T     0.94770
   6      TRTR     4        R     1.05519
   7      RTRT     1        R     1.03530
   7      RTRT     2        T     0.96590
   7      RTRT     3        R     1.05250
   7      RTRT     4        T     0.95012
   8      RTRT     1        R     1.07220
   8      RTRT     2        T     0.93266
   8      RTRT     3        R     1.02920
   8      RTRT     4        T     0.97163
   9      RTRT     1        R     1.04180
   9      RTRT     2        T     0.95988
   9      RTRT     3        R     1.07010
   9      RTRT     4        T     0.93449
  10      RTRT     1        R     1.00270
  10      RTRT     2        T     0.99731
  10      RTRT     3        R     1.09080
  10      RTRT     4        T     0.91676
  11      RTRT     1        R     1.00190
  11      RTRT     2        T     0.99810
  11      RTRT     3        R     1.08120
  11      RTRT     4        T     0.92490
  12      RTRT     1        R     1.01090
  12      RTRT     2        T     0.98922
  12      RTRT     3        R     1.06340
  12      RTRT     4        T     0.94038


❝ So GMR T/R of all "pooled" data will be close to 0.95/(1/0.95)=0.95^2=0.9025 …


GLSM 0.9542 (T), 1.0480 (R), PE 0.9104.

❝ … and CVw used for CI calculation will be higher than CVwT and CVwR?


CVs (calculated by the EMA’s fixed effects “Method A”): CVwT = CVwR = 3.04%, CVw = 3.42%. Pooled CV larger than the ones of T and R. Simply amazing!
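
The same pattern can be reproduced with freshly simulated data built along the lines of Zizou's idea (R values are reciprocals of the T values). The sketch below is an illustration under stated assumptions, not the code used for the data above: the seed, the number of subjects, the SDs and the helper cv_w() are all made up. One possible reading, which is an interpretation and not something stated in the posts, is that the pooled model has no subject-by-treatment term, so between-subject differences in the T/R-ratio end up in the residual.

```r
set.seed(1)                                    # arbitrary seed
n_seq <- 6                                     # subjects per sequence (assumed)
n     <- 2 * n_seq
# log(T) per subject (rows) and administration (columns);
# assumed between-subject SD 0.03 and within-subject SD 0.01
logT <- log(0.95) + rnorm(n, sd = 0.03) + matrix(rnorm(2 * n, sd = 0.01), nrow = n)

d <- expand.grid(period = 1:4, subject = seq_len(n))
d$sequence  <- ifelse(d$subject <= n_seq, "TRTR", "RTRT")
d$treatment <- ifelse(xor(d$sequence == "TRTR", d$period %% 2 == 0), "T", "R")
occasion    <- ifelse(d$period <= 2, 1, 2)     # 1st or 2nd administration
# R is the reciprocal of T, i.e. log(R) = -log(T)
d$logy <- ifelse(d$treatment == "T", 1, -1) * logT[cbind(d$subject, occasion)]
cols    <- c("subject", "sequence", "period", "treatment")
d[cols] <- lapply(d[cols], factor)

# CV from the residual variance of a fixed-effects model
cv_w <- function(formula, data) {
  sqrt(exp(summary(lm(formula, data = data))$sigma^2) - 1)
}
CVwT <- cv_w(logy ~ sequence + subject %in% sequence + period,
             d[d$treatment == "T", ])
CVwR <- cv_w(logy ~ sequence + subject %in% sequence + period,
             d[d$treatment == "R", ])
CVw  <- cv_w(logy ~ sequence + subject %in% sequence + period + treatment, d)
print(round(100 * c(CVwT = CVwT, CVwR = CVwR, CVw = CVw), 2))
```

By construction CVwT and CVwR agree, while the pooled CVw comes out larger because the subjects differ in their T/R-ratio.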

❝ I know, in this kind of example the CVwT and CVwR are extremely low, but the point is that theoretically the CVw observed in 2x2 study can be higher than 30% (so we can try replicate design with possible widening of BE acceptance criteria) but the CVwT and CVwR (which we don't know from 2x2) can be lower.

❝ Fact or Fiction?


The former?

mittyri
★★  

Russia,
2017-07-09 01:47
(2476 d 23:58 ago)

@ Helmut
Posting: # 17526
Views: 6,919
 

 'whatif' plotting from 4X2 to 2X2

Dear Helmut, Dear Zizou,

thank you for pointing out this interesting case!
I tried to write some code to see what we could expect with given variabilities from a full replicate dataset.
Maybe it is not entirely fair since I'm sampling from the given data, not from the CV directly.
Anyway, here are some clues to the 'what if' question:

library(dplyr)
library(ggplot2)
alpha <- 0.05
nsims <- 1000

Sample2from4 <- function(Data, datacol = "data", nsims = 1000, alpha = 0.05) {
  # columns should be: subject, sequence, period, treatment; data column could be changed in function args
  ow    <- options()
  options(contrasts = c("contr.treatment", "contr.poly"), digits = 12)
  Data <- as.data.frame(Data)
  Data$subject <- as.factor(Data$subject)
  Data$sequence <- as.factor(Data$sequence)
  Data$period <- as.numeric(Data$period)
  Data$treatment <- as.factor(Data$treatment)
  Data$data <- as.numeric(Data[, match(datacol, colnames(Data))])
  SamplingResults <-
    data.frame(
      PE = numeric(0),
      lowerCI = numeric(0),
      upperCI = numeric(0),
      CV = numeric(0))
  for (i in 1:nsims) {
    # sampling N          1   2  3   4
    # periods to include 12 34  14  23
    Data2periods <-
      Data %>%
      group_by(subject) %>%
      mutate(periodsample = sample(1:4, 1)) %>%
      filter((periodsample == 1 & (period == 1 | period == 2)) |
               (periodsample == 2 & (period == 3 | period == 4)) |
               (periodsample == 3 & (period == 1 | period == 4)) |
               (periodsample == 4 & (period == 2 | period == 3))
      ) %>%
      mutate(periodrank = rank(period))
    Data2periods$period <- as.factor(Data2periods$periodrank)
    mod <- lm(log(data) ~ sequence + treatment + period + subject %in% sequence,
         data = Data2periods)
    PE <- as.numeric(exp(coef(mod)["treatmentT"]))
    CI <- exp(confint(mod, "treatmentT", level = 1 - 2 * alpha))
    mse <- summary(mod)$sigma ^ 2
    CV <- sqrt(exp(mse) - 1)
    SamplingResults <-
      rbind(SamplingResults,
            data.frame(
              PE = PE,
              lowerCI = CI[1],
              upperCI = CI[2],
              CV = CV))
  }
  options(ow) # restore options
  return(SamplingResults)
}

plotSample2from4 <- function(SamplingResults, Data, datacol = "data"){
  ow    <- options()
  options(contrasts = c("contr.treatment", "contr.poly"), digits = 12)
  Data <- as.data.frame(Data)
  Data$subject <- as.factor(Data$subject)
  Data$sequence <- as.factor(Data$sequence)
  Data$period <- as.factor(Data$period)
  Data$treatment <- as.factor(Data$treatment)
  Data$data <- as.numeric(Data[, match(datacol, colnames(Data))])
  Data.modR <- lm(log(data) ~ sequence + subject%in%sequence + period,
                  data=Data[Data$treatment=="R",])
  Data.CVwR  <- sqrt(exp(summary(Data.modR)$sigma ^ 2) - 1)
  Data.modT <- lm(log(data) ~ sequence + subject%in%sequence + period,
                  data=Data[Data$treatment=="T",])
  Data.CVwT  <- sqrt(exp(summary(Data.modT)$sigma ^ 2) - 1)
  Data.CV <- cbind.data.frame(CV = c(Data.CVwR, Data.CVwT), CVw = c("CVwR", "CVwT") )
  MoreThanCVwR <- length(SamplingResults$CV[SamplingResults$CV<Data.CVwR]) / length(SamplingResults$CV)
  MoreThanCVwT <- length(SamplingResults$CV[SamplingResults$CV<Data.CVwT]) / length(SamplingResults$CV)
  Data.CV <- cbind.data.frame(Data.CV, MoreThanCVw = c(MoreThanCVwR, MoreThanCVwT))
  ribbon <- ggplot_build(ggplot() + geom_density(data = SamplingResults, aes(x = CV), colour = "<CVw>"))$data[[1]]
  ribbon$colour[ribbon$x<Data.CVwR & ribbon$x < Data.CVwT ] <- "<CVwR & <CVwT"
  ribbon$colour[ribbon$x>Data.CVwR & ribbon$x > Data.CVwT ] <- ">CVwR & >CVwT"
  title <- paste0("CVwR = ", sprintf("%.2f%%", Data.CVwR*100),
                  ", CVwT = ", sprintf("%.2f%%", Data.CVwT*100),
                  "; \n2X2 Study CV > CVwR in ", sprintf("%.2f%%", 100-MoreThanCVwR*100),
                  "; 2X2 Study CV > CVwT in ", sprintf("%.2f%%", 100-MoreThanCVwT*100))
  densityplot <-
    ggplot() +
    geom_ribbon(data = ribbon, aes(x = x, ymin = 0, ymax = y, fill = colour), alpha = .4) +
    geom_vline(data = Data.CV,  aes(xintercept = CV, color = CVw, linetype = CVw)) +
    theme_bw() +
    ggtitle(title)+
    labs(x="CV", y = "density")
  print(densityplot)
  options(ow) # restore options before returning
  return(Data.CV)
}


Now using the dataset from Helmut's post above:
SamplingResults <- Sample2from4(Data = Data, datacol = "data", nsims = 1000)
plotSample2from4(Data = Data, SamplingResults = SamplingResults, datacol = "data")

[image]
Amazing! CVwR and CVwT are far away from mean/median!

What about other examples? EMA full replicate:
SamplingResults <- Sample2from4(Data = ds01, datacol = "PK", nsims = 1000)
plotSample2from4(Data = ds01, SamplingResults = SamplingResults, datacol = "PK")

[image]

Looks reasonable. Somewhere between CVwR and CVwT. By the way I would not go with pooled CV for 2X2 knowing this...

Kind regards,
Mittyri
ElMaestro
★★★

Denmark,
2017-07-09 02:16
(2476 d 23:30 ago)

@ zizou
Posting: # 17527
Views: 7,046
 

 CVwR ~ CVwT ~ CVw?

Hello zizou,

You lost me.
I am sure your point may be good but I don't understand it. Could you reformulate these parts:

❝ Theoretically I can imagine (no real) data of 2x2x4 replicate study - see following idea:

❝ After Test treatment we will have almost the same values for each individual subject (e.g. subject No. 1 value 0.9501 after the first T and 0.9502 after the second T (before ln-transformation)) - for simplicity geometric mean of T values close to 0.95.

❝ In the same way after Reference treatment we will have almost the same values for each individual subject (e.g. subject No. 1 value (1/0.9501) after the first R and (1/0.9502) after the second R (before ln-transformation) - If all the values of R are reciprocal values of T we should get the same intra-subject CV for T and R.) - for simplicity geometric mean of R values equal to (1/0.95).


I am lost here.

❝ So GMR T/R of all "pooled" data will be close to 0.95/(1/0.95)=0.95^2=0.9025 and CVw used for CI calculation will be higher than CVwT and CVwR?


Does that follow automatically? How? Why? CVw used for CI calculation is a dangerous term when we talk EMA?
I tried to look at Helmut's simulated data and could still not figure it out. :confused:

❝ I know, in this kind of example the CVwT and CVwR are extremely low, but the point is that theoretically the CVw observed in 2x2 study can be higher than 30% (so we can try replicate design with possible widening of BE acceptance criteria) but the CVwT and CVwR (which we don't know from 2x2) can be lower.


Imagine an old production process for R and a new process for T. This is real life.
Unit-to-unit variability for T is low, for R it is high. When you do a replicated study you see the effect directly on sWR, sWT. It is completely obscured for a 222BE trial. Hence it is hardly surprising that you'll be seeing little difference here and there between the RMSE from a 222BE trial and then the ones derived from the EMA method where replication is involved.

Pass or fail!
ElMaestro
zizou
★    

Plzeň, Czech Republic,
2017-07-09 04:13
(2476 d 21:33 ago)

(edited by zizou on 2017-07-09 12:50)
@ ElMaestro
Posting: # 17528
Views: 13,098
 

 CVwR ~ CVwT ~ CVw?

Dear Captain,
I had only an idea with no connection to reality.

❝ ❝ Theoretically I can imagine (no real) data of 2x2x4 replicate study - see following idea:

The idea was to prepare T data with an intra-subject CV close to zero, for example with a mean of 0.95 and almost zero standard deviation.
The R data have the same properties but are shifted to a different mean, for example 1.05. So I expected an intra-subject CV close to zero for the T data and also for the R data. But when we pool the T and R data, there will be larger differences in the data, so I expected a higher intra-subject CV for all (pooled) data than for T or R separately.
So the point was purely theoretical: the description of an example which provides the proof of a higher CVw.
Since Helmut simulated better data (thanks!), the description is not needed now.

❝ ❝ So GMR T/R of all "pooled" data will be close to 0.95/(1/0.95)=0.95^2=0.9025 and CVw used for CI calculation will be higher than CVwT and CVwR?

❝ Does that follow automatically? How? Why? CVw used for CI calculation is a dangerous term when we talk EMA?

❝ I tried to look at Helmut's simulated data and could still not figure it out. :confused:

The statements are valid only for the crazy example which I tried to describe swiftly by guesswork, so that doesn't follow automatically. Or maybe when CVwT = CVwR and the means of T and R differ, the CVw will probably be slightly higher? So the equality CVwT = CVwR = CVw with infinite precision holds only for a really special case of data. (added by editing)
Helmut's simulated data are much more normal than the data in my mind. So I suggest looking only at Helmut's data as an example of data where CVw is higher than CVwT and CVwR.

Best regards,
zizou

Note (to be deleted): typo error in Helmut's simulated data summary

❝ GLSM 0.9542 (T), 1.0480 (R), PE 0.9104.



Edit: THX, done. [Helmut]
The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz