partial AUCs [R for BE/BA]

posted by Helmut Homepage – Vienna, Austria, 2020-11-11 02:26 (17 d 17:41 ago) – Posting: # 22064
Views: 369

Dear all,

following this thread a 315 LOC [image] script at the end (a.k.a. Spaghetti Viennese).
Imputations according to the chosen trapezoidal rule are denoted in the output with @ and values used in the estimation of λz with * (only if the last concentration is missing or BQL). Agreement with results obtained in Phoe­nix / WinNonlin denoted by ✔️.

Using the data of the aforementioned thread (some of the defaults applied: no deviations from scheduled sampling times (internally: t <- s), linear-up/log-down trapezoidal, plot with linear y-axis, partial AUCs colored):

s <- c(0  , 0.5, 1,  1.25,  1.5,  1.75,  2,  2.25, 2.5,
       3  , 3.5, 4,  4.5 ,  5  ,  5.5 ,  6,  6.5 , 7 ,
       7.5, 8  , 9, 12   , 24  , 36   , 48, 72)          # scheduled sampling times
C <- c( 0  ,    0,  7.23, 10.1, 12.8, 17.6, 21.5, 18.6, 20.4,
       18.6, 18.6, 19.8 , 22.8, 27.3, 34.0, 40.0, 38.0, 37.3,
       36.0, 34.9, 35.3 , 28.3, 18.0, 11.1,  7.01, 3.03) # concentrations
#####################
# call the function #
#####################

x <- profile(s = s, C = C, cut.off = 4, end.time = 72, x.tick = 12)
print(x$data, row.names = FALSE); print(x$AUC, row.names = FALSE)

[image]

 sched.  time conc. rule  delta.AUC     AUC.t
   0.00  0.00  0.00        0.000000   0.00000
   0.50  0.50  0.00  lin   0.000000   0.00000
   1.00  1.00  7.23  lin   1.807500   1.80750
   1.25  1.25 10.10  lin   2.166250   3.97375
   1.50  1.50 12.80  lin   2.862500   6.83625
   1.75  1.75 17.60  lin   3.800000  10.63625
   2.00  2.00 21.50  lin   4.887500  15.52375
   2.25  2.25 18.60  log   5.003749  20.52750
   2.50  2.50 20.40  lin   4.875000  25.40250
   3.00  3.00 18.60  log   9.743073  35.14557
   3.50  3.50 18.60  lin   9.300000  44.44557
   4.00  4.00 19.80  lin   9.600000  54.04557
   4.50  4.50 22.80  lin  10.650000  64.69557
   5.00  5.00 27.30  lin  12.525000  77.22057
   5.50  5.50 34.00  lin  15.325000  92.54557
   6.00  6.00 40.00  lin  18.500000 111.04557
   6.50  6.50 38.00  log  19.495726 130.54130
   7.00  7.00 37.30  log  18.824458 149.36576
   7.50  7.50 36.30  log  18.398868 167.76462
   8.00  8.00 34.90  log  17.797706 185.56233
   9.00  9.00 35.30  lin  35.100000 220.66233
  12.00 12.00 28.30  log  95.013528 315.67586
  24.00 24.00 18.00  log 273.155180 588.83104
  36.00 36.00 11.10  log 171.277277 760.10831
  48.00 48.00  7.01  log 106.786791 866.89510
  72.00 72.00  3.03  log 113.880350 980.77545
 AUC.last  pAUC0-4 pAUC4-72
 980.7755 54.04557 926.7299
✔️


Two delayed sampling times introduced, plot suppressed.

t <- s
t[s == 4 ] <- t[s == 4 ] +  5/60
t[s == 72] <- t[s == 72] + 15/60
x <- profile(s = s, t = t, C = C, cut.off = 4, end.time = 72,
             type = "linlog", plot.it = FALSE)
print(x$data, row.names = FALSE); print(x$AUC, row.names = FALSE)

 sched.      time      conc. rule   delta.AUC     AUC.t
   0.00  0.000000  0.00000          0.0000000   0.00000
   0.50  0.500000  0.00000    lin   0.0000000   0.00000
   1.00  1.000000  7.23000    lin   1.8075000   1.80750
   1.25  1.250000 10.10000    lin   2.1662500   3.97375
   1.50  1.500000 12.80000    lin   2.8625000   6.83625
   1.75  1.750000 17.60000    lin   3.8000000  10.63625
   2.00  2.000000 21.50000    lin   4.8875000  15.52375
   2.25  2.250000 18.60000    log   5.0037492  20.52750
   2.50  2.500000 20.40000    lin   4.8750000  25.40250
   3.00  3.000000 18.60000    log   9.7430730  35.14557
   3.50  3.500000 18.60000    lin   9.3000000  44.44557
   4.00  4.000000 19.62857 @  lin   9.5571429  54.00272
   4.00  4.083333 19.80000    lin   1.6428571  55.64557
   4.50  4.500000 22.80000    lin   8.8750000  64.52057
   5.00  5.000000 27.30000    lin  12.5250000  77.04557
   5.50  5.500000 34.00000    lin  15.3250000  92.37057
   6.00  6.000000 40.00000    lin  18.5000000 110.87057
   6.50  6.500000 38.00000    log  19.4957257 130.36630
   7.00  7.000000 37.30000    log  18.8244577 149.19076
   7.50  7.500000 36.30000    log  18.3988677 167.58962
   8.00  8.000000 34.90000    log  17.7977058 185.38733
   9.00  9.000000 35.30000    lin  35.1000000 220.48733
  12.00 12.000000 28.30000    log  95.0135275 315.50086
  24.00 24.000000 18.00000    log 273.1551796 588.65604
  36.00 36.000000 11.10000    log 171.2772767 759.93331
  48.00 48.000000  7.01000    log 106.7867907 866.72010
  72.00 72.000000  3.05631 @  log 114.3058192 981.02592
  72.00 72.250000  3.03000    log   0.7607846 981.78671
 AUC.last  pAUC0-4 pAUC4-72
 981.7867 54.00272 927.0232
✔️


The script handles both delays and too early sampling – though I never understood why the latter can happen (see there).
If you have missings and/or BQLs, give them as NA in the C-vector. Original data set, concentration at 36 h “missing”.

x <- (..., type = "linlog")
 AUC.last  pAUC0-4 pAUC4-72
 982.4044 54.04557 928.3588
 ✔️


x <- (..., type = "lin")
 AUC.last  pAUC0-4 pAUC4-72
 1014.486 54.06125  960.425
✔️


With the lin-up/log-down rule results are close to the ones of the original data set (biases +0.17% and +0.18% for AUC.last and pAUC4-72), whereas with the linear trapezoidal you get relevant ones (+3.4% and +3.5%).

If you have NAs at the end of the profile, an extrapolation based on the “Best Fit” algorithm* is performed. Original data set, concentration at 72 h “BQL”, part of the output:

 sched.  time      conc. rule  delta.AUC     AUC.t
  12.00 12.00 28.30000 *  log  95.013528 315.67586
  24.00 24.00 18.00000 *  log 273.155180 588.83104
  36.00 36.00 11.10000 *  log 171.277277 760.10831
  48.00 48.00  7.01000 *  log 106.786791 866.89510
  72.00 72.00  2.75214 @  log 109.297497 976.19260
 AUC.last  pAUC0-4 pAUC4-72
 866.8951 54.04557  922.147
✔️

Not that bad. Bias –9.2% for the estimated C72 and –0.49% for pAUC4-72. Show how the fitting was done:

print(x$fit, row.names = FALSE)
   method n lower upper   intcpt   lambda.z   Rsq.adj      C72
 Best Fit 4    12    48 3.814379 0.03891666 0.9997764 2.752142
✔️

However, the algo is “greedy” and regularly fails with flat profiles and for multiphasic products. You can also specify your own limits:

x <- profile(..., lower = 24, upper = 48)
 sched.  time      conc. rule  delta.AUC     AUC.t
  12.00 12.00 28.30000    log  95.013528 315.67586
  24.00 24.00 18.00000 *  log 273.155180 588.83104
  36.00 36.00 11.10000 *  log 171.277277 760.10831
  48.00 48.00  7.01000 *  log 106.786791 866.89510
  72.00 72.00  2.71919 @  log 108.742376 975.63748
 AUC.last  pAUC0-4 pAUC4-72
 866.8951 54.04557 921.5919
✔️
 method n lower upper   intcpt   lambda.z   Rsq.adj      C72
   User 3    24    48 3.829436 0.03929309 0.9995748 2.719189
✔️

For the “Best Fit” algo at least three concentrations after tmax are required. At least two concentrations are required for own limits (where tmax may be included).
If you have increasing concentrations giving a negative λz, the result will be nonsense and a warning thrown.

BTW, in all software packages I know of points are connected by straight lines, regardless whether the y-axis is linear or logarithmic. Distribution & elimination are first order processes (exponentially decreasing). The 4th plot demonstrates why the linear trapezoidal rule has a positive bias. :-D

[image] [image]

[image] [image]

I like this one with two missings. :-D

[image]

Bias of pAUC4-72 +12.7% (with the lin-up/log-down only +1.2%)




impute <- function(x, y, x0, type = "linlog") {
  # inter- or extrapolate at x0 for given vectors x, y
  if (any(length(x) != 2, length(y) != 2))
    stop("Exactly two times and concentrations required.")
  if (missing(x0)) stop("x0 must be given.")
  if (x0 < 0) stop("x0 must be >= 0.")
  if (!type %in% c("linlog", "lin"))
    stop("type has to be 'linlog' or 'lin'.")
  if (type == "linlog") {
    if (y[2] < y[1]) {  # log-down
      y0 <- exp((log(y[1])*(x[2]-x0)+log(y[2])*(x0-x[1]))/(x[2]-x[1]))
    } else {            # linear-up
      y0 <- (y[1]*(x[2]-x0)+y[2]*(x0-x[1]))/(x[2]-x[1])
    }
  } else {              # linear
    y0 <- (y[1]*(x[2]-x0)+y[2]*(x0-x[1]))/(x[2]-x[1])
  }
  return(y0)
}

lambda.z <- function(x, end.time, lower = NA, upper = NA) {
  # "Best Fit" algorithm
  # User limits specified for scheduled times (more convenient)

  if (any(is.na(lower), is.na(upper))) {
    method <- "Best Fit"
    if (nrow(x) < 3)
      message("At least 3 times/concentrations after tmax required for extrapolation.")
  } else {
    if (upper <= lower)
      stop("upper has to be > lower.")
    x <- x[x$s >= lower & x$s <= upper, ]
    if (nrow(x) < 2)
      message("At least 2 times/concentrations required for extrapolation.")
    method <- "User"
  }
  l.z <- intcpt <- Rsq.adj <- numeric()
  m   <- NULL
  if (method == "Best Fit") {
    j <- 2
    k <- 1
    repeat { # backwards from last to first after tmax
      tmp        <- x[(nrow(x)-j):(nrow(x)), ]
      if (tmp$t[1] == x$t[x$C == max(x$C, na.rm = TRUE)])
        break # don't use tmax
      m          <- lm(log(tmp$C) ~ tmp$t)
      l.z[k]     <- as.numeric(-coef(m)[2])
      intcpt[k]  <- as.numeric(coef(m)[1])
      Rsq.adj[k] <- summary(m)[["adj.r.squared"]]
      j <- j+1
      k <- k+1
      if (k >= 3) {
        if ((Rsq.adj[k-1] < Rsq.adj[k-2]) |
           ((Rsq.adj[k-1] - Rsq.adj[k-2]) <= 0.0001)) break
      }
    }
  } else {
    tmp     <- x
    m       <- lm(log(tmp$C) ~ tmp$t)
    l.z     <- as.numeric(-coef(m)[2])
    intcpt  <- as.numeric(coef(m)[1])
    Rsq.adj <- summary(m)[["adj.r.squared"]]
  }
  # prepare results
  if (is.null(m)) { # fit failed
    res <- NA
  } else {
    if (method == "Best Fit") {
      n        <- nrow(tmp)-1
      lower    <- signif(tmp$t[2], 5)
      ifelse(length(intcpt) > 1,
        intcpt <- intcpt[length(intcpt)-1], intcpt <- intcpt)
      ifelse (length(l.z) > 1,
        lambda.z <- l.z[length(l.z)-1], lambda.z <- l.z)
      ifelse (length(Rsq.adj) > 1,
        Rsq.adj <- Rsq.adj[length(Rsq.adj)-1], Rsq.adj <- Rsq.adj)
    } else {
      n        <- nrow(tmp)
      lower    <- signif(tmp$t[1], 5)
      lambda.z <- l.z
    }
    upper <- signif(tmp$t[nrow(tmp)], 5)
    res   <- data.frame(method = method, n = n,
                        lower = lower, upper = upper,
                        intcpt = intcpt, lambda.z = lambda.z,
                        Rsq.adj = Rsq.adj)
    res$est       <- exp(res$intcpt-res$lambda.z*end.time)
    names(res)[8] <- paste0("C", end.time)
    if (res$lambda.z <= 0)
      warning("No reasonable extrapolation.")
    }
    if (method == "Best Fit" & is.null(m))
      message("'Best Fit' failed, try own limits.")
  return(res)
}

AUC.calc <- function(x, type = "linlog") {
  x   <- as.data.frame(lapply(x, as.numeric))
  res <- cbind(x, rule = "", d = 0, AUC.t = 0)
  d.t <- c(0, diff(x$t))         # sampling intervals
  d   <- numeric(nrow(x))        # AUC within each sampling interval
  for (j in 1:(nrow(x)-1)) {
    if (type == "linlog") {      # linear-up / log-down
      if (x$C[j+1] >= x$C[j]) {  # up
        res$rule[j+1] <- "lin"
        res$d[j+1] <- 0.5*d.t[j+1]*(x$C[j+1]+x$C[j])
      } else {                   # down
        res$rule[j+1] <- "log"
        res$d[j+1] <- d.t[j+1]*(x$C[j+1]-x$C[j])/log(x$C[j+1]/x$C[j])
      }
    } else {                     # linear
      res$rule[j+1] <- "lin"
      res$d[j+1] <- 0.5*d.t[j+1]*(x$C[j+1]+x$C[j])
    }
  }
  res$AUC.t     <- cumsum(res$d) # cumulative AUC
  names(res)[4] <- "delta.AUC"
  return(res[, 3:5])
}

profile <- function(s, t, C, t.unit = "h", C.unit = "ng/mL",
                    col1 = "red",  bg1 = rgb(1    , 0.533, 0.533, 0.36),
                    col2 = "blue", bg2 = rgb(0.533, 0.533, 1    , 0.36),
                    cut.off = 0, end.time, lower = NA, upper = NA,
                    type = "linlog", y.ax = "lin", x.tick = NA,
                    plot.it = TRUE, plot.area = TRUE) {
  # Only one cut-off and end time supported so far
  # RGB background colors; last component is the alpha-channel (transparency of
  # points and areas). Each component has to be within [0, 1].
  # lower:     User specified lower limit for lambda.z estimation (optional)
  # upper:     User specified uper limit for lambda.z estimation (optional)
  # type:      "linlog" for linear-up / log-down trapezoidal method (default)
  #            "lin" for linear trapezoidal method (not recommended)
  # y.ax:      "lin" linear y-axis (default)
  #            "log" logarithmic y-axis
  # x.tick:    Spacing of x-axis (optional)
  # plot.it:   Boolean (should data be plotted?), defaults to TRUE
  # plot.area: Boolean (should areas be plotted?), defaults to TRUE

  if (missing(t)) t <- s
  if (length(cut.off) > 1)
    stop ("Currently only one cut-off time supported.")
  if (missing(end.time))
    stop("end.time has to be given.")
  if (!type %in% c("linlog", "lin"))
    stop("type has to be 'linlog' or 'lin'.")
  if (!y.ax %in% c("lin", "log"))
    stop("y.ax has to be 'lin' or 'log'.")
  if (is.na(x.tick)) x.tick <- unique(diff(pretty(t)))
  data    <- data.frame(s = s, t = t, C = C)
  data    <- data[complete.cases(data), ]    # remove rows with NAs
  AUC     <- data                            # result data.frame
  devs    <- data$t[which(data$t != data$s)] # deviating sampling times
  imp     <- data$s[which(data$t %in% devs)] # scheduled ones
  if (length(imp) > 0) {
    imputed <- data.frame(s = imp, t = imp, C = NA)
    AUC     <- rbind(AUC, imputed)           # add row for imputations
    AUC     <- cbind(AUC, imp = "")          # add column for imputations
    AUC     <- AUC[with(AUC, order(s, t)), ]
    for (j in 1:nrow(imputed)) {
      if (imp[j] < tail(AUC$t, 1)) { # all but too early last sample
        ind <- which(AUC$t == imp[j])+c(-1, +1)
        x   <- AUC$t[ind]
        x0  <- imp[j]
        y   <- AUC$C[ind]
        AUC$C[which(AUC$t == imp[j])]   <- impute(x, y, x0, type = type)
        AUC$imp[which(AUC$t == imp[j])] <- "*"
      } else {                       # special case
        tmp <- AUC[(nrow(AUC)-2):(nrow(AUC)-1), 1:3]
        x   <- tmp$t
        x0  <- tmp$s[2]
        y   <- tmp$C
        AUC$C[which(AUC$t == imp[j])]   <- impute(x, y, x0, type = type)
        AUC$imp[which(AUC$t == imp[j])] <- "*"
      }
    }
  }
  est <- FALSE
  if (tail(AUC$t, 1) < end.time) {   # NA concentration at end.time
    tmp <- lambda.z(data, end.time, lower, upper)
    if (any(!is.na(tmp))) {          # only if successful
      if (!"imp" %in% colnames(AUC)) AUC <- cbind(AUC, imp = "")
      AUC <- rbind(AUC, c(s = end.time, t = end.time,
                          C = tmp[[8]], imp = "*"))
      est <- TRUE
    }
  }
  AUCs            <- AUC.calc(AUC[, 2:3], type = type)
  AUC             <- cbind(AUC, AUCs, pAUC = NA)
  early           <- which(AUC$t == cut.off)
  AUC$pAUC[early] <- AUC$AUC.t[early]
  late            <- which(AUC$t == end.time)
  AUC$pAUC[late]  <- AUC$AUC.t[late]-AUC$pAUC[early]
  if (plot.it) {
    ifelse (y.ax == "lin", ylog <- "", ylog <- "y")
    ifelse (type == "linlog",
      main <- "linear-up / logarithmic-down",
      main <- "linear")
    main <- paste(main, "trapezoidal rule")
    t <- as.numeric(AUC$t) # makes our life easier
    C <- as.numeric(AUC$C) # (are chars in the data.frame)
    suppressWarnings(      # for zeros in log-plot
      plot(t, C, type = "n", log = ylog,
           xlab = paste0("time (", t.unit, ")"),
           ylab = paste0("concentration (", C.unit, ")"),
           axes = FALSE, main = main, font.main = 1, cex.main = 1.15))
    if (is.na(x.tick)) {
      axis(1)
    } else {
      axis(1, at = seq(min(t), max(t), x.tick))
    }
    if (y.ax == "lin") {
      abline(h = pretty(C),
             v = seq(min(t), max(t), x.tick),
             lty = 3, col = "lightgrey")
      axis(2, las = 1)
      y0 <- 0
    } else {
      ax.t <- axTicks(2)
      grid(nx = NA, ny = NULL, equilogs = FALSE)
      abline(v = seq(min(t), max(t), x.tick),
             lty = 3, col = "lightgrey")
      axis(2, at = ax.t, las = 1)
      y0 <- min(ax.t)
    }
    if (cut.off > 0) {
      abline(v = cut.off, lty = 2)
      axis(1, at = cut.off, labels = cut.off)
    }
    abline(v = end.time, lty = 2)
    box()
    if (est) {
      fit.x <- seq(tmp$lower, tmp$upper, length.out = 200)
      fit.y <- exp(tmp$intcpt-tmp$lambda.z*fit.x)
      lines(fit.x, fit.y, col = "red", lwd = 4)
    }
    points(data$t[data$t < cut.off], data$C[data$t < cut.off],
           pch = 21, cex = 1.6, col = col1, bg = bg1)
    points(data$t[data$t >= cut.off], data$C[data$t >= cut.off],
           pch = 21, cex = 1.6, col = col2, bg = bg2)
    for (j in 1:(nrow(AUC)-1)) {
      if (C[j+1] >= C[j]) {
        if (t[j+1] <= cut.off) {
          col = col1
          bg = bg1
        } else {
          col = col2
          bg = bg2
        }
        segments(t[j], C[j], t[j+1], C[j+1],
                 col = col, lwd = 2)
        if (plot.area) {
          polygon(c(t[j:(j+1)], t[(j+1):j], t[j]),
                  c(y0, y0, C[(j+1):j], y0), col = bg, border = NA)
        }
      } else {
        ts <- seq(t[j], t[j+1], length.out = 100)
        if (type == "linlog") {
          m  <- lm(log(C[j:(j+1)]) ~ t[j:(j+1)])
          Cs <- exp(coef(m)[1]+coef(m)[2]*ts)
          lines(ts, Cs, col = col, lwd = 2)
        } else {
          m  <- lm(C[j:(j+1)] ~ t[j:(j+1)])
          Cs <- coef(m)[1]+coef(m)[2]*ts
          lines(ts, Cs, col = col, lwd = 2)
        }
        if (plot.area) {
          polygon(c(rev(ts), ts),
                  c(rep(y0, length(ts)), rep(Cs)), col = bg, border = NA)
        }
      }
    }
    rug(data$s, ticksize = -0.01)       # actual sampling times
    rug(data$t[data$t != data$s], ticksize = 0.03,
        side = 3, lwd = 2, col = "red") # deviations from scheduled times
  }
  # cosmetics
  if ("imp" %in% colnames(AUC)) {
    ind  <- which(AUC$imp == "*")
    rows <- 1:nrow(AUC)
    keep <- rows[!rows %in% ind]
    AUC$C[keep] <- sprintf("%.5f  ", as.numeric(AUC$C[keep]))
    AUC$C[ind]  <- sprintf("%.5f @", as.numeric(AUC$C[ind]))
  }
  pAUCs <- AUC$pAUC[!is.na(AUC$pAUC)]
  if ("imp" %in% colnames(AUC)) {
    if (AUC$imp[nrow(AUC)] == "*") {
      AUC.last <- AUC$AUC.t[nrow(AUC)-1]
    } else {
      AUC.last <- AUC$AUC.t[nrow(AUC)]
    }
  } else {
    AUC.last <- AUC$AUC.t[nrow(AUC)]
  }
  AUCs <- data.frame(AUC.last = AUC.last, v1 = pAUCs[1], v2 = pAUCs[2])
  names(AUCs)[2:3] <- c(paste0("pAUC0-", cut.off),
                        paste0("pAUC", cut.off, "-", end.time))
  AUC  <- AUC[, -which(names(AUC) %in% c("imp", "pAUC"))] # no more needed
  if (est)
  AUC$C[as.numeric(AUC$t) >= tmp$lower &
        as.numeric(AUC$t) <= tmp$upper] <- sprintf("%.5f *",
        as.numeric(AUC$C[as.numeric(AUC$t) >= tmp$lower &
        as.numeric(AUC$t) <= tmp$upper]))
  names(AUC)[1:3] <- c("sched.", "time", "conc.")
  AUC  <- cbind(as.data.frame(lapply(AUC[1:2], as.numeric)),
                AUC[, 3:ncol(AUC)])
  if (!est) {
    res <- list(data = AUC, AUC = AUCs)
  } else {
    res <- list(data = AUC, AUC = AUCs, fit = tmp)
  }
  return(res)
}


Dif-tor heh smusma 🖖
Helmut Schütz
[image]

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Complete thread:

Activity
 Admin contact
21,216 posts in 4,427 threads, 1,482 registered users;
online 14 (0 registered, 14 guests [including 7 identified bots]).
Forum time: Saturday 20:08 CET (Europe/Vienna)

Science may be described as the art of systematic over-simplification –
the art of discerning what we may with advantage omit.    Karl R. Popper

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz
HTML5