Bioequivalence and Bioavailability Forum • Bust the Buster

Bust the Buster [GxP / QC / QA]

posted by Helmut – Vienna, Austria, 2022-05-26 17:32 (1480 d 20:58 ago) – Posting: # 23023
Views: 19,916

Dear Ohlbe,

I tried to reproduce the ‘Buster routines’ of the paper mentioned above. As usual the [image]

script at the end. Sorry, 200+ lines of code.

This is what I got for my manipulated data set:

A
[image]
Like the paper’s Fig. 1.
We see that for the Test the deviations from the mean increase at the end.
However, statistically the trend is not significant.

B
[image]
Like the paper’s Fig. 2.
For the reference all is good. As expected cause we use the original data.

C
[image]
OK – but the ratios start to rise (by chance) before the manipulation starts with subject 60.
Not convinced whether we can eye-ball it.

D
[image]
Like the paper’s Fig. 3.
Possibly better. The original study was properly powered and by chance would pass with fewer subjects.
After the manipulation it needs almost the complete data set to pass.

E
[image]
Like the paper’s Fig. 5.
Like in A we see rising residuals at the end. However, statistically the trend is not significant.

F
[image]
Like the paper’s Fig. 4.
No clue.

G
[image]
Only for completeness.

For A–C we need only the values of T and R and the subjects (more precisely: in the order of analyses). For the others additionally the randomization.
Potentially A, C, D (and E?) are useful. I have some doubts whether a ‘mild’ manipulation () like in my example can reliably be detected. You can try yet another data set where I multiplied T in the first half by 0.8 and in the second half by 1.25. Then the manipulation is obvious in C, D, and E (trend still not significant; p = 0.1677).

Unfortunately the data set BUa627311.csv is not accessible at doi:10.1016/j.ejps.2020.105595. I suspect that the manipulation was extreme.

library(PowerTOST) # Lazy: convert MSE to CV and back library(randtests) # Provides Wald-Wolfowitz runs test path <- "your.path" file <- "yourdata.csv" url <- NULL # alternatively a valid url enclosed in single of double quotes # url <- "https://bebac.at/downloads/Post23021a.csv" # My original data # url <- "https://bebac.at/downloads/Post23021b.csv" # Manipulated data # Required case-sensitive column names (in any order): # Subject, Sequence, Treatment, Period, Cmax sep <- "," # Column separator: Inspect the file; alternatively "\t" or ";" dec <- "." # Decimal separator; alternatively "," if (sep == dec) stop("sep and dec must must different!") com <- "#" # For eventual comments, sometimes found in the header na <- c("NA", "ND", ".", "", "Missing") # Will be set to NA if (sep == dec) stop("sep and dec must must different!") if (is.null(url)) { # Read local file if (!substr(path, nchar(path), nchar(path)) == "/") path <- paste0(path, "/") # Add trailing forward slash even on Windows data <- read.csv(file = paste0(path, file), sep = sep, dec = dec, quote = "", strip.white = TRUE, comment.char = com, na.strings = na) } else { data <- read.csv(file = url, sep = sep, dec = dec, quote = "", strip.white = TRUE, comment.char = com, na.strings = na) } subjects <- unique(data$Subject) n <- length(subjects) # If there are other columns in the file, drop them modeling <- FALSE modeling <- c("Subject", "Sequence", "Treatment", "Period", "Cmax") if (sum(names(data) %in% modeling) < 5) stop("Check column names!") data <- data[, modeling] modeling <- TRUE if (!modeling) { keep <- c("Subject", "Treatment", "Cmax") if (sum(names(data) %in% keep) < 3) stop("Check column names!") data <- data[, keep] } # Difference of Cmax from mean (both in log-scale) delta.T <- log(data$Cmax[data$Treatment == "T"]) - mean(log(data$Cmax[data$Treatment == "T"])) delta.R <- log(data$Cmax[data$Treatment == "R"]) - mean(log(data$Cmax[data$Treatment == "R"])) # Wald-Wolfowitz runs test of randomness for continuous data # alternative = "left.sided" tests for a trend rt.T <- runs.test(delta.T, alternative = "left.sided") rt.T <- c(paste(rt.T$runs, "runs"), bquote(italic(p) == . (sprintf("%.4f", rt.T$p.value)))) rt.R <- runs.test(delta.R, alternative = "left.sided") rt.R <- c(paste(rt.R$runs, "runs"), bquote(italic(p) == . (sprintf("%.4f", rt.R$p.value)))) # T/R-ratios: individual and cumulative with time Cmax.ratio <- data$Cmax[data$Treatment == "T"] / data$Cmax[data$Treatment == "R"] Cmax.cum <- numeric(n) for (i in 1:n) { Cmax.cum[i] <- prod(Cmax.ratio[1:i])^(1/i) } if (modeling) { # Only if we have Sequence and Period in the data res <- data.frame(subjects = NA_integer_, PE = rep(NA_real_, n-1), lower = NA_real_, upper = NA_real_, MSE = NA_real_, CV = NA_real_, power = NA_real_) # Factorize columns (effects) for the linear model facs <- c("Subject", "Sequence", "Treatment", "Period") data[facs] <- lapply(data[facs], factor) # Find the first occasion where we have at least one subject # in both sequences (otherwise, the model would no work) seqs <- character() for (subj in seq_along(subjects)) { seqs <- c(seqs, unique( as.character(data$Sequence[data$Subject == subjects[subj]]))) if (length(unique(seqs)) > 1) break } ow <- options("digits") options(digits = 12) # more digits for anova on.exit(ow) # ensure that options are reset if an error occurs for (i in subj:n) { tmp <- head(data, i * 2) res$subjects[i-1] <- length(unique(tmp$Subject)) # Acc. to GLs: Nested model is bogus; Subject works as well m <- lm(log(Cmax) ~ Sequence + Subject %in% Sequence + Period + Treatment, data = tmp) res$PE[i-1] <- exp(coef(m)[["TreatmentT"]]) res[i-1, 3:4] <- as.numeric(exp(confint(m, "TreatmentT", level = 1 - 2 * 0.05))) res$MSE[i-1] <- anova(m)[["Residuals", "Mean Sq"]] res$CV[i-1] <- mse2CV(res$MSE[i-1]) res$power[i-1] <- suppressMessages( power.TOST(CV = res$CV[i-1], theta0 = res$PE[i-1], n = length(unique(tmp$Subject)))) } res <- res[complete.cases(res), ] # Raw residuals of T pred <- suppressWarnings( exp(predict(m, newdata = data[data$Treatment == "T", 1:4]))) # Acc. to the paper. Commonly: residual = predicted – observed resid <- data[data$Treatment == "T", "Cmax"] - pred rt.resid <- runs.test(resid, alternative = "left.sided") rt.resid <- c(paste(rt.resid$runs, "runs"), bquote(italic(p) == . (sprintf("%.4f", rt.resid$p.value)))) # Studentized residuals stud <- rstudent(m) # Two values with opposite signs; we need only one of them stud <- stud[c(TRUE, FALSE)] # Uncomment the next line to show PE, CI, MSE, CV, post hoc power # print(round(res, 5)) } # Prepare plots x.tick <- pretty(1:n) x.tick <- c(1, x.tick[2:(length(x.tick)-2)], n) dev.new(width = 4.5 * sqrt(2), height = 4.5, record = TRUE) op <- par(no.readonly = TRUE) par(mar = c(3.2, 3.3, 0.1, 0.1), cex.axis = 0.9, mgp = c(2, 0.5, 0), ljoin = "mitre", lend = "square", ask = TRUE) ############################################### # Difference of log(Cmax) from geometric mean # ############################################### # ylim common to both treatments ylim <- c(-1, +1) * max(abs(range(c(delta.T, delta.R)))) y.ax <- sprintf("%+.1f", pretty(ylim)) y.ax[y.ax == "+0.0"] <- "±0.0" ######## # Test # ######## bp <- barplot(delta.T, plot = FALSE) plot(bp, bp, ylim = ylim, type = "n", axes = FALSE, frame.plot = TRUE, xlab = "Subject", ylab = expression(log(italic(C)[max[T]])*" – mean [ "* log(italic(C)[max[T]])*" ]")) barplot(delta.T, ylim = ylim, las = 1, col = "lightskyblue", border = NA, axes = FALSE, axisnames = FALSE, ylab = "", add = TRUE) axis(1, at = bp[x.tick], labels = x.tick) axis(2, at = pretty(ylim), labels = y.ax, las = 1) legend("top", legend = rt.T, inset = 0.02, , box.lty = 0, cex = 0.9, x.intersp = 0) ############# # Reference # ############# bp <- barplot(delta.R, plot = FALSE) plot(bp, bp, ylim = ylim, type = "n", axes = FALSE, frame.plot = TRUE, xlab = "Subject", ylab = expression(log(italic(C)[max[R]])*" – mean [ "* log(italic(C)[max[R]])*" ]")) barplot(delta.R, ylim = ylim, las = 1, col = "lightskyblue", border = NA, axes = FALSE, axisnames = FALSE, ylab = "", add = TRUE) axis(1, at = bp[x.tick], labels = x.tick) axis(2, at = pretty(ylim), labels = y.ax, las = 1) legend("top", legend = rt.R, inset = 0.02, , box.lty = 0, cex = 0.9, x.intersp = 0) ######################### # Individual T/R-ratios # ######################### plot(1:n, Cmax.ratio, type = "n", log = "y", axes = FALSE, xlab = "Subject", ylab = expression(italic(C)[max[T]]*" / "*italic(C)[max[R]])) y.tick <- sort(c(0.8, 1.25, axTicks(side = 2, log = TRUE))) axis(1, at = x.tick) axis(2, at = y.tick, las = 1) grid(nx = NA, ny = NULL) abline(v = x.tick, lty = 3, col = "lightgrey") abline(h = c(0.8, 1, 1.25), lty = c(2, 1, 2)) legend("topleft", legend = c("Individual T/R-ratio", "Cumulative T/R-ratio"), pch = c(21, NA), lty = c(0, 1), lwd = c(1, 2), inset = 0.02, col = c(rep("blue", 2)), pt.bg = "#AAAAFF80", box.lty = 0, pt.cex = 1, bg = "white", seg.len = 2.5, cex = 0.9, y.intersp = 1.25) box() lines(1:n, Cmax.cum, col = "blue", lwd = 2, type = "s") points(1:n, Cmax.ratio, pch = 21, col = "blue", bg = "#AAAAFF80") if (modeling) { ################ # PE and 90 CI # ################ ylim <- range(c(res$lower, res$upper), na.rm = TRUE) plot(1:n, Cmax.ratio, type = "n", ylim = ylim, log = "y", axes = FALSE, xlab = "Subject", ylab = "Point Estimate and 90% Confidence Interval") axis(1, at = x.tick) axis(2, at = y.tick, las = 1) grid(nx = NA, ny = NULL) abline(v = x.tick, lty = 3, col = "lightgrey") abline(h = c(0.8, 1, 1.25), lty = c(2, 1, 2)) legend("bottomright", legend = c("Point Estimate", "Confidence Limits"), lty = c(1, 1), lwd = c(2, 1), bg = "white", box.lty = 0, inset = 0.02, col = rep("blue", 2), seg.len = 2.5, cex = 0.9, y.intersp = 1.25) box() lines(res$subject, res$PE, col = "blue", lwd = 2, type = "s") lines(res$subject, res$lower, col = "blue", type = "s") lines(res$subject, res$upper, col = "blue", type = "s") ####### # MSE # ####### plot(res$subject, res$MSE, type = "n", xlim = c(1, n), axes = FALSE, xlab = "Subject", ylab = expression(italic(MSE))) axis(1, at = x.tick) axis(2, las = 1) grid(nx = NA, ny = NULL) abline(v = x.tick, lty = 3, col = "lightgrey") box() lines(res$subject, res$MSE, col = "blue", lwd = 2, type = "s") ############################# # Raw model residuals for T # ############################# ylim <- c(-1, +1) * max(abs(range(resid))) y.ax <- sprintf("%+.1f", pretty(ylim)) y.ax[y.ax == "+0.0"] <- "±0.0" bp <- barplot(resid, plot = FALSE) plot(bp, resid, ylim = ylim, type = "n", axes = FALSE, frame.plot = TRUE, xlab = "Subject", ylab = "Raw model residual (T)") barplot(resid, ylim = ylim, col = "lightskyblue", border = NA, axes = FALSE, axisnames = FALSE, ylab = "", add = TRUE) axis(1, at = bp[x.tick], labels = x.tick) axis(2, at = pretty(ylim), labels = y.ax, las = 1) legend("top", legend = rt.resid, inset = 0.02, , box.lty = 0, cex = 0.9, x.intersp = 0) ############################### # Studentized model residuals # ############################### ylim <- c(-1, +1) * max(abs(range(stud))) y.ax <- sprintf("%+.1f", pretty(ylim)) y.ax[y.ax == "+0.0"] <- "±0.0" bp <- barplot(stud, plot = FALSE) plot(bp, stud, ylim = ylim, type = "n", axes = FALSE, frame.plot = TRUE, xlab = "Subject", ylab = "Studentized model residual") abline(h = c(-1, +1) * qnorm(0.975), lty = 3, col = "lightgrey") barplot(stud, ylim = ylim, col = "lightskyblue", border = NA, axes = FALSE, axisnames = FALSE, ylab = "", add = TRUE) axis(1, at = bp[x.tick], labels = x.tick) axis(2, at = pretty(ylim), labels = y.ax, las = 1) } par(op)

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Complete thread:

Another two... Ohlbe 2021-09-16 23:41 [GxP / QC / QA]
- Another two... ElMaestro 2021-09-17 08:42
  - Dates Ohlbe 2021-09-17 10:15
  - Some more info Ohlbe 2021-09-17 20:28
    - EU Article 31 referral started Ohlbe 2022-01-24 17:47
- Cheating Helmut 2021-09-17 16:50
  - Increased variability Ohlbe 2021-09-17 17:56
    - Increased variability ElMaestro 2021-09-17 18:51
      - Visualization Helmut 2021-09-18 15:26
- Another two... jag009 2021-09-19 09:18
- Another two... ElMaestro 2022-05-24 12:00
  - Blind monitors or greedy sponsors? Helmut 2022-05-24 12:51
    - Blind monitors or greedy sponsors? Ohlbe 2022-05-24 14:21
      - Blind monitors or greedy sponsors? Helmut 2022-05-24 16:35
    - Blind monitors or greedy sponsors? ElMaestro 2022-05-25 08:11
      - (Cumulative) T/R-ratio vs. time Helmut 2022-05-25 09:19
        
        Sponsors and CRO selection Ohlbe 2022-05-25 10:53
        
        I still think that T/R-ratios are useful Helmut 2022-05-25 12:04
        
        T/R-ratios are useful Ohlbe 2022-05-25 14:40
        
        T/R-ratios are useful Helmut 2022-05-25 15:35
        Bust the BusterHelmut 2022-05-26 15:32
        
        complicate the assessor's life mittyri 2022-05-26 18:28
        
        complicate the assessor's life PharmCat 2022-05-27 09:57
        complicate the assessor's life Helmut 2022-05-27 10:11
        
        Thanks for Busting the Buster sameep 2022-05-31 08:34
        
        Thanks for Busting the Buster Helmut 2022-05-31 15:22
        
        A small point for the code sameep 2022-06-01 13:17
        
        A small point for the code Helmut 2022-06-01 14:11