Bioequivalence and Bioavailability Forum • Broken-stick regression

Broken-stick regression [Bioanalytics]

posted by Helmut – Vienna, Austria, 2020-08-04 20:04 (1358 d 07:53 ago) – Posting: # 21822
Views: 5,659

Hi Shuanghe,

❝ I sent 2 emails with csv and excel file, the latter has much more columns in case you needed it.

THX!

❝ ❝ An idea: Set limits based on the one batch of the method validation which was performed in the anticipated maximum size of a run.

❝

❝ Damn, that's a nice idea. I didn't thought about it during discussion with CRO about the modification of their SOP.

We can perform a broken-stick regression:

library(segmented) data <- NULL # nullify eventual previous one data <- read.csv("https://bebac.at/downloads/Post21469.csv", sep = ",", dec = ".", stringsAsFactors = TRUE) data <- data[!is.na(data$IS.area), ] # remove ones without IS attach(data) set.seed(123456) mod <- lm(IS.area ~ injection, data = data) seg <- segmented(mod) con <- predict(seg, se.fit = TRUE, interval = "confidence") pre <- suppressWarnings(predict(seg, se.fit = TRUE, interval = "prediction")) summary(seg); confint(seg); slope(seg, digits = 9); intercept(seg, digits = 9)$injection if (is.null(attr(dev.list(), "names"))) windows(width = 6, height = 5) op <- par(no.readonly = TRUE) par(mar=c(4, 4, 0, 0.3) + 0.1, cex.axis = 1, cex.lab = 1.25) col <- c("blue", "red", "darkgreen", "magenta") bg <- c("#0000FF80", "#FF000080", "#00900080", "#FF00FF80") plot(seg, ylim = range(IS.area, na.rm = TRUE), xlab = "Injection Number", ylab = "IS Area", lwd = 3, col = "blue") grid() abline(v = confint(seg), col = "blue", lty = c(1, 3, 3)) text(confint(seg)[2], max(IS.area, na.rm = TRUE), labels = round(confint(seg)[2]), pos = 2, cex = 0.9) text(confint(seg)[1], min(IS.area, na.rm = TRUE), labels = round(confint(seg)[1]), pos = 4, cex = 0.9) text(confint(seg)[3], max(IS.area, na.rm = TRUE), labels = round(confint(seg)[3]), pos = 4, cex = 0.9) lines(as.numeric(dimnames(pre$fit)[[1]]), con$fit[, "lwr"], col = "blue") lines(as.numeric(dimnames(pre$fit)[[1]]), con$fit[, "upr"], col = "blue") lines(as.numeric(dimnames(pre$fit)[[1]]), pre$fit[, "lwr"], col = "blue", lty = 2) lines(as.numeric(dimnames(pre$fit)[[1]]), pre$fit[, "upr"], col = "blue", lty = 2) points(injection[type == "C"], IS.area[type == "C"], pch = 21, col = col[1], bg = bg[1], cex = 1.35) points(injection[type == "B"], IS.area[type == "B"], pch = 21, col = col[2], bg = bg[2], cex = 1.35) points(injection[type == "Q"], IS.area[type == "Q"], pch = 21, col = col[3], bg = bg[3], cex = 1.35) points(injection[type == "U"], IS.area[type == "U"], pch = 21, col = col[4], bg = bg[4], cex = 1.35) par(op)

Which gives

***Regression Model with Segmented Relationship(s)*** Call: segmented.lm(obj = mod) Estimated Break-Point(s): Est. St.Err psi1.injection 107.392 1.054 Meaningful coefficients of the linear terms: Estimate Std. Error t value Pr(>|t|) (Intercept) 458563.17 3100.28 147.910 < 2e-16 *** injection 151.47 49.54 3.058 0.00269 ** U1.injection 5439.53 221.22 24.589 NA --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 15150 on 135 degrees of freedom Multiple R-Squared: 0.9462, Adjusted R-squared: 0.945 Convergence attained in 2 iter. (rel. change 0) Est. CI(95%).low CI(95%).up psi1.injection 107.392 105.306 109.477 $injection Est. St.Err. t value CI(95%).l CI(95%).u slope1 151.4664 49.53669 3.057662 53.49811 249.4348 slope2 5591.0012 215.60080 25.932191 5164.60917 6017.3932 Est. intercept1 458563.2 intercept2 -125596.8

The fit is fine, you can try

psi <- round(confint(seg)[1]) limb1 <- data[(data$injection <= psi), ] limb2 <- data[(data$injection > psi), ] st.r <- rstudent(seg) inj <- as.numeric(names(st.r)) ylim <- c(-1, 1)*max(abs(st.r)) # symmetric col <- c("blue", "red") bg <- c("#0000FF80", "#FF000080") plot(inj, st.r, type = "n", ylim = ylim, xlab = "Injection", ylab = "Studentized residual") grid(); box(); abline(h = 0) rug(inj, ticksize = 0.015); rug(st.r, side = 2, ticksize = 0.015) points(inj[inj <= psi], st.r[inj <= psi], pch = 21, col = col[1], bg = bg[1], cex = 1.35) points(inj[inj > psi], st.rs[inj > psi], pch = 21, col = col[2], bg = bg[2], cex = 1.35) plot(seg$fitted, st.r, type = "n", ylim = ylim, xlab = "Fitted", ylab = "Studentized residual") grid(); box(); abline(h = 0) rug(seg$fitted, ticksize = 0.015); rug(st.r, side = 2, ticksize = 0.015) points(seg$fitted[inj <= psi], st.r[inj <= psi], pch = 21, col = col[1], bg = bg[1], cex = 1.35) points(seg$fitted[inj > psi], st.r[inj > psi], pch = 21, col = col[2], bg = bg[2], cex = 1.35)

[image]

12 |stud. res| > 2 (8.6%) and 1 > 3 (0.7%). Doesn’t worry me.
1^st residual plot: No trend, no funnel shape, similar spread in the two limbs (in the second a bit larger).
2^nd residual plot: The two clusters are evident which, of course, we know from looking at the entire batch (fit, 95% confidence and prediction intervals, breakpoint and its 95% confidence interval):

[image]

It doesn’t need a spectacled rocket scientist to see that the slopes are different.
Well, we can test that.* H₀ is that the difference of slopes = 0.

pscore.test(seg, k = nrow(data), alternative = "two.sided")

Gives

Score test for one/two changes in the slope data: formula = IS.area ~ injection breakpoint for variable = injection model = gaussian , link = identity , method = segmented.lm observed value = 26.288, n.points = 139, p-value < 2.2e-16 alternative hypothesis: two.sided (1 breakpoint)

Gotcha!

❝ ❝ If you have data of this batch, please give it as well.

❝ Sorry, I don't have it.

I will try to simulate some stuff based on the first part. Then we could set control limits based on Bland-Altman.

Edit: I was asking myself how sensitive this approach is, i.e., how much can the slopes differ to get a significant p-value. Crude method: I fitted the 2^nd limb and gradually decreased the slope until it was equal to the one of the 1^st limb.

psi <- round(confint(seg)[1]) slope <- as.numeric(slope(seg, digits=15)$injection[, 1]) intcpt <- as.numeric(intercept(seg, digits=15)$injection[, 1]) limb1 <- data[(data$injection <= psi), c(1, 4)] limb2 <- data[(data$injection > psi), c(1, 4)] fit1 <- intcpt[1]+slope[2]*limb1$injection fit2 <- intcpt[2]+slope[2]*limb2$injection sims <- 50 slp.new <- seq(slope[2], slope[1], length.out = sims) int.new <- seq(intcpt[2], intcpt[1], length.out = sims) noise <- intcpt[2]+slope[2]*limb2$injection - limb2$IS.area res <- data.frame(slope = rep(slp.new, each = nrow(limb2)), intercept = rep(int.new, each = nrow(limb2)), inj = limb2$injection, fit = NA, noise = rep(noise, length(slp.new)), IS.area = NA) for (j in 1:nrow(res)) { res$fit[j] <- res$intercept[j]+res$slope[j]*res$inj[j] res$IS.area[j] <- res$fit[j]+res$noise[j] } names(res)[3] <- "injection" sens <- data.frame(slope.ratio = slp.new/slope[1], p.value = NA, sign = FALSE) for (j in 1:sims) { tmp <- res[, c(1, 3, 6)] sel <- which(tmp$slope == slp.new[j]) tmp <- rbind(limb1, tmp[sel, 2:3]) m1 <- lm(IS.area ~ injection, data = tmp) s1 <- segmented(m1) sens$p.value[j] <- pscore.test(s1, k = nrow(tmp), alternative = "two.sided")$p.value if (sens$p.value[j] < 0.05) sens$sign[j] <- TRUE } sens[, 1:2] <- signif(sens[, 1:2], 4) print(sens, row.names = FALSE)

Gives

slope.ratio p.value sign 36.910 3.408e-55 TRUE 36.180 3.607e-54 TRUE 35.450 3.946e-53 TRUE 34.710 4.468e-52 TRUE 33.980 5.237e-51 TRUE 33.250 6.357e-50 TRUE 32.520 7.998e-49 TRUE 31.780 1.043e-47 TRUE 31.050 1.412e-46 TRUE 30.320 1.982e-45 TRUE 29.580 2.889e-44 TRUE 28.850 4.372e-43 TRUE 28.120 6.871e-42 TRUE 27.380 1.121e-40 TRUE 26.650 1.900e-39 TRUE 25.920 3.343e-38 TRUE 25.190 6.102e-37 TRUE 24.450 1.155e-35 TRUE 23.720 2.266e-34 TRUE 22.990 4.599e-33 TRUE 22.250 9.649e-32 TRUE 21.520 2.088e-30 TRUE 20.790 4.650e-29 TRUE 20.060 1.063e-27 TRUE 19.320 2.486e-26 TRUE 18.590 5.924e-25 TRUE 17.860 1.432e-23 TRUE 17.120 3.492e-22 TRUE 16.390 8.538e-21 TRUE 15.660 2.078e-19 TRUE 14.930 4.994e-18 TRUE 14.190 1.174e-16 TRUE 13.460 2.670e-15 TRUE 12.730 5.811e-14 TRUE 11.990 1.194e-12 TRUE 11.260 2.283e-11 TRUE 10.530 4.000e-10 TRUE 9.795 6.318e-09 TRUE 9.062 8.835e-08 TRUE 8.329 1.074e-06 TRUE 7.596 1.115e-05 TRUE 6.863 9.702e-05 TRUE 6.130 5.533e-04 TRUE 5.397 3.312e-03 TRUE 4.665 1.615e-02 TRUE 3.932 6.266e-02 FALSE 3.199 1.904e-01 FALSE 2.466 4.514e-01 FALSE 1.733 8.453e-01 FALSE 1.000 7.228e-01 FALSE

The slope of the 2^nd limb can be ~4 times higher than the 1^st till it gets detected.

[image]

TODO: Scale the noise (which was higher in the 2^nd than in the 1^st limb.

Muggeo VMR. Testing with a nuisance parameter present only under the alternative: a score-based approach with application to segmented modelling. J Stat Comp Sim. 2016;86(15):3059–3067. doi:10.1080/00949655.2016.1149855.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

Complete thread:

An example of IS response. Need your helps. Shuanghe 2020-05-26 12:56 [Bioanalytics]
- An example of IS response. Need your helps. Sukalpa Biswas 2020-05-26 13:49
  - An example of IS response. Need your helps. Shuanghe 2020-05-26 16:32
  - An example of IS response. Need your helps. Ohlbe 2020-05-26 17:46
- An example of IS response. Need your helps. Ohlbe 2020-05-26 14:42
  - Placement of CC Helmut 2020-05-26 15:35
    - An example of IS response. Shuanghe 2020-05-26 16:32
      - An example of IS response. Ohlbe 2020-05-26 17:27
    - Placement of CC Ohlbe 2020-05-26 17:32
      - Placement of CC Helmut 2020-05-26 19:03
        
        Duplicate LLOQ and ULOQ Ohlbe 2020-05-26 19:43
        
        Duplicate LLOQ and ULOQ Helmut 2020-05-27 16:30
- An example of IS response. Need your helps. ElMaestro 2020-05-26 15:44
- An example of IS response. Need your helps. ksreekanth 2020-07-06 13:37
- Trend analysis as per recent USFDA guideline dshah 2020-08-01 07:32
  - Only for QCs; encouraged ≠ required Helmut 2020-08-01 14:13
    - Only for QCs; encouraged ≠ required dshah 2020-08-03 09:17
      - Deviation from the mean response Helmut 2020-08-07 13:28
    - Only for QCs; encouraged ≠ required Shuanghe 2020-08-03 13:56
      - Broken-stick regressionHelmut 2020-08-04 18:04