Broken-stick regression [Bioanalytics]
❝ I sent 2 emails with csv and excel file, the latter has much more columns in case you needed it.
THX!
❝ ❝ An idea: Set limits based on the one batch of the method validation which was performed in the anticipated maximum size of a run.
❝
❝ Damn, that's a nice idea. I didn't thought about it during discussion with CRO about the modification of their SOP.
We can perform a broken-stick regression:
library(segmented)
data <- NULL # nullify eventual previous one
data <- read.csv("https://bebac.at/downloads/Post21469.csv", sep = ",", dec = ".",
stringsAsFactors = TRUE)
data <- data[!is.na(data$IS.area), ] # remove ones without IS
attach(data)
set.seed(123456)
mod <- lm(IS.area ~ injection, data = data)
seg <- segmented(mod)
con <- predict(seg, se.fit = TRUE, interval = "confidence")
pre <- suppressWarnings(predict(seg, se.fit = TRUE, interval = "prediction"))
summary(seg); confint(seg); slope(seg, digits = 9); intercept(seg, digits = 9)$injection
if (is.null(attr(dev.list(), "names"))) windows(width = 6, height = 5)
op <- par(no.readonly = TRUE)
par(mar=c(4, 4, 0, 0.3) + 0.1, cex.axis = 1, cex.lab = 1.25)
col <- c("blue", "red", "darkgreen", "magenta")
bg <- c("#0000FF80", "#FF000080", "#00900080", "#FF00FF80")
plot(seg, ylim = range(IS.area, na.rm = TRUE),
xlab = "Injection Number", ylab = "IS Area", lwd = 3, col = "blue")
grid()
abline(v = confint(seg), col = "blue", lty = c(1, 3, 3))
text(confint(seg)[2], max(IS.area, na.rm = TRUE), labels = round(confint(seg)[2]),
pos = 2, cex = 0.9)
text(confint(seg)[1], min(IS.area, na.rm = TRUE), labels = round(confint(seg)[1]),
pos = 4, cex = 0.9)
text(confint(seg)[3], max(IS.area, na.rm = TRUE), labels = round(confint(seg)[3]),
pos = 4, cex = 0.9)
lines(as.numeric(dimnames(pre$fit)[[1]]), con$fit[, "lwr"], col = "blue")
lines(as.numeric(dimnames(pre$fit)[[1]]), con$fit[, "upr"], col = "blue")
lines(as.numeric(dimnames(pre$fit)[[1]]), pre$fit[, "lwr"], col = "blue", lty = 2)
lines(as.numeric(dimnames(pre$fit)[[1]]), pre$fit[, "upr"], col = "blue", lty = 2)
points(injection[type == "C"], IS.area[type == "C"], pch = 21, col = col[1], bg = bg[1],
cex = 1.35)
points(injection[type == "B"], IS.area[type == "B"], pch = 21, col = col[2], bg = bg[2],
cex = 1.35)
points(injection[type == "Q"], IS.area[type == "Q"], pch = 21, col = col[3], bg = bg[3],
cex = 1.35)
points(injection[type == "U"], IS.area[type == "U"], pch = 21, col = col[4], bg = bg[4],
cex = 1.35)
par(op)
Which gives
***Regression Model with Segmented Relationship(s)***
Call:
segmented.lm(obj = mod)
Estimated Break-Point(s):
Est. St.Err
psi1.injection 107.392 1.054
Meaningful coefficients of the linear terms:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 458563.17 3100.28 147.910 < 2e-16 ***
injection 151.47 49.54 3.058 0.00269 **
U1.injection 5439.53 221.22 24.589 NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 15150 on 135 degrees of freedom
Multiple R-Squared: 0.9462, Adjusted R-squared: 0.945
Convergence attained in 2 iter. (rel. change 0)
Est. CI(95%).low CI(95%).up
psi1.injection 107.392 105.306 109.477
$injection
Est. St.Err. t value CI(95%).l CI(95%).u
slope1 151.4664 49.53669 3.057662 53.49811 249.4348
slope2 5591.0012 215.60080 25.932191 5164.60917 6017.3932
Est.
intercept1 458563.2
intercept2 -125596.8
The fit is fine, you can try
psi <- round(confint(seg)[1])
limb1 <- data[(data$injection <= psi), ]
limb2 <- data[(data$injection > psi), ]
st.r <- rstudent(seg)
inj <- as.numeric(names(st.r))
ylim <- c(-1, 1)*max(abs(st.r)) # symmetric
col <- c("blue", "red")
bg <- c("#0000FF80", "#FF000080")
plot(inj, st.r, type = "n", ylim = ylim, xlab = "Injection",
ylab = "Studentized residual")
grid(); box(); abline(h = 0)
rug(inj, ticksize = 0.015); rug(st.r, side = 2, ticksize = 0.015)
points(inj[inj <= psi], st.r[inj <= psi], pch = 21, col = col[1],
bg = bg[1], cex = 1.35)
points(inj[inj > psi], st.rs[inj > psi], pch = 21, col = col[2],
bg = bg[2], cex = 1.35)
plot(seg$fitted, st.r, type = "n", ylim = ylim, xlab = "Fitted",
ylab = "Studentized residual")
grid(); box(); abline(h = 0)
rug(seg$fitted, ticksize = 0.015); rug(st.r, side = 2, ticksize = 0.015)
points(seg$fitted[inj <= psi], st.r[inj <= psi], pch = 21, col = col[1],
bg = bg[1], cex = 1.35)
points(seg$fitted[inj > psi], st.r[inj > psi], pch = 21, col = col[2],
bg = bg[2], cex = 1.35)
1st residual plot: No trend, no funnel shape, similar spread in the two limbs (in the second a bit larger).
2nd residual plot: The two clusters are evident which, of course, we know from looking at the entire batch (fit, 95% confidence and prediction intervals, breakpoint and its 95% confidence interval):
Well, we can test that.* H0 is that the difference of slopes = 0.
pscore.test(seg, k = nrow(data), alternative = "two.sided")
Score test for one/two changes in the slope
data: formula = IS.area ~ injection
breakpoint for variable = injection
model = gaussian , link = identity , method = segmented.lm
observed value = 26.288, n.points = 139, p-value < 2.2e-16
alternative hypothesis: two.sided (1 breakpoint)
❝ ❝ If you have data of this batch, please give it as well.
❝ Sorry, I don't have it.
I will try to simulate some stuff based on the first part. Then we could set control limits based on Bland-Altman.
Edit: I was asking myself how sensitive this approach is, i.e., how much can the slopes differ to get a significant p-value. Crude method: I fitted the 2nd limb and gradually decreased the slope until it was equal to the one of the 1st limb.
psi <- round(confint(seg)[1])
slope <- as.numeric(slope(seg, digits=15)$injection[, 1])
intcpt <- as.numeric(intercept(seg, digits=15)$injection[, 1])
limb1 <- data[(data$injection <= psi), c(1, 4)]
limb2 <- data[(data$injection > psi), c(1, 4)]
fit1 <- intcpt[1]+slope[2]*limb1$injection
fit2 <- intcpt[2]+slope[2]*limb2$injection
sims <- 50
slp.new <- seq(slope[2], slope[1], length.out = sims)
int.new <- seq(intcpt[2], intcpt[1], length.out = sims)
noise <- intcpt[2]+slope[2]*limb2$injection - limb2$IS.area
res <- data.frame(slope = rep(slp.new, each = nrow(limb2)),
intercept = rep(int.new, each = nrow(limb2)),
inj = limb2$injection, fit = NA,
noise = rep(noise, length(slp.new)),
IS.area = NA)
for (j in 1:nrow(res)) {
res$fit[j] <- res$intercept[j]+res$slope[j]*res$inj[j]
res$IS.area[j] <- res$fit[j]+res$noise[j]
}
names(res)[3] <- "injection"
sens <- data.frame(slope.ratio = slp.new/slope[1],
p.value = NA, sign = FALSE)
for (j in 1:sims) {
tmp <- res[, c(1, 3, 6)]
sel <- which(tmp$slope == slp.new[j])
tmp <- rbind(limb1, tmp[sel, 2:3])
m1 <- lm(IS.area ~ injection, data = tmp)
s1 <- segmented(m1)
sens$p.value[j] <- pscore.test(s1, k = nrow(tmp),
alternative = "two.sided")$p.value
if (sens$p.value[j] < 0.05) sens$sign[j] <- TRUE
}
sens[, 1:2] <- signif(sens[, 1:2], 4)
print(sens, row.names = FALSE)
slope.ratio p.value sign
36.910 3.408e-55 TRUE
36.180 3.607e-54 TRUE
35.450 3.946e-53 TRUE
34.710 4.468e-52 TRUE
33.980 5.237e-51 TRUE
33.250 6.357e-50 TRUE
32.520 7.998e-49 TRUE
31.780 1.043e-47 TRUE
31.050 1.412e-46 TRUE
30.320 1.982e-45 TRUE
29.580 2.889e-44 TRUE
28.850 4.372e-43 TRUE
28.120 6.871e-42 TRUE
27.380 1.121e-40 TRUE
26.650 1.900e-39 TRUE
25.920 3.343e-38 TRUE
25.190 6.102e-37 TRUE
24.450 1.155e-35 TRUE
23.720 2.266e-34 TRUE
22.990 4.599e-33 TRUE
22.250 9.649e-32 TRUE
21.520 2.088e-30 TRUE
20.790 4.650e-29 TRUE
20.060 1.063e-27 TRUE
19.320 2.486e-26 TRUE
18.590 5.924e-25 TRUE
17.860 1.432e-23 TRUE
17.120 3.492e-22 TRUE
16.390 8.538e-21 TRUE
15.660 2.078e-19 TRUE
14.930 4.994e-18 TRUE
14.190 1.174e-16 TRUE
13.460 2.670e-15 TRUE
12.730 5.811e-14 TRUE
11.990 1.194e-12 TRUE
11.260 2.283e-11 TRUE
10.530 4.000e-10 TRUE
9.795 6.318e-09 TRUE
9.062 8.835e-08 TRUE
8.329 1.074e-06 TRUE
7.596 1.115e-05 TRUE
6.863 9.702e-05 TRUE
6.130 5.533e-04 TRUE
5.397 3.312e-03 TRUE
4.665 1.615e-02 TRUE
3.932 6.266e-02 FALSE
3.199 1.904e-01 FALSE
2.466 4.514e-01 FALSE
1.733 8.453e-01 FALSE
1.000 7.228e-01 FALSE
TODO: Scale the noise (which was higher in the 2nd than in the 1st limb.
- Muggeo VMR. Testing with a nuisance parameter present only under the alternative: a score-based approach with application to segmented modelling. J Stat Comp Sim. 2016;86(15):3059–3067. doi:10.1080/00949655.2016.1149855.
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
![[image]](https://static.bebac.at/pics/Blue_and_yellow_ribbon_UA.png)
Helmut Schütz
![[image]](https://static.bebac.at/img/CC by.png)
The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Complete thread:
- An example of IS response. Need your helps. Shuanghe 2020-05-26 12:56 [Bioanalytics]
- An example of IS response. Need your helps. Sukalpa Biswas 2020-05-26 13:49
- An example of IS response. Need your helps. Shuanghe 2020-05-26 16:32
- An example of IS response. Need your helps. Ohlbe 2020-05-26 17:46
- An example of IS response. Need your helps. Ohlbe 2020-05-26 14:42
- Placement of CC Helmut 2020-05-26 15:35
- An example of IS response. Shuanghe 2020-05-26 16:32
- An example of IS response. Ohlbe 2020-05-26 17:27
- Placement of CC Ohlbe 2020-05-26 17:32
- Placement of CC Helmut 2020-05-26 19:03
- Duplicate LLOQ and ULOQ Ohlbe 2020-05-26 19:43
- Duplicate LLOQ and ULOQ Helmut 2020-05-27 16:30
- Duplicate LLOQ and ULOQ Ohlbe 2020-05-26 19:43
- Placement of CC Helmut 2020-05-26 19:03
- An example of IS response. Shuanghe 2020-05-26 16:32
- Placement of CC Helmut 2020-05-26 15:35
- An example of IS response. Need your helps. ElMaestro 2020-05-26 15:44
- An example of IS response. Need your helps. ksreekanth 2020-07-06 13:37
- Trend analysis as per recent USFDA guideline dshah 2020-08-01 07:32
- Only for QCs; encouraged ≠ required Helmut 2020-08-01 14:13
- Only for QCs; encouraged ≠ required dshah 2020-08-03 09:17
- Deviation from the mean response Helmut 2020-08-07 13:28
- Only for QCs; encouraged ≠ required Shuanghe 2020-08-03 13:56
- Broken-stick regressionHelmut 2020-08-04 18:04
- Only for QCs; encouraged ≠ required dshah 2020-08-03 09:17
- Only for QCs; encouraged ≠ required Helmut 2020-08-01 14:13
- An example of IS response. Need your helps. Sukalpa Biswas 2020-05-26 13:49