ElMaestro
★★★

Denmark,
2017-10-06 23:01

Posting: # 17871
Views: 13,904

## Goodness of fits: one model, different datasets [General Statistics]

Hi all,

I have a number of different datasets, none of them large at all, and I am doing some model fits on them, from which I extract a correlation measure like r squared etc. For the sake of simplicity let us just assume there is a single model, like a linear regression without weights.

I would like to compare apples and pears, or at least the fits of three different datasets A, B and C. These datasets do not have the same number of data points, so I cannot fairly compare e.g. r squared of A vs B vs C directly.

Akaike and Schwarz are presumably not the way to go, as I am not varying the model but the dataset, so to say. Kolmogorov-Smirnov would potentially be useful if I had a boatload of points, which I don't anyway. I am very poor at explaining what I think I am looking for, but I would call it a "fit likelihood" or "correlation statistic that is sample size corrected". Google and Wikipedia aren't my friends in this regard (although on all other matters, including politics, religion, science and baking recipes, G. and W. always provide the right answers).

Does anyone here know of a handy statistic that allows a fair comparison of goodness of fit across datasets of unequal size, given a single model?
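(One candidate matching the description "correlation statistic that is sample size corrected" is the adjusted R², which shrinks R² according to the number of points used. A minimal sketch in R, with made-up data purely for illustration:)

```r
# Adjusted R^2 penalizes R^2 by sample size:
# R2.adj = 1 - (1 - R2) * (n - 1) / (n - p - 1), p = number of regressors
set.seed(42)
x <- 1:10
y <- 2 + 1.5*x + rnorm(10, 0, 0.5)
fit    <- lm(y ~ x)
n      <- length(x)
p      <- 1                                 # one regressor (simple linear regression)
R2     <- summary(fit)$r.squared
R2.adj <- 1 - (1 - R2)*(n - 1)/(n - p - 1)  # matches summary(fit)$adj.r.squared
```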

Muchas gracias.

I could be wrong, but...
Best regards,
ElMaestro
nobody
nothing

2017-10-07 16:03

@ ElMaestro
Posting: # 17872
Views: 13,223

## Goodness of fits: one model, different datasets

Hi Danmark!

» ... I am very poor at explaining what I think I am looking for but I would call it a "fit likelihood" or "correlation statistic that is sample size corrected"...

Ehhm, whut? Maybe you are too fixated on statistical testing? How about some old-fashioned "visual inspection of fit" paired with "back-calculated values of calibration samples from the calibration function, including relative deviation"?

Just saying

No idea what you want to compare in the end, so...
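(The back-calculation idea above can be sketched as follows; a minimal illustration with made-up calibration data, assuming a plain unweighted line:)

```r
conc <- c(1, 2, 5, 10, 20)               # nominal calibrator concentrations (made up)
resp <- c(0.11, 0.20, 0.52, 0.98, 2.05)  # measured responses (made up)
cal  <- lm(resp ~ conc)                  # unweighted calibration function
a    <- coef(cal)[["(Intercept)"]]
b    <- coef(cal)[["conc"]]
back <- (resp - a)/b                     # back-calculated concentrations
RE   <- 100*(back - conc)/conc           # relative deviation in percent
round(data.frame(conc, back, RE), 3)
```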

Kindest regards, nobody
ElMaestro
★★★

Denmark,
2017-10-07 18:06
(edited by ElMaestro on 2017-10-07 19:11)

@ nobody
Posting: # 17873
Views: 13,311

## Experimental setup, details

Hi nobody,

» Ehhm, whut? Maybe you are too fixated on statistical testing? How about some old-fashioned "visual inspection of fit" paired with "back-calculated values of calibration samples from the calibration function, including relative deviation"?
»
» Just saying

Your idea is absolutely good. But there are complications...

» No idea what you want to compare in the end, so...

Let us say I have a cattle prod, a long-haired Austrian, and a microphone that is rigged up to my computer for recording.

The cattle prod is a wonderful tool, it makes persuasion such an easy task. My cattle prod has a lot of knobs and buttons and sliders that allow me to tweak combinations of:
- Tesla coil zap modifier strength
- Evil discharge combobulator intensity
- Ion stream barbaric voltage gain
- Apocalyptic wolfram anode ray modulation
...and so forth.

I am zapping the Austrian in his sorry rear and recording the intensity (dB) of his immediate... shall we say... verbal reaction. Let me add that each experiment takes no more than a few seconds, but there is a bit of recovery time before the next experiment can be run, and that isn't related to latency of the prod. Surprisingly, under the present experimental conditions my sensitive microphone also records a lot of cursing and swearing after each zap. I am not sure why or what the source is, but it is a matter I am looking into, as I am not sure such behavioral excursions are good for my poor sensitive electronics.

Anyways, I know from the past that with the default prod settings there is a decent (though not overwhelmingly good) correlation between the reaction recorded in dB and the prod's output measured in GJ per zap. I have done that experiment with 16 evenly dispersed levels between 0 and 17500 GJ/zap and I got an r squared of 0.9583.

The question is whether I can improve the correlation by playing around a little with the settings mentioned above. So I acquired via Amazon an ACME Automated Experimental Prod Analyzer Kit (only $598 w. 2 years of warranty) so that I can automate the experiments, comprising changing a lot of settings, actuating the prod and recording the resulting reaction. So I have generated 9629 datasets, some of which have 16 data points and some have 15, 14 or 13 because of missing values (in practice this means a relative lack of contact between the tip of the prod when actuated and the Austrian's butt). I need to find the settings for which I am most likely to be able to give a zap that triggers a 138 dB (±0.1 dB) response. Hence I need the correlation to be as good as possible, including a correction for sample size. With 9629 datasets I cannot visually inspect every single set and determine the best fit. I must find a solution in R. This is my grave predicament. "Help me noone kenobi, you are my only hope."

I could be wrong, but...
Best regards,
ElMaestro

ElMaestro
★★★

Denmark,
2017-10-07 19:07

@ ElMaestro
Posting: # 17874
Views: 13,339

## Visualization

Dear all,

for anyone interested in the setup please follow this link. Around 0:42 into the educational journey you will see an early analog version of a cattle prod (actually, back then the concept was only semi-developed, so some may call it only a cat prod), and a few seconds later you will witness the natural consequence of the prod's encounter with its intended biological target. I believe the reaction here amounts to only roughly 115 dB, arguably because of the relative lack of advanced settings and the choice of population.

I could be wrong, but...
Best regards,
ElMaestro

Helmut
★★★

Vienna, Austria,
2017-10-08 17:17

@ ElMaestro
Posting: # 17875
Views: 13,201

## multiple regression?

Hi ElMaestro,

nice setup.
Given

» […] a lot of knobs and buttons and sliders that allow me to tweak combinations of:
» - Tesla coil zap modifier strength
» - Evil discharge combobulator intensity
» - Ion stream barbaric voltage gain
» - Apocalyptic wolfram anode ray modulation
» ...and so forth.

Am I right that you are aiming at multiple regression (i.e., >1 regressor and one regressand)? See there for an arsenal of methods. For the same number of regressors the adjusted R² takes the number of data points into account.

```r
x1      <- rnorm(10)
y1      <- x1*2 + rnorm(10, 0, 0.1)
muddle1 <- lm(y1 ~ x1)
R2.adj1 <- summary(muddle1)$adj.r.squared
x2      <- x1[-1] # drop 1
y2      <- y1[-1] # drop 1
muddle2 <- lm(y2 ~ x2)
R2.adj2 <- summary(muddle2)$adj.r.squared
cat(R2.adj1, R2.adj2, "\n")
```

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. ☼
Science Quotes

ElMaestro
★★★

Denmark,
2017-10-08 17:30

@ Helmut
Posting: # 17876
Views: 13,203

## just y=ax+b

Hi Hötzi,

I simply have a lot of regression lines, each of the type y=ax+b. I just need to find the best one, and the only complication is that not all lines have the same number of points. No multiple regression and no fancy stuff like weights or funky residual distributions.

I could be wrong, but...
Best regards,
ElMaestro

Helmut
★★★

Vienna, Austria,
2017-10-08 17:35

@ ElMaestro
Posting: # 17877
Views: 13,147

## just y=ax+b

Hi ElMaestro,

» I simply have a lot of regression lines, each of the type y=ax+b. I just need to find the best one, and the only complication is that not all lines have the same number of points.

See my edit above. Do you have the raw data or only the parameters a1, a2, … ai, b1, b2, … bi, and the sizes? In the former case easy (max. R²adj), in the latter no idea.

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked.
☼ Science Quotes

ElMaestro
★★★

Denmark,
2017-10-08 17:50

@ Helmut
Posting: # 17878
Views: 13,238

## just y=ax+b

Hi again,

and thanks for looking into it. It is as simple as this:

```r
x1 <- c(1, 2, 3, 4, 5)
y1 <- c(10.5, 11.4, 12.6, 13.3, 14.6)

x2 <- c(1, 2, 3, 4, 5)
y2 <- c(10.3, 11.4, NA, 13.5, 14.6)

L1 <- lm(y1 ~ x1)
L2 <- lm(y2 ~ x2)
summary(L1)
summary(L2)
```

I would like to know how to judge whether the correlation quantitatively appears to be better for the first line or for the second line.

I could be wrong, but...
Best regards,
ElMaestro

nobody
nothing

2017-10-08 20:26
(edited by nobody on 2017-10-08 20:45)

@ ElMaestro
Posting: # 17879
Views: 13,248

## just y=ax+b

» x1 <- c(1, 2, 3, 4, 5)
» y1 <- c(10.5, 11.4, 12.6, 13.3, 14.6)
»
» x2 <- c(1, 2, 3, 4, 5)
» y2 <- c(10.3, 11.4, NA, 13.5, 14.6)

13.3 vs. 13.5? By intention?

"RMSE: The RMSE is the square root of the variance of the residuals. It indicates the absolute fit of the model to the data, i.e., how close the observed data points are to the model's predicted values. Whereas R-squared is a relative measure of fit, RMSE is an absolute measure of fit. As the square root of a variance, RMSE can be interpreted as the standard deviation of the unexplained variance, and has the useful property of being in the same units as the response variable. Lower values of RMSE indicate better fit. RMSE is a good measure of how accurately the model predicts the response, and is the most important criterion for fit if the main purpose of the model is prediction."

...educated guess, no idea if it is sensitive to the number of calibrators:
http://www.theanalysisfactor.com/assessing-the-fit-of-regression-models/

...didn't read it, maybe you want to have a look:
http://orfe.princeton.edu/~jqfan/papers/01/ant.pdf

Kindest regards, nobody

yjlee168
★★

Kaohsiung, Taiwan,
2017-10-08 21:28
(edited by yjlee168 on 2017-10-09 08:35)

@ ElMaestro
Posting: # 17880
Views: 13,187

## ANCOVA with R?

Dear ElMaestro,

I am not quite sure if ANCOVA is what you need.
Here is a 10-min video (title: ANCOVA in R) from Youtube.

All the best,
--- Yung-jin Lee
bear v2.8.4: created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here

DavidManteigas
★

Portugal,
2017-10-09 10:34

@ ElMaestro
Posting: # 17881
Views: 13,121

## just y=ax+b

Dear ElMaestro,

That is reversing the logic of statistics: find the right data for your model instead of finding the right model for your data. For those models, I think R squared is the way to go, although different sample sizes might have an influence depending on the range of samples that you have. How "small" are the samples? Between 30 and 50 observations per sample? Would methods like cross-validation work on the datasets?

What about calculating confidence intervals for the adjusted R squared of each model and using a Forest plot of those CIs, one line for each dataset? This might give you an idea of the impact of the different sample sizes in this particular problem and help you decide whether just looking at R squared would be enough. Depending on the number of datasets, visual inspection and influence statistics might help in deciding which is the best data for the model.

Regards,
David

nobody
nothing

2017-10-09 10:45

@ DavidManteigas
Posting: # 17882
Views: 13,177

## just y=ax+b

Best guess: in case of unweighted linear regression the highest calibrators always have the highest impact. For your example: take 138 dBA as the mean of the regression function, as predictions will be best around the mean. If there is heteroscedastic data and weighted regression, the calibrators around the weighted mean will have the least impact on the calibration function.

I still have not fully understood the scope of the survey. Is it on best practice of study planning for calibration experiments?
Kindest regards, nobody

Helmut
★★★

Vienna, Austria,
2017-10-10 18:15

@ ElMaestro
Posting: # 17885
Views: 13,090

## just y=ax+b

Hi ElMaestro,

» I would like to know how to judge whether the correlation quantitatively appears to be better for the first line or for the second line.

Sumfink along this?

```r
x1  <- c(1, 2, 3, 4, 5)
y1  <- c(10.5, 11.4, 12.6, 13.3, 14.6)
n1  <- length(x1)
x2  <- c(1, 2, 3, 4, 5)
y2  <- c(10.5, 11.4, NA, 13.3, 14.6)
n2  <- length(x2)
x3  <- c(1, 2, 4, 5)
y3  <- c(10.5, 11.4, 13.3, 14.6)
n3  <- length(x3)
raw <- matrix(c(rep(1, n1), rep(2, n2), rep(3, n3), x1, x2, x3, y1, y2, y3,
                rep(n1, each=n1), rep(n2, each=n2), rep(n3, each=n3)),
              nrow=n1+n2+n3, ncol=4, byrow=FALSE,
              dimnames=list(NULL, c("set", "x", "y", "n")))
raw <- as.data.frame(raw); attach(raw)
set <- unique(set)
res <- data.frame(set=set, n=c(n1, n2, n3), NAs=NA, int=NA, slope=NA,
                  RSE=NA, R2.adj=NA)
for (j in seq_along(set)) {
  mod     <- lm(y ~ x, data=raw, set==j)
  sum.mod <- summary(mod)
  res[j, "NAs"]    <- sum(is.na(raw$y[raw$set==j]))
  res[j, "int"]    <- coef(mod)[[1]]
  res[j, "slope"]  <- coef(mod)[[2]]
  res[j, "RSE"]    <- sum.mod$sigma
  res[j, "R2.adj"] <- sum.mod$adj.r.squared
}
print(raw, row.names=FALSE)
print(res[with(res, order(-R2.adj, +n)), ], row.names=FALSE)
detach(raw)
```

Fits linear models to the sets and gives the results sorted by R²adj (descending) and n (ascending). The y-values of sets 2 and 3 are identical; this shows that an NA gives the same result as a completely missing row, due to the default na.action=na.omit in lm().
```
 set n NAs  int slope       RSE    R2.adj
   3 4   0 9.42  1.01 0.1565248 0.9928293
   2 5   1 9.42  1.01 0.1565248 0.9928293
   1 5   0 9.45  1.01 0.1494434 0.9912998
```

Edit: Toys for boys:

```r
a    <- 9.45
b    <- 1.01
sets <- 16
n    <- 5
x    <- 1:n
sd   <- 0.25
pct  <- 20 # percentage of NAs
y    <- NULL
for (j in 1:sets) {
  y <- c(y, a + b*x + rnorm(n, 0, sd))
}
raw <- matrix(c(rep(1:sets, each=n), rep(x, sets), y, rep(n, n*sets)),
              nrow=n*sets, ncol=4, byrow=FALSE,
              dimnames=list(NULL, c("set", "x", "y", "n")))
loc <- floor(runif(as.integer(pct*sets*n/100), min=1, max=n*sets))
raw[loc, "y"] <- NA
raw <- as.data.frame(raw); attach(raw)
set <- 1:sets
res <- data.frame(set=set, n=rep(n, sets), NAs=NA, used=NA, int=NA,
                  slope=NA, RSE=NA, R2.adj=NA)
op  <- par(no.readonly=TRUE)
split.screen(c(4, 4)) # adjust for the number of sets!
for (j in seq_along(set)) {
  mod     <- lm(y ~ x, data=raw, set==j)
  sum.mod <- summary(mod)
  res[j, "NAs"]    <- sum(is.na(raw$y[raw$set==j]))
  res[j, "used"]   <- res[j, "n"] - res[j, "NAs"]
  res[j, "int"]    <- coef(mod)[[1]]
  res[j, "slope"]  <- coef(mod)[[2]]
  res[j, "RSE"]    <- sum.mod$sigma
  res[j, "R2.adj"] <- sum.mod$adj.r.squared
  screen(j)
  par(mar=c(1.5, 1.5, 1, 0.1), mgp=c(2, 0.5, 0), line=2, tcl=-0.25)
  if (res[j, "used"] >= 2) {
    plot(raw$x[raw$set==j], raw$y[raw$set==j], las=1, xlab="", ylab="",
         ylim=c((a+b*x[1])*0.95, (a+b*x[n])/0.95), cex.axis=0.65,
         cex.main=0.65, font.main=1,
         main=paste("Rsq.adj =", signif(res[j, "R2.adj"], 6)))
    ifelse (res[j, "used"] == 2, col <- "red", col <- "blue")
    abline(coef=coef(mod), col=col)
    legend("topleft", cex=0.65, box.lty=0, x.intersp=0,
           legend=paste0("set ", j, ": ", res[j, "used"], " used"))
  }
}
res <- res[with(res, order(-R2.adj, +n)), ]
screen(head(res$set, 1))
par(mar=c(1.5, 1.5, 1, 0.1))
legend("right", cex=0.75, box.lty=0, x.intersp=0, legend="\"best\"")
screen(tail(res$set, 1))
par(mar=c(1.5, 1.5, 1, 0.1))
legend("right", cex=0.75, box.lty=0, x.intersp=0, legend="\"worst\"")
close.screen(all=TRUE)
par(op)
detach(raw)
print(res, row.names=FALSE)
```

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. ☼
Science Quotes
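(nobody's RMSE suggestion earlier in the thread can also be sketched directly on ElMaestro's two example sets; residuals() silently drops the NA case, so the two RMSEs are based on 5 and 4 points respectively:)

```r
x1 <- c(1, 2, 3, 4, 5)
y1 <- c(10.5, 11.4, 12.6, 13.3, 14.6)
x2 <- c(1, 2, 3, 4, 5)
y2 <- c(10.3, 11.4, NA, 13.5, 14.6)
# RMSE is in the units of the response; lower = closer absolute fit
rmse <- function(fit) sqrt(mean(residuals(fit)^2))
L1 <- lm(y1 ~ x1)
L2 <- lm(y2 ~ x2)
c(RMSE.1 = rmse(L1), RMSE.2 = rmse(L2))
```

(Note that the plain RMSE divides by the number of residuals only; summary(fit)$sigma, the residual standard error reported above, divides by the residual degrees of freedom instead.)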

Anyone who has never made a mistake
has never tried anything new.    Albert Einstein

The BIOEQUIVALENCE / BIOAVAILABILITY FORUM is hosted by
Ing. Helmut Schütz