Nasty beast [Software]

posted by Helmut – Vienna, Austria, 2019-07-18 22:26 – Posting: # 20396

Dear Ohlbe,

❝ I made multiple attempts with "if" and conditions, and got multiple different error messages. ElMaestro gave me some hints, resulting each time in a "oh yes of course" reaction and more error messages... I probably misinterpreted the hints.


Using if(), its relatives, and loop constructs (for(), while(), repeat) is in many cases error-prone, as you experienced, and can be slooow. If you have a vector of data (say y), there are various ways to access its values. Example with 25 random integers in the range 1…40:

n  <- 25
y  <- round(runif(n = n, min = 1, max = 40))
th <- 10             # threshold
n  <- length(y)      # we know it already, but useful later
head(y, 10)          # the first 10
y[1:10]              # same
tail(y, 10)          # the last 10
y[(n - 9):n]         # same but tricky
blq <- which(y < th) # indices of the values below the threshold
length(blq)          # how many?
y[blq]               # show them
y[y < th]            # same
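
The last two lines agree here because y contains no missing values. A minimal sketch of where they differ, using a small made-up vector (the data and the names via.which and via.logical are only for illustration):

```r
y  <- c(3, NA, 12, 7, NA, 25)    # hypothetical data with missing values
th <- 10                         # threshold

# which() returns only the indices where the condition is TRUE,
# so positions where the comparison is NA are dropped
via.which   <- y[which(y < th)]

# a logical index keeps a slot for every NA in the comparison
via.logical <- y[y < th]

via.which
via.logical
```

So which() is not only a matter of taste: it silently skips missing values, whereas plain logical indexing carries them into the result.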

In such a simple case which() is overkill, though one can include many conditions, which [sic] makes the code easier to understand. If you have two vectors (say x, y) like in my last post, you can select values of x depending on a condition on y, e.g., x[which(y < th)].
Let’s check:

n    <- 50
x    <- runif(n = n, min = 1, max = 20)
a    <- 0.5
b    <- 2
y    <- a + b * x + rnorm(n = length(x),  mean = 0, sd = 2)
th   <- 10

fun1 <- function(x, y, th) { # clumsy
  x.th <- numeric()
  y.th <- numeric()
  blq  <- 0L                    # counter of values below the threshold
  for (k in seq_along(x)) {
    if (y[k] < th) {
      blq       <- blq + 1L
      x.th[blq] <- x[k]
      y.th[blq] <- y[k]
    }
  }
  z <- data.frame(x.th, y.th)
  return(invisible(z))
}

fun2 <- function(x, y, th) { # better
  blq  <- which(y < th)
  x.th <- x[blq]
  y.th <- y[blq]
  z    <- data.frame(x.th, y.th)
  return(invisible(z))
}

fun3 <- function(x, y, th) { # a little bit confusing for beginners
  x.th <- x[y < th]
  y.th <- y[y < th]
  z    <- data.frame(x.th, y.th)
  return(invisible(z))
}

res1 <- fun1(x, y, th)
res2 <- fun2(x, y, th)
res3 <- fun3(x, y, th)
identical(res1, res2); identical(res1, res3) # same?
[1] TRUE
[1] TRUE

Bingo! Which one is easier to read?
What about speed?

library(microbenchmark)
res <- microbenchmark(fun1(x, y, th),
                      fun2(x, y, th),
                      fun3(x, y, th), times = 1000L)
print(res, signif = 4)
Unit: microseconds
           expr   min    lq  mean median    uq  max neval cld
 fun1(x, y, th) 173.9 181.1 196.1  186.3 189.9 1530  1000   b
 fun2(x, y, th) 163.0 170.3 183.6  175.4 179.0 3311  1000  a
 fun3(x, y, th) 163.0 169.0 185.5  174.5 178.4 3310  1000  ab

Practically the same, since the vectors are short (n = 50). If we play this game with longer vectors, fun2() shows its strengths.
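
If you want to try longer vectors yourself, here is a rough sketch in base R. The vector length, the seed, and the names sel.loop and sel.which are my assumptions for illustration; they mirror the loop and the which() approach above. Timings depend on the machine, so none are shown.

```r
set.seed(1)                              # assumption: arbitrary seed
n  <- 1e5                                # assumption: a "longer" vector
x  <- runif(n = n, min = 1, max = 20)
y  <- 0.5 + 2 * x + rnorm(n = n, mean = 0, sd = 2)
th <- 10

sel.loop <- function(x, y, th) {         # loop with if(), vectors grow
  x.th <- numeric()
  y.th <- numeric()
  j    <- 0L
  for (k in seq_along(x)) {
    if (y[k] < th) {
      j       <- j + 1L
      x.th[j] <- x[k]
      y.th[j] <- y[k]
    }
  }
  data.frame(x.th, y.th)
}

sel.which <- function(x, y, th) {        # vectorized with which()
  blq <- which(y < th)
  data.frame(x.th = x[blq], y.th = y[blq])
}

t.loop  <- system.time(r1 <- sel.loop(x, y, th))
t.which <- system.time(r2 <- sel.which(x, y, th))
identical(r1, r2)                        # same result?
c(loop = t.loop[["elapsed"]], which = t.which[["elapsed"]]) # seconds
```

system.time() is coarse compared to microbenchmark() but needs no package; for serious comparisons repeat the runs.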

[image]


If one has more conditions, any construct built from loops and if() will suck. We have 100 random integers (1…50) and want to get the even ones between 20 and 30 in increasing order.

x <- round(runif(n = 100, min = 1, max = 50))
x
  [1] 37 29  8 34  6 13 47 22 30 44  6 14 33 18 12 37 32
 [18] 10  6 37 27  2 43 40  7  5 47  4 32 17  7 50 39 36
 [35] 38 48 34  5 21 43 34 50 29 20 33  6 45 32 28  8  1
 [52] 26 29 19 42  9 38 31 25  4  1 23 37 31  2 26 29 24
 [69] 40 43 17 16 41 17  5 17 36 16  7  5 36 30  5  8 19
 [86] 40 42 30 33 21 13 25 21 33 16  7 33 36 19 37

One-liner:

sort(x[which(x >= 20 & x <= 30 & x %% 2 == 0)])
[1] 20 22 24 26 26 28 30 30 30

Or even shorter:

sort(x[x >= 20 & x <= 30 & x %% 2 == 0])
[1] 20 22 24 26 26 28 30 30 30

Good luck coding that with a loop and a nested if() { do this } else { do that } elseif { oops }.
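
For comparison, a sketch of what the loop version might look like. The seed and the names keep, loop.res, and vec.res are my assumptions for illustration; the data are newly drawn, so the selected values will differ from the ones shown above.

```r
set.seed(42)                             # assumption: arbitrary seed
x <- round(runif(n = 100, min = 1, max = 50))

keep <- numeric()                        # collects the matching values
for (k in seq_along(x)) {
  if (x[k] >= 20) {                      # lower limit
    if (x[k] <= 30) {                    # upper limit
      if (x[k] %% 2 == 0) {              # even?
        keep[length(keep) + 1] <- x[k]
      }
    }
  }
}
loop.res <- sort(keep)

vec.res  <- sort(x[x >= 20 & x <= 30 & x %% 2 == 0])
identical(loop.res, vec.res)             # same?
```

Twelve lines and three levels of nesting against one line, and every additional condition adds another level.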

Dif-tor heh smusma 🖖🏼 Long live Ukraine!
Helmut Schütz