OT: R limbo 101 [Software]

posted by Helmut – Vienna, Austria, 2019-07-19 15:00 – Posting: # 20401

Dear Ohlbe,

❝ I thought something like if... else... would be very logical and easy. Maybe because these are real words from a language I can understand.


Perfectly understandable. Of course, these constructs exist in R and we use them all the time. The point is only that direct access to values (with a condition) is simply more efficient.
If you don’t have to deal with large data sets, both are fine. Since I deal a lot with simulations (1 million BE studies, sometimes with a high number of subjects / periods / sequences, …), I learned the hard way to avoid the ‘simple’ constructs whenever possible (granted, sometimes I knew it could be done but was too stupid to work out how).
Think about my previous example. For 1 million calls the direct access is 15 times faster than the loop with if() inside. Nested calls are not unusual, and the penalty multiplies with each level: for two levels the loop would be ~230 times slower and for three already ~3,500 times. That would test your patience.
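
A minimal sketch of the two approaches, not the original benchmark: the vector y, the threshold of zero, and the sample size are assumptions chosen only for illustration. Both variants set negative values to zero. Timings depend on the machine, so none are quoted here.

```r
set.seed(123456)                 # arbitrary seed, for reproducibility only
y <- rnorm(1e5)                  # hypothetical data

# Variant 1: loop with if() inside
t.loop <- system.time({
  res.loop <- numeric(length(y))
  for (i in seq_along(y)) {
    if (y[i] < 0) {
      res.loop[i] <- 0
    } else {
      res.loop[i] <- y[i]
    }
  }
})

# Variant 2: direct access with a condition (logical indexing)
t.direct <- system.time({
  res.direct <- y
  res.direct[res.direct < 0] <- 0
})

# Check that both variants produce the same vector
print(identical(res.loop, res.direct))

# Elapsed seconds of each variant on this machine
print(c(loop = t.loop[["elapsed"]], direct = t.direct[["elapsed"]]))
```

With nested calls the inner work is repeated for every outer iteration, which is why a modest factor per level grows so quickly.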

❝ But it looks like R and I are following different kinds of logics.


Not necessarily so. See above.

❝ When you learn a new language and start speaking it with somebody, that person will usually show some goodwill, ignore grammatical mistakes and try and understand what you mean. R made no efforts of any kind, even though what I was trying to achieve was pretty obvious :-D


I agree. See the subject line. The man pages of R’s functions were written by statisticians, who regularly forget non-expert users. Sometimes the response you get is bizarre. Want to know how for() works?

?for
+

+
Fuck, get me out of here! esc
help(for)
Error: unexpected ')' in "help(for)"

What the heck? Shall I really try it without the closing bracket? Strange. OK, OK.
help(for
+

Aha, the closing bracket is missing!
)
Error: unexpected ')' in:
"
)"

You can’t be serious! Google, google, reading the R-Inferno Section 8.2.30. Oh my!
?`for`
help(`for`)

Both work as expected. I don’t know what the logic behind it is. Beyond me.
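
One possible explanation, offered as a sketch rather than an authoritative account: for is a reserved word, so the parser takes a bare for as the start of a loop and expects the rest of it. Quoting the name (with backticks or as a string) lets it pass as an ordinary name. The snippet below checks this with parse() only, without opening any help page.

```r
# A bare reserved word inside help() cannot be parsed ...
bare <- try(parse(text = "help(for)"), silent = TRUE)
print(inherits(bare, "try-error"))

# ... whereas the quoted forms parse without complaint
quoted <- parse(text = c("help(`for`)", "help(\"for\")", "?`for`"))
print(length(quoted))

# Behind the quoted name sits a (primitive) function
print(is.function(`for`))
print(is.primitive(`for`))

# The same holds for other control-flow constructs, e.g. if()
print(`if`(2 > 1, "yes", "no"))
```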


❝ ❝ If you have a vector of data (say y)


❝ Yeah, ElMaestro started using this kind of vocabulary too when we were exchanging email. […] All I learnt at school about vectors is the geometric side of the concept - what Wikipedia apparently calls Euclidian vector.


Correct.

❝ I had some problems understanding what an arrow between two points had to do with what I was trying to achieve.


Absolutely nothing, of course. In R a one-dimensional array is meant. The indexing (i.e., where it starts) differs between languages. In many languages there are three types of data structures: scalars (a single number, character, string, or boolean), one-dimensional arrays (R: vector), and multidimensional arrays (R: data.frame and matrix). In R there are no scalars, only vectors containing a single element.

x1   <- 1
x2   <- 1L
x3   <- TRUE
x4   <- NA
x5   <- "a"
comp <- matrix(data = c(c(x1, x2, x3, x4, x5),
                        c(is.vector(x1), is.vector(x2), is.vector(x3),
                          is.vector(x4), is.vector(x5)),
                        c(length(x1), length(x2), length(x3),
                          length(x4), length(x5)),
                        c(typeof(x1), typeof(x2), typeof(x3),
                          typeof(x4), typeof(x5))),
               nrow = 5, ncol = 4,
               dimnames = list(paste0("x", 1:5),
                               c("value", "vector?", "length", "type")))
print(as.data.frame(comp))

   value vector? length      type
x1     1    TRUE      1    double
x2     1    TRUE      1   integer
x3  TRUE    TRUE      1   logical
x4  <NA>    TRUE      1   logical
x5     a    TRUE      1 character
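
Two points from the text can be checked directly; the vector y and the value 42 below are made-up examples. R starts indexing at 1, and a single number behaves like any other vector. In addition, c() coerces mixed input to one common type, which is why every entry of the value column above is shown as text.

```r
y <- c(10.5, 20.5, 30.5)   # hypothetical data
print(y[1])                # first element: indexing starts at 1, not 0
print(length(y[0]))        # index 0 selects nothing (a zero-length vector)

x <- 42                    # looks like a scalar ...
print(is.vector(x))        # ... but is a vector
print(length(x))           # of length one
print(x[1])                # so it can be indexed like any other vector

# c() coerces mixed types to the most flexible one, here character
mixed <- c(1, 1L, TRUE, NA, "a")
print(typeof(mixed))
print(mixed)
```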


What makes R so powerful is yet another data type, namely the list. A list can contain any of the other types and a mixture of them (even other lists…).

char.vect <- letters[1:2]
int.vect  <- as.integer(1:3)
mydf      <- data.frame(x = 1:4, y = c(4:6, NA))
mymat     <- matrix(data = c(100:102, 2*100:102), nrow = 3, ncol = 2,
                    dimnames = list(paste0("row", 1:3), c("x", "y")))
mylist    <- list(char.vect = char.vect, int.vect = int.vect,
                  mydf = mydf, mymat = mymat)
print(mylist)

$char.vect
[1] "a" "b"

$int.vect
[1] 1 2 3

$mydf
  x  y
1 1  4
2 2  5
3 3  6
4 4 NA

$mymat
       x   y
row1 100 200
row2 101 202
row3 102 204


Elements can be accessed by their name or index.

mylist$mymat["row2", "y"]
[1] 202
mylist$mymat[2, 2]
[1] 202
mylist[[4]][2, 2]
[1] 202


Double square brackets are needed to extract the element itself; single square brackets return a list of length one that still wraps it. Hence, mylist[4][2, 2] does not work:
Error in mylist[4][2, 2] : incorrect number of dimensions
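
The difference can be sketched as follows, assuming a cut-down version of mylist with only two elements (so the matrix sits at position 2 instead of 4). Single brackets return a list, whereas double brackets or $ return the element itself.

```r
mymat  <- matrix(data = c(100:102, 2 * 100:102), nrow = 3, ncol = 2,
                 dimnames = list(paste0("row", 1:3), c("x", "y")))
mylist <- list(char.vect = letters[1:2], mymat = mymat)

print(class(mylist[2]))          # single brackets: still a list
print(class(mylist[[2]]))        # double brackets: the matrix itself
print(is.null(dim(mylist[2])))   # a list has no dimensions to index by [2, 2]

print(mylist[["mymat"]][2, 2])   # by name, equivalent to mylist$mymat[2, 2]
print(mylist[2][[1]][2, 2])      # unwrap the one-element list first, then index

# The failing call, captured instead of stopping the script
res <- try(mylist[2][2, 2], silent = TRUE)
print(inherits(res, "try-error"))
```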

Dif-tor heh smusma 🖖🏼 Long live Ukraine!
Helmut Schütz