
Use of Tilde (~) and period (.) in R


I'm going over looping with tidyverse and purrr using Hadley's R4DS book and am a little confused as to the exact usage of the tilde ~ symbol and period symbol.

When writing for loops or using map(), it appears you can use the tilde symbol ~ instead of writing out function().

Does this only apply to for loops?

For example:

models <- mtcars %>% 
  split(.$cyl) %>% 
  map(~lm(mpg ~ wt, data = .))

Additionally, I was told the period can be used "to refer to the current list element", but I am confused about what that means. Does it mean that, only when looping, the period refers to the element of the list currently being looped over? How is that different from piping? When you pipe, you are passing the result of one line to the next line of code.

So in the case above, mtcars is piped to the second line with split() but a period is used. Why?

The case below sums up my confusion:

x <- c(1:10)

detect(x, ~.x > 5)

Using the detect function, which finds the first match, I thought I could just use

detect(x, x >5)

but I get an error saying x > 5 is not a function. So I add a tilde

detect(x, ~ x > 5)

and get an error saying it expects a single TRUE or FALSE, not 10. So if you add a period

detect(x, ~.x >5) 

suddenly it works, looping over the elements. So what is the relation between ~ and . here, and how does . compare to simple piping?


Solution

  • This is usually filed under tidyverse non-standard evaluation (NSE); in purrr specifically, a one-sided formula is shorthand for an anonymous function. As you probably found out, ~ is also used in model formulas, where it indicates that the left-hand side depends on the right-hand side.

    In this shorthand, ~ stands in for function(...), so these two expressions are equivalent:

    x %>% detect(function(...) ..1 > 5)
    #[1] 6
    
    x %>% detect(~.x > 5)
    #[1] 6
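
    To see why detect(x, ~ x > 5) fails while detect(x, ~ .x > 5) works, it can help to build the function by hand. This is a minimal sketch, assuming purrr is installed; as_mapper() is the purrr helper that turns a formula into a function, and the names f and g are only illustrative:

    ```r
    library(purrr)

    x <- 1:10

    # .x is the lambda's first argument, so f tests one element at a time
    f <- as_mapper(~ .x > 5)
    f(3)
    #[1] FALSE
    f(7)
    #[1] TRUE

    # x is NOT an argument here; it is looked up outside the lambda,
    # so g ignores its input and compares the whole vector x
    g <- as_mapper(~ x > 5)
    length(g(3))
    #[1] 10
    ```

    Because g never touches its argument, detect() receives ten TRUE/FALSE values per element instead of one, which matches the error in the question.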
    

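    ~ automatically makes the arguments of the function available under special symbols: . and .x for the first argument, .y for the second, and ..1, ..2, ..3, and so on by position. Note that only the first argument becomes . (so in the last example below, .[2] indexes past the end of the length-one first argument and gives NA).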

    map2(1, 2, function(x,y) x + y)
    #[[1]]
    #[1] 3
    
    map2(1, 2, ~.x + .y)
    #[[1]]
    #[1] 3
    
    map2(1, 2, ~..1 + ..2)
    #[[1]]
    #[1] 3
    
    map2(1, 2, ~. + ..2)
    #[[1]]
    #[1] 3
    
    map2(1, 2, ~. + .[2])
    #[[1]]
    #[1] NA
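
    This also helps separate the two meanings of . in the pipeline from the question. The sketch below assumes purrr and magrittr are installed; same is a hypothetical name for the version written without any shorthand:

    ```r
    library(purrr)
    library(magrittr)

    models <- mtcars %>%
      split(.$cyl) %>%               # pipe placeholder: . is the whole mtcars data frame
      map(~ lm(mpg ~ wt, data = .))  # lambda argument: . is the current list element;
                                     # the inner mpg ~ wt is an ordinary model formula

    # The same computation with no pipe and no ~ shorthand
    same <- map(split(mtcars, mtcars$cyl), function(df) lm(mpg ~ wt, data = df))

    all.equal(coef(models[["4"]]), coef(same[["4"]]))
    #[1] TRUE
    ```

    So the pipe's . stands for whatever was piped in, once per step, while the lambda's . is rebound on every iteration of map().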
    

    This automatic assignment can be very helpful when there are many variables.

    mtcars %>% pmap_dbl(~ ..1/..4)
    # [1] 0.19090909 0.19090909 0.24516129 0.19454545 0.10685714 0.17238095 0.05836735 0.39354839 0.24000000 0.15609756
    #[11] 0.14471545 0.09111111 0.09611111 0.08444444 0.05073171 0.04837209 0.06391304 0.49090909 0.58461538 0.52153846
    #[21] 0.22164948 0.10333333 0.10133333 0.05428571 0.10971429 0.41363636 0.28571429 0.26902655 0.05984848 0.11257143
    #[31] 0.04477612 0.19633028
    

    In addition to the special symbols noted above, the arguments are also passed through .... As everywhere in R, list(...) captures them as a named list, so you can combine it with with():

    mtcars %>% pmap_dbl(~ with(list(...), mpg/hp))
    # [1] 0.19090909 0.19090909 0.24516129 0.19454545 0.10685714 0.17238095 0.05836735 0.39354839 0.24000000 0.15609756
    #[11] 0.14471545 0.09111111 0.09611111 0.08444444 0.05073171 0.04837209 0.06391304 0.49090909 0.58461538 0.52153846
    #[21] 0.22164948 0.10333333 0.10133333 0.05428571 0.10971429 0.41363636 0.28571429 0.26902655 0.05984848 0.11257143
    #[31] 0.04477612 0.19633028
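
    If the positional symbols or with(list(...), ...) feel opaque, an ordinary function with named arguments does the same job, because pmap() matches the column names of the data frame to argument names. This is a sketch assuming purrr is installed; ratio_named and ratio_pos are illustrative names:

    ```r
    library(purrr)

    # Name the columns you need; the trailing ... absorbs the remaining columns
    ratio_named <- pmap_dbl(mtcars, function(mpg, hp, ...) mpg / hp)

    # Positional shorthand from above: mpg is column 1, hp is column 4
    ratio_pos <- pmap_dbl(mtcars, ~ ..1 / ..4)

    all.equal(ratio_named, ratio_pos)
    #[1] TRUE
    ```

    The named version keeps working if the columns are reordered, which the positional ..1/..4 form does not.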
    

    Another way to see why this works is that a data.frame is just a list of columns with a row.names attribute and a class:

    a <- list(a = c(1,2), b = c("A","B"))
    a
    #$a
    #[1] 1 2
    #$b
    #[1] "A" "B"
    attr(a,"row.names") <- as.character(c(1,2))
    class(a) <- "data.frame"
    a
    #  a b
    #1 1 A
    #2 2 B
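
    As a quick check of the same idea on a real data frame, the sketch below (assuming purrr is installed; col_means and row_ratio are illustrative names) shows map() walking over the columns of the list and pmap() walking over the columns in parallel, which amounts to going row by row:

    ```r
    library(purrr)

    is.list(mtcars)
    #[1] TRUE

    # map_* iterates over list elements, i.e. over columns
    col_means <- map_dbl(mtcars, mean)
    length(col_means)
    #[1] 11

    # pmap_* iterates over all columns in parallel, i.e. over rows
    row_ratio <- pmap_dbl(mtcars, ~ ..1 / ..4)
    length(row_ratio)
    #[1] 32
    ```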