In an answer to another question, @Marek posted the following solution: https://stackoverflow.com/a/10432263/636656
dat <- structure(list(product = c(11L, 11L, 9L, 9L, 6L, 1L, 11L, 5L,
7L, 11L, 5L, 11L, 4L, 3L, 10L, 7L, 10L, 5L, 9L, 8L)), .Names = "product", row.names = c(NA, -20L), class = "data.frame")
`levels<-`(
  factor(dat$product),
  list(Tylenol=1:3, Advil=4:6, Bayer=7:9, Generic=10:12)
)
Which produces as output:
[1] Generic Generic Bayer Bayer Advil Tylenol Generic Advil Bayer Generic Advil Generic Advil Tylenol
[15] Generic Bayer Generic Advil Bayer Bayer
This is just the printout of a factor, so to store it you can do the even more confusing:
res <- `levels<-`(
  factor(dat$product),
  list(Tylenol=1:3, Advil=4:6, Bayer=7:9, Generic=10:12)
)
Clearly this is some kind of call to the levels function, but I have no idea what's being done here. What is the term for this kind of sorcery, and how do I increase my magical ability in this domain?
The answers here are good, but they are missing an important point. Let me try and describe it.
R is a functional language and does not like to mutate its objects. But it does allow assignment statements, using replacement functions:
levels(x) <- y
is equivalent to
x <- `levels<-`(x, y)
The trick is, this rewriting is done by `<-`; it is not done by `levels<-`. `levels<-` is just a regular function that takes an input and gives an output; it does not mutate anything.
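As a quick check of that claim, here is a minimal sketch (the factor `x` and the new level names are made up for illustration). It calls the replacement function directly, confirms the original object is unchanged, and compares the result with the ordinary assignment form:

```r
x <- factor(c("a", "b", "a"))

# Calling the replacement function directly returns a new factor ...
y <- `levels<-`(x, c("A", "B"))

# ... and leaves x itself untouched.
levels(x)  # "a" "b"
levels(y)  # "A" "B"

# The assignment form is what actually rebinds the variable.
z <- x
levels(z) <- c("A", "B")
identical(y, z)  # TRUE
```

Only the `<-` in the last part changes what a name refers to; the function call on its own just computes a value.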
One consequence of that is that, according to the above rule, `<-` must be recursive:
levels(x)[1] <- "a"
is
levels(x) <- `[<-`(levels(x), 1, "a")
is
x <- `levels<-`(x, `[<-`(levels(x), 1, "a"))
It's kind of beautiful that this pure-functional transformation (up until the very end, where the assignment happens) is equivalent to what an assignment would be in an imperative language. This construct in functional languages is called a lens. Lenses can be awkward to use in some programming languages, but in R they just work.
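The nested rewrite can be verified by hand. In this small sketch (the factor and the replacement label are illustrative assumptions), `a` is modified with the sugared form and `b` with the fully expanded form, and the two end up identical:

```r
x <- factor(c("lo", "hi", "lo"))
levels(x)  # "hi" "lo" (levels are sorted)

# Sugared form: nested replacement.
a <- x
levels(a)[1] <- "HIGH"

# Hand-expanded form: two plain function calls and one assignment.
b <- x
b <- `levels<-`(b, `[<-`(levels(b), 1, "HIGH"))

identical(a, b)  # TRUE
levels(a)        # "HIGH" "lo"
```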
But then, once you have defined replacement functions like `levels<-`, you get another, unexpected windfall: you don't just have the ability to make assignments, you have a handy function that takes in a factor and gives out another factor with different levels. There's really nothing "assignment" about it!
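The same dual use applies to replacement functions you write yourself. As a sketch, `second<-` below is a hypothetical function invented for this example (it is not part of base R). R only requires that the name ends in `<-` and that the last argument is called `value`:

```r
# Hypothetical replacement function: set the second element of a vector.
`second<-` <- function(x, value) {
  x[2] <- value  # modifies the local copy only
  x
}

v <- c(10, 20, 30)

# Used as an ordinary function: returns a modified copy, v is unchanged.
w <- `second<-`(v, 99)

# Used through assignment syntax: R rewrites this as
# v2 <- `second<-`(v2, value = 99)
v2 <- v
second(v2) <- 99

identical(w, v2)  # TRUE
v                 # 10 20 30
```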
So, the code you're describing is just making use of this other interpretation of `levels<-`. I admit that the name `levels<-` is a little confusing because it suggests an assignment, but this is not what is going on. The code is simply setting up a sort of pipeline:
1. Start with dat$product
2. Convert it to a factor
3. Change the levels
4. Store that in res
Personally, I think that line of code is beautiful ;)
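For comparison, here is the pipeline written out one step at a time, next to the one-liner. This is a sketch that assumes a shortened stand-in for `dat` (only the first six product codes from the question), so it runs on its own:

```r
# Shortened stand-in for the question's data (first six product codes).
dat <- data.frame(product = c(11L, 11L, 9L, 9L, 6L, 1L))
groups <- list(Tylenol = 1:3, Advil = 4:6, Bayer = 7:9, Generic = 10:12)

# Step by step, using the assignment form.
res_steps <- factor(dat$product)  # levels "1" "6" "9" "11"
levels(res_steps) <- groups       # merge old levels into named groups

# The one-liner from the question, using the function form.
res <- `levels<-`(factor(dat$product), groups)

identical(res_steps, res)  # TRUE
as.character(res)
# "Generic" "Generic" "Bayer" "Bayer" "Advil" "Tylenol"
```

The step-by-step version needs an intermediate variable to assign into; the function form does not, which is why it can be applied directly to `factor(dat$product)`.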