I'm curious what the easiest way to label a data.table frequency table is. For example, say I have a data.table (dt1) with a column (animals) containing c(1,1,2,2,2,2,3,3,3,3,3,3)
and I use:
dt2 <- dt1[, .N ,by = animals]
to get a frequency table (dt2):
   animals N
1:       1 2
2:       2 4
3:       3 6
What's the most elegant way to label (rename) the animals column if the codes map as follows:
1 = turtle
2 = horse
3 = cat
4 = dog
This is easy to do by reference if there is no 4/dog:
dt2[animals == c(1:3), animalnames := c("turtle", "horse", "cat")]
However, this presents two issues:
dt2[animals == c(1:4), animalnames := c("turtle", "horse", "cat", "dog")]
Error in .prepareFastSubset(isub = isub, x = x, enclos = parent.frame(), :
RHS of == is length 4 which is not 1 or nrow (3). For robustness, no recycling is allowed (other than of length 1 RHS). Consider %in% instead.
What's the simplest solution if you:
a) want to relabel the "animals" column, and
b) want to use a single list of possible labels, some of which may not occur in a given sample (i.e. one sample might contain dog and another might not contain cat, but I want to use the same code for all samples)?
Thanks!
You could use fcase:
dt2[, animals := fcase(animals == 1, 'turtle',
                       animals == 2, 'horse',
                       animals == 3, 'cat',
                       animals == 4, 'dog',
                       rep(TRUE, .N), 'unknown')]  # catch-all for any other code
dt2
#   animals     N
#    <char> <int>
#1:  turtle     2
#2:   horse     4
#3:     cat     6
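
If the list of labels grows, another option is to keep the code-to-label mapping in a small lookup table and attach the labels with an update join. This is only a sketch under some assumptions: the table name `lookup` is made up for illustration, and the labels go into a new `animalnames` column (as in the question's attempt) rather than overwriting `animals`. Codes listed in `lookup` but absent from the sample, such as 4/dog here, are simply ignored, so the same code should work across samples.

```r
library(data.table)

# Sample data from the question
dt1 <- data.table(animals = c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3))
dt2 <- dt1[, .N, by = animals]

# Hypothetical lookup table holding every possible code, present or not
lookup <- data.table(
  animals     = c(1, 2, 3, 4),
  animalnames = c("turtle", "horse", "cat", "dog")
)

# Update join: match on the code and add the label by reference.
# Rows of lookup with no match in dt2 (here 4/dog) change nothing.
dt2[lookup, on = "animals", animalnames := i.animalnames]
dt2
```

Any code in dt2 that has no entry in `lookup` would be left as NA in `animalnames`, which plays the role of the 'unknown' fallback in the fcase version.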