Unfortunately, I could not find a suitable simple solution.
Suppose I have a data frame like this:
set.seed(1)
data <- data.frame(ID = 1:20, value = sample(20))
ranges <- c(0, 5, 10, 15)
I want to create a dummy variable for each successive range, as in the following example:
library(dplyr)
data %>% mutate(dummy_0_5 = if_else(value >= ranges[1] & value < ranges[2], 1, 0),
                dummy_5_10 = if_else(value >= ranges[2] & value < ranges[3], 1, 0))
The next dummy should cover ranges[3] to ranges[4], and so on for any further breakpoints.
However, the ranges could be stored in some other form (data frame, list, vector, etc.). The most important point for me is that I want to test different ranges, so I don't want to adjust the column names by hand each time I change them.
Assuming that you need only three dummy groups and want to leave out values >= 15 (they are set to NA), one solution is to use cut to create a dummy factor and then bind the data to the output of table:
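library(dplyr)

set.seed(1)
data <- data.frame(ID = 1:20, value = sample(20))

# Bin value into [0, 5), [5, 10), [10, 15); values >= 15 fall outside and become NA
data <- data %>%
  mutate(dummy = cut(value, breaks = c(0, 5, 10, 15), labels = FALSE, right = FALSE))

# table() returns one row per distinct value, in sorted order, so the data is
# sorted the same way before binding; this alignment relies on every value being unique
data <- cbind(data %>% arrange(value),
              with(data, table(value, dummy)) %>% as.data.frame.matrix()) %>%
  rename(dummy_0_5 = 4, dummy_5_10 = 5, dummy_10_15 = 6)
data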
   ID value dummy dummy_0_5 dummy_5_10 dummy_10_15
1   3     1     1         1          0           0
2   4     2     1         1          0           0
3  10     3     1         1          0           0
4   1     4     1         1          0           0
5  12     5     2         0          1           0
6  15     6     2         0          1           0
7   2     7     2         0          1           0
8  20     8     2         0          1           0
9  13     9     2         0          1           0
10 18    10     3         0          0           1
11  7    11     3         0          0           1
12 17    12     3         0          0           1
13  5    13     3         0          0           1
14  9    14     3         0          0           1
15 16    15    NA         0          0           0
16 14    16    NA         0          0           0
17  8    17    NA         0          0           0
18 11    18    NA         0          0           0
19  6    19    NA         0          0           0
20 19    20    NA         0          0           0
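The column names above are still typed by hand, which is what the question wanted to avoid. One way around that might be to build each name from the bounds themselves. The following is only a minimal base R sketch under a few assumptions: make_range_dummies is a hypothetical helper name (not part of dplyr or base R), intervals are closed on the left and open on the right as in the question, and values outside every range simply get 0 in all dummy columns.

```r
# Hypothetical helper: one 0/1 column per consecutive pair of breakpoints,
# with the column name derived from the bounds (e.g. dummy_0_5).
make_range_dummies <- function(df, ranges, col = "value") {
  lower <- head(ranges, -1)  # left bounds, inclusive
  upper <- tail(ranges, -1)  # right bounds, exclusive
  for (i in seq_along(lower)) {
    name <- paste("dummy", lower[i], upper[i], sep = "_")
    df[[name]] <- as.numeric(df[[col]] >= lower[i] & df[[col]] < upper[i])
  }
  df
}

set.seed(1)
data <- data.frame(ID = 1:20, value = sample(20))

# Changing the breakpoints changes both the number of dummies and their names
result <- make_range_dummies(data, c(0, 5, 10, 15))
names(result)
```

Because the names come from ranges, trying a different set of breakpoints only means passing a different vector, for example make_range_dummies(data, c(0, 10, 20)). Unlike the table approach, this does not require sorting the data or unique values.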