r, binning, categorization, bins

Categorize a numeric variable into groups/bins/breaks


I am trying to categorize a numeric variable (age) into groups defined by intervals so it will not be continuous. I have this code:

data$agegrp(data$age >= 40 & data$age <= 49) <- 3
data$agegrp(data$age >= 30 & data$age <= 39) <- 2
data$agegrp(data$age >= 20 & data$age <= 29) <- 1

The above code is not working under the survival package. It gives me:

invalid function in complex assignment

Can you point out where the error is? data is the data frame I am using.


Solution

  • I would use findInterval() here:

    First, make up some sample data:

    set.seed(1)
    ages <- floor(runif(20, min = 20, max = 50))
    ages
    # [1] 27 31 37 47 26 46 48 39 38 21 26 25 40 31 43 34 41 49 31 43
    

    Use findInterval() to categorize your "ages" vector.

    findInterval(ages, c(20, 30, 40))
    # [1] 1 2 2 3 1 3 3 2 2 1 1 1 3 2 3 2 3 3 2 3
    

    Alternatively, as recommended in the comments, cut() is also useful here:

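    # Returns a factor with levels [20,30) [30,40) [40,50)
    cut(ages, breaks = c(20, 30, 40, 50), right = FALSE)
    # With labels = FALSE, returns the integer bin codes (1, 2, 3) instead
    cut(ages, breaks = c(20, 30, 40, 50), right = FALSE, labels = FALSE)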
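
As for the error itself: the original code indexes with parentheses, so R reads data$agegrp(...) <- 3 as a call to a replacement function and stops with "invalid function in complex assignment". Subsetting needs square brackets. Below is a minimal sketch of the corrected assignment next to the findInterval() and cut() versions. The small data frame and the column names agegrp2 and agegrp3 are made up for illustration, and the "20-29" style labels are an assumption about how you might want the groups named.

```r
# Hypothetical stand-in for the asker's data frame
data <- data.frame(age = c(27, 31, 37, 47, 26, 46))

# Corrected version of the original code: index with [ ], not ( )
data$agegrp <- NA
data$agegrp[data$age >= 20 & data$age <= 29] <- 1
data$agegrp[data$age >= 30 & data$age <= 39] <- 2
data$agegrp[data$age >= 40 & data$age <= 49] <- 3

# Same grouping in one step with findInterval()
data$agegrp2 <- findInterval(data$age, c(20, 30, 40))

# Or as a labelled factor with cut(); the labels are illustrative
data$agegrp3 <- cut(data$age, breaks = c(20, 30, 40, 50), right = FALSE,
                    labels = c("20-29", "30-39", "40-49"))

data
```

One difference to keep in mind: findInterval() returns 0 for ages below 20 and 3 for any age of 40 or more, while cut() with these breaks returns NA for ages outside [20, 50).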