
Using an IF statement to create an ordinal variable


I'm new to R and have a question concerning a project of mine.

I have a variable, Age.Range, from an imported dataset (od) about overdoses. The variable Age.Range contains these levels:

15-19, 20-24, 25-29, 30-39, 40-49, 50-59, 60-69, 70-79

I want to create a new ordinal variable based on Age.Range, such that 15-19 is represented as 1, 20-24 as 2, 25-29 as 3, and so on.

In SAS my code would look like this:

if Age.Range="15-19" then AgeOrdinal=1;
else if Age.Range="20-24" then AgeOrdinal=2;
else if Age.Range="25-29" then AgeOrdinal=3;
else if Age.Range="30-39" then AgeOrdinal=4;
else if Age.Range="40-49" then AgeOrdinal=5;
else if Age.Range="50-59" then AgeOrdinal=6;

etc.

Can I do a similar thing in R? If so, how? Thanks!

P.S., I know how to create a dummy variable like

od$SurviveYes <- ifelse(od$Survive=="Y", 1, 0)

But I would like to have a variable with more than two levels.

So far, this is my poor attempt:

# Map each age-range label to its ordinal position
age <- function(age_range) {
  sapply(as.character(age_range), function(x) {
    if (x == "15-19") 1
    else if (x == "20-24") 2
    else if (x == "25-29") 3
    else if (x == "30-39") 4
    else if (x == "40-49") 5
    else if (x == "50-59") 6
    else if (x == "60-69") 7
    else if (x == "70-79") 8
    else NA  # label not in the list above
  }, USE.NAMES = FALSE)
}
od$AgeOrdinal <- age(od$Age.Range)

Thank you in advance!


Solution

  • Is this what you're looking for?

    # create a mock of your data
    x <- c("15-19", "20-24", "25-29", "30-39", "40-49", "50-59", "60-69", "70-79")
    od <- data.frame(Age.Range = sample(x, 100, replace = TRUE))
    
    
    # create ageordinal
    od$AgeOrdinal <- as.integer(factor(od$Age.Range))
    
    od
    

    Note that this works only because factor() sorts the levels as character strings (see levels(factor(od$Age.Range))), and for these labels that order happens to match the age order.
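
    To see the ordering that factor() picks by default, you can inspect the levels directly. This is a minimal sketch; labels_shuffled, f and codes are illustrative names, not part of the original data.

    ```r
    # The eight age-range labels, deliberately given in reverse order
    labels_shuffled <- c("70-79", "60-69", "50-59", "40-49",
                         "30-39", "25-29", "20-24", "15-19")

    # With no levels argument, factor() sorts the unique values as strings
    f <- factor(labels_shuffled)
    levels(f)

    # as.integer() returns each value's position within levels(f)
    codes <- as.integer(f)
    codes
    ```

    Even though the input is reversed, the levels come out in ascending age order, so "70-79" is coded 8 and "15-19" is coded 1.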

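    If you add a new level like 9-14, this will not work as expected, because "9-14" sorts after "70-79" as a string and would get the highest code instead of the lowest. In that case you need to set the levels explicitly, like this: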

    # create a mock of your data
    x <- c("9-14", "15-19", "20-24", "25-29", "30-39", "40-49", "50-59", "60-69", "70-79")
    od <- data.frame(Age.Range = sample(x, 100, replace = TRUE))
    
    # create ageordinal
    od$AgeOrdinal <- as.integer(factor(od$Age.Range, levels = x, ordered = TRUE))
    
    od
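
    The effect of the levels argument can be checked on a small vector. The sketch below also shows match() as a possible alternative that skips the factor step; that part is a suggestion under the assumption that you only need the integer codes, not part of the approach above. ages is a made-up example vector.

    ```r
    # Age-range labels in the intended order
    x <- c("9-14", "15-19", "20-24", "25-29", "30-39",
           "40-49", "50-59", "60-69", "70-79")
    ages <- c("9-14", "15-19", "70-79")

    # Default levels are sorted as strings, so "9-14" ends up last
    default_codes <- as.integer(factor(ages))

    # Explicit levels keep the intended order
    explicit_codes <- as.integer(factor(ages, levels = x))

    # match() returns the position of each value in x (NA if it is absent)
    match_codes <- match(ages, x)
    ```

    Here default_codes is 3 1 2, while explicit_codes and match_codes are both 1 2 9.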
    

    PS: in R versions before 4.0.0, data.frame() converted character columns into factors by default (stringsAsFactors = TRUE); since R 4.0.0 the default is FALSE and character columns stay character. Calling factor() explicitly, as in the first example, works either way. In the second example you have to call factor() regardless, since you need to change the order of the levels.
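
    To check how your own R installation treats character columns, and to make the behaviour explicit regardless of version, something like the following sketch may help. The names od_default, od_fct and od_chr are illustrative only.

    ```r
    x <- c("15-19", "20-24", "25-29")

    # Default behaviour: the resulting class depends on your R version
    od_default <- data.frame(Age.Range = x)
    class(od_default$Age.Range)

    # Explicitly request factors
    od_fct <- data.frame(Age.Range = x, stringsAsFactors = TRUE)
    class(od_fct$Age.Range)

    # Explicitly keep character strings
    od_chr <- data.frame(Age.Range = x, stringsAsFactors = FALSE)
    class(od_chr$Age.Range)
    ```

    Setting stringsAsFactors yourself avoids relying on the default of whichever R version runs the script.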