I'm new to R and have a question concerning a project of mine.
I have a variable, Age.Range, from an imported dataset (od) about overdoses. Age.Range has these levels:
15-19, 20-24, 25-29, 30-39, 40-49, 50-59, 60-69, 70-79
I want to create a new ordinal variable representing Age.Range, such that 15-19 is coded as 1, 20-24 as 2, 25-29 as 3, and so on.
In SAS my code would look like this:
if Age.Range="15-19" then AgeOrdinal=1;
else if Age.Range="20-24" then AgeOrdinal=2;
else if Age.Range="25-29" then AgeOrdinal=3;
else if Age.Range="30-39" then AgeOrdinal=4;
else if Age.Range="40-49" then AgeOrdinal=5;
else if Age.Range="50-59" then AgeOrdinal=6;
etc.
Can I do a similar thing in R? If so, how? Thanks!
P.S. I know how to create a dummy variable like this:
od$SurviveYes <- ifelse(od$Survive=="Y", 1, 0)
But I would like to have a variable with more than two levels.
So far, this is my poor attempt:
> od$AgeOrdinal <- c()
> age <- function(od$Age.Range){
> sapply(od$Age.Range, function(x) if(x == "15-19") 1
+ else if (x == "20-24") 2
+ else if (x == "25-29") 3
+ else if (x == "30-39") 4
+ else if (x == "40-49") 5
+ else if (x == "50-59") 6
+ else if (x == "60-69") 7
+ else (x == "70-79") 8
> }
Thank you in advance!
Is this what you're looking for?
# create a mock of your data
x <- c("15-19", "20-24", "25-29", "30-39", "40-49", "50-59", "60-69", "70-79")
od <- data.frame(Age.Range = sample(x, 100, replace = TRUE))
# create ageordinal
od$AgeOrdinal <- as.integer(factor(od$Age.Range))
od
Note that this works only because factor() sorts its levels as character strings (see levels(factor(od$Age.Range))), and for these particular labels that order happens to match the age order.
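To see which level order R actually chose, you can inspect the levels directly. This is a minimal sketch; the names ages and f are made up for illustration and are not part of your data:

```r
# A small unsorted sample of age bands
ages <- c("30-39", "15-19", "70-79", "20-24")

f <- factor(ages)

# Levels are sorted as character strings
levels(f)       # "15-19" "20-24" "30-39" "70-79"

# The integer code is the position of each value within levels(f)
as.integer(f)   # 3 1 4 2
```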
If you add a new level like 9-14, this will not work as expected, because "9-14" sorts after "70-79" when compared as a string. In that case you need to pass the levels explicitly, in the order you want, like this:
# create a mock of your data
x <- c("9-14", "15-19", "20-24", "25-29", "30-39", "40-49", "50-59", "60-69", "70-79")
od <- data.frame(Age.Range = sample(x, 100, replace = TRUE))
# create ageordinal
od$AgeOrdinal <- as.integer(factor(od$Age.Range, levels = x, ordered = TRUE))
od
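To make the difference concrete, here is a small sketch comparing the default level order with an explicit one. The vector name bands is hypothetical and only used for this illustration:

```r
# Three age bands, already in the intended order
bands <- c("9-14", "15-19", "20-24")

# Default: levels sorted as strings, so "9-14" ends up last
levels(factor(bands))                       # "15-19" "20-24" "9-14"
default_codes <- as.integer(factor(bands))  # 3 1 2

# Explicit levels: codes follow the order you supplied
explicit_codes <- as.integer(factor(bands, levels = bands))  # 1 2 3
```

With the default order, the youngest group would get the highest code, which is why the levels argument matters here.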
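PS: in R versions before 4.0.0, data.frame() converts every character column into a factor by default (stringsAsFactors = TRUE). So technically, in the first example you didn't need to call factor() there. Since R 4.0.0 the default is stringsAsFactors = FALSE, so the column stays character and the call to factor() is required. In the second example you have to call factor() either way, since you need to change the order of the levels.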
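If you would rather not depend on factors at all, one alternative (a different technique from the code above) is to look up each value's position in a vector of levels with match(). This is only a sketch; age_levels and the four-row od below are made up for illustration:

```r
# Intended order of the age bands
age_levels <- c("9-14", "15-19", "20-24", "25-29", "30-39",
                "40-49", "50-59", "60-69", "70-79")

# Tiny mock dataset
od <- data.frame(Age.Range = c("20-24", "9-14", "70-79", "15-19"))

# match() returns the position of each value in age_levels
od$AgeOrdinal <- match(od$Age.Range, age_levels)
od$AgeOrdinal   # 3 1 9 2
```

Any value that does not appear in age_levels comes back as NA, which makes typos in the data easy to spot.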