I have a dataset of text responses from multiple surveys. The questions used a Likert scale, but the wording of the responses wasn't standardized. For example:
# create df
df <- data.frame(
  id = c('person1', 'person2', 'person3'),
  category = c('I am 0-10 years old', 'I am 11-20 years old', 'I am between 21-30 years old'),
  Q1.do.you.feel.tired.everyday = c('no, never', 'yes, sometimes', 'yes some-times')
)
Question 1: How do I mutate the string 'yes some-times' to 'yes, sometimes'?
Question 2: How can I change the text in my category column? I want to get rid of the word "between", so that 'I am between 21-30 years old' becomes 'I am 21-30 years old'.
I wanted to make the answers to Q1 factors so I used:
df<- mutate(df, across(where(is.character), as.factor))
However, 'yes, sometimes' and 'yes some-times' appear as two different levels. So that column is a factor with 3 levels, rather than 2.
library(dplyr)
df |>
  mutate(
    category = gsub("between ", "", category, fixed = TRUE),
    Q1.do.you.feel.tired.everyday = ifelse(
      Q1.do.you.feel.tired.everyday == "yes some-times",
      "yes, sometimes",
      Q1.do.you.feel.tired.everyday
    ),
    across(where(is.character), factor)
  )
#        id             category Q1.do.you.feel.tired.everyday
# 1 person1  I am 0-10 years old                     no, never
# 2 person2 I am 11-20 years old                yes, sometimes
# 3 person3 I am 21-30 years old                yes, sometimes
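If more spelling variants turn up across surveys, writing one ifelse() per variant gets tedious. A possible alternative, sketched below, is to reduce each response to a normalized key and map that key to a canonical label. This assumes the variants differ only in case, punctuation, hyphens, and spacing; `normalize_key` and `canonical` are hypothetical names introduced here for illustration, not part of dplyr.

```r
library(dplyr)

df <- data.frame(
  id = c("person1", "person2", "person3"),
  category = c("I am 0-10 years old", "I am 11-20 years old",
               "I am between 21-30 years old"),
  Q1.do.you.feel.tired.everyday = c("no, never", "yes, sometimes", "yes some-times")
)

# Hypothetical helper: lower-case the text and drop everything except letters,
# so "yes, sometimes" and "yes some-times" both become "yessometimes".
normalize_key <- function(x) gsub("[^a-z]", "", tolower(x))

# Hypothetical lookup table: normalized key -> canonical label.
canonical <- c(nonever = "no, never", yessometimes = "yes, sometimes")

df_clean <- df |>
  mutate(
    category = sub("between ", "", category, fixed = TRUE),
    # Responses whose key is not in the lookup become NA rather than a new level.
    Q1.do.you.feel.tired.everyday = factor(
      unname(canonical[normalize_key(Q1.do.you.feel.tired.everyday)]),
      levels = unname(canonical)
    )
  )

levels(df_clean$Q1.do.you.feel.tired.everyday)
```

Setting `levels` explicitly also fixes the order of the factor levels, which tends to matter for Likert-type data. Any response that does not match a key in the lookup shows up as NA, which makes unexpected variants easy to spot.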