
How to assign levels to a factor variable


I have a data set in R with a column yr_renovated that holds either 0 or an integer (e.g. 1998) for the year the house was renovated. How would I create a factor variable with the levels yes and no to indicate whether the house was renovated or not?

head(House_Data$yr_renovated,n=20)
[1]    0    0    0    0    0    0    0    0    0    0    0    0 1998    0    0    0    0    0    0 

I was thinking of something along the lines of

levels(renovated)[levels(renovated) <= 0] <- "no"
levels(renovated)[levels(renovated) > 0] <- "yes"

but I saw this used online and I don't know exactly how it works. I also realised that if I make a mistake when assigning the levels, say

levels(renovated)[levels(renovated) <= 0] <- "yes"
levels(renovated)[levels(renovated) > 0] <- "yes"
levels(renovated)[levels(renovated) <= 0] <- "no"

the last assignment will not override the first one, and my only level will be yes. How would I remove that first wrongly assigned level?

no  no  no  no  no  no  no  no  no  no  no  no  yes no  no  no  no  no  no  no 
Levels: no yes

This is what the final answer should look like, or, if using table():

renovated
  no  yes 
5762  238 

but sometimes it gives me this result

renovated
 Yes 
6000 

Excuse my rookie knowledge of R; we haven't worked much with R in our statistics module at university so far.


Solution

  • You can do it using factor() and assigning the desired labels:

    yr_renovated <- c(0, 0, 1998, 0, 2010, 0)
    
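    # The levels of a logical vector sort as FALSE, TRUE, so the labels are
    # given in that order: FALSE (year is not 0) -> "Yes", TRUE (year is 0) -> "No".
    renovated <- factor(yr_renovated == 0, labels=c("Yes", "No"))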
    table(renovated)
    
    #> renovated
    #> Yes  No 
    #>   2   4
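
  • A more explicit variant, offered only as a sketch and not as part of the answer above, is to build the labels with ifelse() and fix the level order yourself. This gives the lowercase no/yes levels the question asks for. The name broken is hypothetical and only illustrates the mistake described in the question: once two levels have been renamed to the same label they are merged, and the factor no longer knows which rows were 0, so the practical fix is to rebuild it from the original column.

```r
yr_renovated <- c(0, 0, 1998, 0, 2010, 0)

# Explicit mapping: a year greater than 0 means the house was renovated.
renovated <- factor(ifelse(yr_renovated > 0, "yes", "no"),
                    levels = c("no", "yes"))
table(renovated)
#> renovated
#>  no yes 
#>   4   2 

# Renaming both levels to "yes" merges them into a single level.
# The information about which rows were 0 is lost from this factor.
broken <- renovated
levels(broken) <- c("yes", "yes")
nlevels(broken)
#> [1] 1

# Recover by rebuilding from the original column, not from the broken factor.
fixed <- factor(ifelse(yr_renovated > 0, "yes", "no"),
                levels = c("no", "yes"))
nlevels(fixed)
#> [1] 2
```

    With the data frame from the question, the same call would presumably use House_Data$yr_renovated in place of the example vector.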