r replace notin

R - change values if not found in list


I am working with a column of employment data. I want to end up with the following values: Employed, Unemployed, Retired, Self-Employed, Disabled.

I have cleaned up all the different iterations of all the values except for employed. I am trying to craft a statement that would do something along the lines of:

If not in this list "Unemployed | Retired | Self-Employed | Disabled" change value to "Employed".

I have been trying to combine a %notin% operator with the replace() function, but I am missing something. Any help pointing me in the right direction would be greatly appreciated.

UPDATE/EDIT:

I got the code to work based on the suggestion from @Rui Barradas, but while cleaning up and annotating it I broke something, and I can't for the life of me figure out what I am doing wrong. The code below does not throw an error, but it does not change the values to 'Employed' when I verify with table(df7$patient_employment).

`%notin%` <- Negate(`%in%`)
x <- c(df7$patient_employment, "Unemployed", "Retired", "Self-Employed", "Disabled")
x[x %notin% df7$patient_employment] <- "Employed"

RESOLVED:

After some additional help, it was pointed out that I was using x from the example when I should have been using my own data names. I have been working on this for too long. Time to stretch my legs. Thank you @Rui Barradas
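
For anyone hitting the same problem, here is a small sketch of why the snippet above leaves the data frame untouched. The three-row df7 is made-up stand-in data (the real column is not shown); the rest mirrors the code from the edit. The assignment modifies the separate vector x, and the test is reversed, so only the appended status labels that are missing from the column get replaced.

```r
`%notin%` <- Negate(`%in%`)

# Hypothetical stand-in for the real df7
df7 <- data.frame(
  patient_employment = c("Retired", "Full time", "Disabled"),
  stringsAsFactors = FALSE
)

# x is a new vector: the column values followed by the four status labels
x <- c(df7$patient_employment, "Unemployed", "Retired", "Self-Employed", "Disabled")

# This asks "which elements of x are missing from the column?", which is
# only true for the appended labels that never occur in the data
x[x %notin% df7$patient_employment] <- "Employed"

x                       # only the appended "Unemployed" and "Self-Employed" changed
df7$patient_employment  # unchanged, "Full time" is still there
```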


Solution

  • See if the following answers the question.

    `%notin%` <- Negate(`%in%`)
    
    set.seed(2020)
    status <- c("Unemployed", "Retired", "Self-Employed", "Disabled")
    x <- sample(c(status, "Employed", "ABC"), 20, TRUE)
    
    i <- x %notin% status
    x[i]
    #[1] "ABC"      "ABC"      "Employed" "ABC"      "Employed"
    #[6] "Employed"
    
    x[i] <- "Employed"
    x[i]
    #[1] "Employed" "Employed" "Employed" "Employed" "Employed"
    #[6] "Employed"
    

    The code above is simple enough that the logical index vector i is not really needed. It was created only to make the code more readable; the following one-liner is equivalent.

    x[x %notin% status] <- "Employed"
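
    Since the question mentions replace(), here is a minimal sketch of the same idea written with replace() and with ifelse(). The short vector x is made-up example data. Note that replace() returns a modified copy, so the result has to be assigned back to take effect.

    ```r
    `%notin%` <- Negate(`%in%`)
    status <- c("Unemployed", "Retired", "Self-Employed", "Disabled")
    x <- c("Retired", "ABC", "Employed", "Disabled")  # made-up example values

    # replace() does not modify x in place; keep the returned vector
    y <- replace(x, x %notin% status, "Employed")

    # Same result with plain %in%, no helper operator needed
    z <- ifelse(x %in% status, x, "Employed")

    identical(y, z)
    ```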
    

    After the OP's comment, instead of x use df7$patient_employment and it should work.

    df7$patient_employment[df7$patient_employment %notin% c("Unemployed", "Retired", "Self-Employed", "Disabled")] <- "Employed"
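
    As a sanity check, the same line can be tried on a small data frame first. This is only a sketch: the toy df7 below is invented, since the real data is not shown. Two things to watch for, assuming they apply to your data: NA counts as "not in the list" and would also become "Employed" unless excluded, and if the column is a factor, "Employed" must already be one of its levels (or convert with as.character() first).

    ```r
    `%notin%` <- Negate(`%in%`)
    status <- c("Unemployed", "Retired", "Self-Employed", "Disabled")

    # Hypothetical stand-in for the real df7
    df7 <- data.frame(
      patient_employment = c("Retired", "Full time", NA, "Disabled", "Part-time"),
      stringsAsFactors = FALSE  # keep the column as character, not factor
    )

    # NA %notin% status is TRUE, so exclude NAs explicitly
    # if missing values should stay missing
    i <- !is.na(df7$patient_employment) & df7$patient_employment %notin% status
    df7$patient_employment[i] <- "Employed"

    table(df7$patient_employment, useNA = "ifany")
    ```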