r, conditional-statements, multiple-columns, addition

Add a new column based on two other columns with several conditions (character values)


I would like to add a new column to my dataframe based on two other columns. The data looks as follows:

df
job    honorary  

yes    yes
yes    no
no     yes
yes    yes
yes    NA
NA     no

Now I would like a third column that contains "both" if job and honorary are both "yes", "honorary" if only honorary is "yes", "job" if only job is "yes", and NA if both columns are NA or if one is NA and the other is "no". The third column should look like this:

result

both
job
honorary
both
job
NA

I have tried code with if and mutate, but I am quite new to R and my code does not work at all. If I assign the values one at a time, like this:

data_nature_fewmissing$urbandnat[data_nature_fewmissing$nature =="yes" & data_nature_fewmissing$urbangreen =="yes"] <- "yes"

This does not work because each step overwrites the results of the previous one.

Thanks for your help!


Solution

  • I like case_when from dplyr for these types of complex conditionals.

    df<-tibble::tribble(
       ~job, ~honorary,
      "yes",     "yes",
      "yes",      "no",
       "no",     "yes",
      "yes",     "yes",
      "yes",        NA,
         NA,      "no"
      )
    
    library(dplyr)
    
    df_new <- df %>%
      mutate(result=case_when(
        job=="yes" & honorary=="yes" ~ "both",
        honorary=="yes" ~ "honorary", 
        job=="yes" ~ "job", 
        is.na(honorary) & is.na(job) ~ NA_character_, 
        is.na(honorary) & job=="no" ~ NA_character_, 
        is.na(job) & honorary=="no" ~ NA_character_, 
        TRUE ~ "other"
      ))
    
    df_new
    #> # A tibble: 6 × 3
    #>   job   honorary result  
    #>   <chr> <chr>    <chr>   
    #> 1 yes   yes      both    
    #> 2 yes   no       job     
    #> 3 no    yes      honorary
    #> 4 yes   yes      both    
    #> 5 yes   <NA>     job     
    #> 6 <NA>  no       <NA>
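
Row 5 ends up as "job" even though honorary is NA. A minimal sketch of why, using a made-up vector x that is not part of the original data: a comparison with NA yields NA, and case_when() treats a condition that evaluates to NA as not matched, so evaluation moves on to the next condition.

```r
library(dplyr)

# Hypothetical illustration vector, not the asker's data
x <- c("yes", "no", NA)

# Comparing with NA gives NA, not FALSE
x == "yes"
#> [1]  TRUE FALSE    NA

# case_when() skips a condition that evaluates to NA,
# so the NA element falls through to the catch-all
case_when(
  x == "yes" ~ "matched",
  TRUE       ~ "fell through"
)
#> [1] "matched"      "fell through" "fell through"
```

This is also why the explicit is.na() branches have to come before the final TRUE branch: without them, rows containing NA would be labelled "other".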
    

    Or in base R:

    
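    df_new <- df

    # Assign from the most general case to the most specific, so that
    # later assignments deliberately overwrite earlier ones.
    df_new <- within(df_new, {
      result <- NA_character_
      result[honorary == "yes"] <- "honorary"
      result[job == "yes"] <- "job"
      result[job == "yes" & honorary == "yes"] <- "both"
    })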
    

    Created on 2022-01-16 by the reprex package (v2.0.1)
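
The base R version above depends on the order of the assignments. As a possible alternative, here is a sketch that does not rely on ordering, assuming the same example data rebuilt as a plain data.frame. It uses %in%, which returns FALSE rather than NA for missing values. The helper names has_job and has_honorary are made up for illustration, and rows where neither column is "yes" are assumed to get NA.

```r
# Same example data as in the question, as a plain data.frame
df <- data.frame(
  job      = c("yes", "yes", "no", "yes", "yes", NA),
  honorary = c("yes", "no", "yes", "yes", NA, "no")
)

# %in% never returns NA: a missing value simply counts as "not yes"
has_job      <- df$job %in% "yes"
has_honorary <- df$honorary %in% "yes"

# Nested ifelse(): the first condition that holds decides the label
df$result <- ifelse(has_job & has_honorary, "both",
              ifelse(has_job, "job",
              ifelse(has_honorary, "honorary", NA_character_)))

df$result
#> [1] "both"     "job"      "honorary" "both"     "job"      NA
```

Because both helper vectors contain only TRUE or FALSE, no separate is.na() branches are needed.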