I would like to add a new column to my dataframe based on two other columns. The data looks as follows:
df
job   honorary
yes   yes
yes   no
no    yes
yes   yes
yes   NA
NA    no
Now I would like a third column that contains "both" if job and honorary are both "yes", "honorary" if only honorary is "yes", "job" if only job is "yes", and NA if both columns are NA or if one column is NA and the other is "no". The third column should look like this:
result
both
job
honorary
both
job
NA
I have tried code with if and mutate, but I am quite new to R and my code does not work at all. If I assign the values one at a time, like:
data_nature_fewmissing$urbandnat[data_nature_fewmissing$nature =="yes" & data_nature_fewmissing$urbangreen =="yes"] <- "yes"
it does not work, because each step overwrites the results of the previous one.
Thanks for your help!
I like case_when from dplyr for these types of complex conditionals.
df <- tibble::tribble(
  ~job,  ~honorary,
  "yes", "yes",
  "yes", "no",
  "no",  "yes",
  "yes", "yes",
  "yes", NA,
  NA,    "no"
)
library(dplyr)
df_new <- df %>%
  mutate(result = case_when(
    # conditions are checked top to bottom; the first TRUE one wins
    job == "yes" & honorary == "yes" ~ "both",
    honorary == "yes" ~ "honorary",
    job == "yes" ~ "job",
    is.na(honorary) & is.na(job) ~ NA_character_,
    is.na(honorary) & job == "no" ~ NA_character_,
    is.na(job) & honorary == "no" ~ NA_character_,
    # everything else (e.g. both "no") ends up here
    TRUE ~ "other"
  ))
df_new
#> # A tibble: 6 × 3
#>   job   honorary result
#>   <chr> <chr>    <chr>
#> 1 yes   yes      both
#> 2 yes   no       job
#> 3 no    yes      honorary
#> 4 yes   yes      both
#> 5 yes   <NA>     job
#> 6 <NA>  no       <NA>
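The order of the conditions matters because a comparison with NA gives NA rather than TRUE, and case_when only uses a condition when it is TRUE. That is why row 5 (job "yes", honorary NA) skips the "both" line and is picked up by the job line. A small base R sketch of that behaviour, with the two columns copied out by hand into plain vectors (the name both_yes is only for illustration):

```r
job      <- c("yes", "yes", "no", "yes", "yes", NA)
honorary <- c("yes", "no", "yes", "yes", NA, "no")

# TRUE & NA is NA (row 5), but NA & FALSE is FALSE (row 6)
both_yes <- job == "yes" & honorary == "yes"
both_yes
#> [1]  TRUE FALSE FALSE  TRUE    NA FALSE
```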
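Or in base R:
df_new <- df
df_new <- within(df_new, {
  # start with NA everywhere, then fill in from least to most specific,
  # so that "both" is assigned last and nothing overwrites it
  result <- rep(NA_character_, length(job))
  result[honorary == "yes"] <- "honorary"
  result[job == "yes"] <- "job"
  result[job == "yes" & honorary == "yes"] <- "both"
})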
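One difference between the two versions, in case it matters for your real data: a row where both columns are "no" (there is none in the example) gets "other" from the case_when version but stays NA in the base R version. Below is a rough sketch to check this without within, using a made-up data frame chk that adds one no/no row to the example. Wrapping each condition in which() is my own choice here so that NA comparisons are simply dropped from the index.

```r
chk <- data.frame(
  job      = c("yes", "yes", "no", "yes", "yes", NA, "no"),
  honorary = c("yes", "no", "yes", "yes", NA, "no", "no")
)

# fill from least to most specific so later steps only refine earlier ones
chk$result <- NA_character_
chk$result[which(chk$honorary == "yes")] <- "honorary"
chk$result[which(chk$job == "yes")] <- "job"
chk$result[which(chk$job == "yes" & chk$honorary == "yes")] <- "both"

chk$result
#> [1] "both"     "job"      "honorary" "both"     "job"      NA         NA
```

This is also the answer to the overwriting problem in the question: assigning step by step works as long as the most specific condition comes last.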
Created on 2022-01-16 by the reprex package (v2.0.1)