Say that I have these data:
library(dplyr)
df1 <- data.frame(x = c(1, 2, 3, 4), z = c("A", "A", "B", "B"))
df2 <- data.frame(x = c(2, 4, 6, 8), z = c("A", "A", "B", "C"))
I can easily check if each element of x in df1 is present in x of df2:
df1 <- df1 %>% mutate(present = x %in% df2$x)
Is there an easy way to do the same thing (preferably in the tidyverse), but to only check within group?
In other words, for an observation in df1 to have present be TRUE, two things must be true: 1) the group (z) in df2 must be the same as the group in df1 and 2) the value of x in df2 must be the same as the value in df1.
So, only the second observation (2) would be TRUE because there exists an observation in df2 with an x of 2 and a z of A. The last observation (4) would be FALSE because, even though df2 contains an x of 4, that observation is in group A, not B.
This works on your example data, though it seems inelegant.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df1 <- data.frame(x = c(1, 2, 3, 4), z = c("A", "A", "B", "B"))
df2 <- data.frame(x = c(2, 4, 6, 8), z = c("A", "A", "B", "C"))
df1 |> rowwise() |> mutate(present = x %in% df2[df2$z == z, "x"])
#> # A tibble: 4 × 3
#> # Rowwise:
#>       x z     present
#>   <dbl> <chr> <lgl>
#> 1     1 A     FALSE
#> 2     2 A     TRUE
#> 3     3 B     FALSE
#> 4     4 B     FALSE
Created on 2024-11-30 with reprex v2.1.1
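If the rowwise version feels clunky, a join might read more naturally. The following is only a sketch under the assumption that "present" means the exact (x, z) pair occurs somewhere in df2. The names lookup and result are mine, not from the question.

```r
library(dplyr)

df1 <- data.frame(x = c(1, 2, 3, 4), z = c("A", "A", "B", "B"))
df2 <- data.frame(x = c(2, 4, 6, 8), z = c("A", "A", "B", "C"))

# Hypothetical helper table: the unique (x, z) pairs in df2, each flagged TRUE.
# distinct() guards against duplicate pairs adding rows to df1 in the join.
lookup <- df2 |>
  distinct(x, z) |>
  mutate(present = TRUE)

# Rows of df1 with no matching pair come back as NA; turn those into FALSE.
result <- df1 |>
  left_join(lookup, by = c("x", "z")) |>
  mutate(present = coalesce(present, FALSE))

result
```

On the example data this should flag only the second row, the same as the rowwise approach, and it avoids subsetting df2 once per row of df1.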