r, dplyr, tidyverse

Filter data based on an external column using dplyr


I have two data frames loaded in R. The first contains a single column, LC, with 56 unique observations; the second contains many columns, including an LC column, and has 500 observations. For simplicity, I have recreated both data frames below using sample data.

# DF 1
LC <-  c("A", "B", "C")
lc.dc <- data.frame(LC)
lc.dc
  LC
1  A
2  B
3  C

# DF 2
DA <-  c(44, 22, 56,20, 34, 45, 22, 55)
LC <-  c("A","C","W", "Z", "B","H","B","A")
da.df <- data.frame(DA, LC)

da.df
  DA LC
1 44  A
2 22  C
3 56  W
4 20  Z
5 34  B
6 45  H
7 22  B
8 55  A

I'd like to filter da.df so that it keeps only the rows whose LC value appears among the unique values of the LC column in lc.dc. The LC column holds the same kind of observations in both data frames.

The steps I have taken to do this are:

# Convert Unique LC to list
LC.char <- lc.dc$LC
StringLC <- list(LC.char)

# Apply filter using dplyr to fetch a new DF with only the unique LCs
da.df.filtered <- da.df %>% 
  filter(LC %in% c(StringLC))

The result is an empty data frame. How do I perform this filter so that da.df.filtered contains only the rows whose LC value is present in the lc.dc data frame?


Solution

  • Is this what you are looking for?

    > library(dplyr)
    > da.df %>% 
        filter(LC %in% lc.dc$LC)
      DA LC
    1 44  A
    2 22  C
    3 34  B
    4 22  B
    5 55  A
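
A note on why the original attempt came back empty, with a minimal sketch that rebuilds the sample data from the question. The likely cause is the `list()` wrapper: it stores the whole LC vector as a single list element, so `%in%` has only one candidate to match against instead of "A", "B" and "C" separately. The names `filtered_in` and `filtered_join` are made up for this illustration, and `semi_join()` is shown as an alternative approach, not something the question or the answer above used.

```r
library(dplyr)

# Sample data copied from the question
lc.dc <- data.frame(LC = c("A", "B", "C"))
da.df <- data.frame(
  DA = c(44, 22, 56, 20, 34, 45, 22, 55),
  LC = c("A", "C", "W", "Z", "B", "H", "B", "A")
)

# The original attempt: list() wraps the entire vector in ONE list element,
# so no individual LC value can match it.
StringLC <- list(lc.dc$LC)
length(StringLC)             # 1
any(da.df$LC %in% StringLC)  # FALSE

# Option 1: match against the plain vector (unique() is optional here,
# since %in% only tests membership).
filtered_in <- da.df %>%
  filter(LC %in% unique(lc.dc$LC))

# Option 2 (hypothetical alternative): a filtering join keeps the rows of
# da.df that have a match in lc.dc, without adding columns or duplicating rows.
filtered_join <- semi_join(da.df, lc.dc, by = "LC")
```

Both options should return the same five rows as the answer above. The `%in%` form is the shortest when the lookup is a single column; a filtering join may be more convenient if the match later needs to involve several columns.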