r dplyr filter ends-with

How to keep all columns but only the rows where one column ends with '006'


I'm using dplyr and I want to keep all the columns in the table but return only the rows where one specific column ends with '006'.

select(sample_id, ends_with("006"), everything())

The code above doesn't work. When I run it, it returns all rows (or more than I need -- it's a huge dataset).

I've tried using:

filter(sample_id == ends_with('006')) 

but ends_with() is a selection helper that matches column names, so it can only be used inside a selecting function such as select().


Solution

  • Use str_ends() from the stringr package:

    df %>% filter(str_ends(sample_id, "006"))
    

    By default the pattern is a regular expression. You can match a fixed string with:

    df %>% filter(str_ends(sample_id, fixed("006")))
    

    Of course it's also possible to use a more general regular expression. It's useful if you have a more complex pattern to check, but it also works here:

    df %>% filter(str_detect(sample_id, "006$")) 
    

    See also the stringr reference page for str_starts() and str_ends(): "Detect the presence or absence of a pattern at the beginning or end of a string".
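
To see the approaches above side by side, here is a minimal sketch on a toy data frame. Only the column name sample_id comes from the question; the values and the value column are made up for illustration. The last variant uses base R's endsWith(), which is not part of the answer above but should behave the same way for a plain suffix on a character column.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data: only the column name sample_id comes from the question
df <- tibble(
  sample_id = c("A006", "B0060", "C006", "D106"),
  value     = c(1.5, 2.0, 3.5, 4.0)
)

# stringr helper: keep rows whose sample_id ends with "006"
by_str_ends <- df %>% filter(str_ends(sample_id, "006"))

# Equivalent regular expression anchored at the end of the string
by_regex <- df %>% filter(str_detect(sample_id, "006$"))

# Base R alternative (an addition, not from the answer above)
by_base <- df %>% filter(endsWith(sample_id, "006"))

print(by_str_ends)
```

In each case filter() drops rows and leaves every column in place, so no select() call is needed.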