r dplyr group-by

Selecting a group of rows if one of them has a desired attribute


I have a data frame like this:

data <- data.frame(family.id=rep(1:6, each=5), 
                   member.id=rep(1:5, times=6), 
                   attribute=c(1, 1, 2, 3, 4, 1, 2, 2, 3, 5, 1, 1, 3, 5, 5, 2, 
                               3, 3, 4, 5, 1, 1, 4, 4, 5, 3, 3, 4, 5, 5))

I want to select all members of a family if at least one of them has an attribute equal to 4. How can I do this?

I prefer to use the "group_by" command from dplyr.


Solution

  • library(tidyverse)
    
    # Per-operation grouping with .by (requires dplyr >= 1.1.0)
    data %>% 
      filter(any(attribute == 4), .by = family.id)
    

    or

    data %>% 
      group_by(family.id) %>% 
      filter(any(attribute == 4)) %>% 
      ungroup()
    
       family.id member.id attribute
    1          1         1         1
    2          1         2         1
    3          1         3         2
    4          1         4         3
    5          1         5         4
    6          4         1         2
    7          4         2         3
    8          4         3         3
    9          4         4         4
    10         4         5         5
    11         5         1         1
    12         5         2         1
    13         5         3         4
    14         5         4         4
    15         5         5         5
    16         6         1         3
    17         6         2         3
    18         6         3         4
    19         6         4         5
    20         6         5         5
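
    As a cross-check, the same selection can be sketched in base R without dplyr. This is only an illustration, not part of the answer above: the names `keep` and `result` are made up for this example. It first collects the ids of families that contain at least one attribute equal to 4, then keeps every row belonging to those families.

    ```r
    # Rebuild the example data from the question
    data <- data.frame(family.id = rep(1:6, each = 5),
                       member.id = rep(1:5, times = 6),
                       attribute = c(1, 1, 2, 3, 4, 1, 2, 2, 3, 5, 1, 1, 3, 5, 5, 2,
                                     3, 3, 4, 5, 1, 1, 4, 4, 5, 3, 3, 4, 5, 5))

    # Ids of families with at least one member whose attribute is 4
    # (`keep` is a hypothetical helper name)
    keep <- unique(data$family.id[data$attribute == 4])

    # All rows of those families, not only the rows where attribute == 4
    result <- data[data$family.id %in% keep, ]
    ```

    Here `result` should contain the same rows as the dplyr versions (families 1, 4, 5 and 6), although it keeps the original row names instead of renumbering them.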