I have sample data and I want to filter the number of id's
never had sup
status while on type ==N
, meaning I only choose id with status == unsup
while before switching type, and then the number id's who switched from N to P
.
for example id==1
never had status==sup
while on type==N
, So I need to count id 1. Then I want to check this id also whether switching to P
. But id 2 is not eligible for selected because it have sup
status while on type==N
.
id's 2, 5, and id 7 will not be eligible for as they had status == sup
, while on status N
and id 7 was on NA
only while on N
.
data <- data.frame(id=c(1,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,6,6,6,6,6,7,7,7),
type=c('N','N','N', 'N', 'P','P','N','N','N', 'I', 'I','N','N','N',
'N', 'N','N','N','N', 'O', 'O','N','N','N', 'O','N','N','N', 'P', 'P', 'N','N','P'),
status=c(NA,'unsup',NA,'unsup',NA,'sup',NA,NA,'sup',NA,'sup','unsup',NA,'unsup',NA,
'unsup','unsup',NA,'unsup',NA,'sup','sup',NA,NA,'unsup',NA,'unsup','unsup','unsup','sup', NA, NA, 'sup'))
Expected output
1.
id type status
1 1 N <NA>
2 1 N unsup
3 1 N <NA>
4 1 N unsup
5 1 P <NA>
6 1 P sup
7 3 N unsup
8 3 N <NA>
9 3 N unsup
10 3 N <NA>
11 3 N unsup
12 4 N unsup
13 4 N <NA>
14 4 N unsup
15 4 O <NA>
16 4 O sup
17 6 N <NA>
18 6 N unsup
19 6 N unsup
20 6 P unsup
21 6 P sup
Then of which, id's switched to P are:
id type status
1 1 N <NA>
2 1 N unsup
3 1 N <NA>
4 1 N unsup
5 1 P <NA>
6 1 P sup
7 6 N <NA>
8 6 N unsup
9 6 N unsup
10 6 P unsup
11 6 P sup
For the first case, after grouping by 'id', filter
any 'id's not having status
value as 'sup' and type
as 'N' and those ids having any non-NA value for status
where type
is 'N'
library(dplyr)
data1 <- data %>%
group_by(id) %>%
filter((!any((status %in% 'sup' & type == 'N'), na.rm = TRUE))&
any(!is.na(status[type == "N"]))) %>%
ungroup
-output
data1
# A tibble: 21 × 3
id type status
<dbl> <chr> <chr>
1 1 N <NA>
2 1 N unsup
3 1 N <NA>
4 1 N unsup
5 1 P <NA>
6 1 P sup
7 3 N unsup
8 3 N <NA>
9 3 N unsup
10 3 N <NA>
# … with 11 more rows
From the subset data, we can filter
again after checking for any
case where there is a type
value of 'N' and the next value (lead
) is 'P' for each 'id'
data1 %>%
group_by(id) %>%
filter(any(type== "N" & lead(type) == "P", na.rm = TRUE)) %>%
ungroup
# A tibble: 11 × 3
id type status
<dbl> <chr> <chr>
1 1 N <NA>
2 1 N unsup
3 1 N <NA>
4 1 N unsup
5 1 P <NA>
6 1 P sup
7 6 N <NA>
8 6 N unsup
9 6 N unsup
10 6 P unsup
11 6 P sup