I'm trying to write code that creates a TRUE or FALSE variable within the groups of name, depending on the value of the earliest record of the column poped in the following data.frame:
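library(tidyverse)
name <- c("AAA", "AAA", "AAA", "AAA", "AAA", "AAA", "AAA")
poped <- c(NA, 1, NA, NA, 1, NA, NA)
order <- 1:7
tag <- c("X", "Y", "X", "X", "Y", "X", "X")
# assemble the vectors into the data.frame printed below
df <- data.frame(name, order, tag, poped)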
> df
name order tag poped
1 AAA 1 X NA
2 AAA 2 Y 1
3 AAA 3 X NA
4 AAA 4 X NA
5 AAA 5 Y 1
6 AAA 6 X NA
7 AAA 7 X NA
I want to use mutate to create two new variables named CHECK and POS.
CHECK will take on the following values:
1 = if the closest row above where tag is Y has poped equal to 1
0 = if the closest row above where tag is Y has poped equal to 0
2 = if the current row has tag = Y
NA = otherwise
POS will take on the row number of the closest row above where tag is Y and poped is 1, and NA otherwise.
My desired output will be:
> df
name order tag poped CHECK POS why
1 AAA 1 X NA NA NA There is no previous data
2 AAA 2 Y 1 NA NA current tag = Y
3 AAA 3 X NA 1 2 the closest value above where tag=Y is in row 2 and poped is 1
4 AAA 4 X NA 1 2 the closest value above where tag=Y is in row 2 and poped is 1
5 AAA 5 Y 1 NA NA current tag = Y
6 AAA 6 X NA 1 5 the closest value above where tag=Y is in row 5 and poped is 1
7 AAA 7 X NA 1 5 the closest value above where tag=Y is in row 5 and poped is 1
How can I create a solution, ideally using Tidyverse?
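df %>%
  group_by(name) %>%  # keep fill() and lag() from crossing between names
  mutate(
    # helper columns, populated only on tag == "Y" rows
    ctag = if_else(tag == "Y", tag, NA_character_),
    cpop = if_else(tag == "Y", poped, NA_real_),
    maxr = if_else(tag == "Y" & poped == 1, order, NA_integer_)
  ) %>%
  # carry the most recent Y-row values down to the rows below
  fill(ctag, cpop, maxr) %>%
  mutate(
    CHECK = case_when(
      tag == "Y" ~ 2,
      lag(ctag) == "Y" & lag(cpop) == 1 ~ 1,
      lag(ctag) == "Y" & lag(cpop) == 0 ~ 0,
      TRUE ~ NA_real_
    ),
    POS = if_else(tag == "Y", NA_integer_, maxr)
  ) %>%
  ungroup() %>%
  select(!(ctag:maxr))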
Output:
name order tag poped CHECK POS
<chr> <int> <chr> <dbl> <dbl> <int>
1 AAA 1 X NA NA NA
2 AAA 2 Y 1 2 NA
3 AAA 3 X NA 1 2
4 AAA 4 X NA 1 2
5 AAA 5 Y 1 2 NA
6 AAA 6 X NA 1 5
7 AAA 7 X NA 1 5
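For comparison, here is a minimal sketch of the same rules without fill(), assuming that "row number" means the position within each name group and that order holds that position (as it does in the sample data). The helper columns last_y and last_y1 are hypothetical names introduced for this sketch. It uses cummax() to track the index of the most recent Y row, then looks up poped and order at that index. On the sample data it should give the same CHECK and POS columns as above, though edge cases (for example a Y row with poped missing) may behave differently.

```r
library(dplyr)

# Sample data from the question
df <- tibble(
  name  = rep("AAA", 7),
  order = 1:7,
  tag   = c("X", "Y", "X", "X", "Y", "X", "X"),
  poped = c(NA, 1, NA, NA, 1, NA, NA)
)

res <- df %>%
  group_by(name) %>%
  mutate(
    # within-group index of the most recent tag == "Y" row at or above
    # the current row; 0 means no such row has been seen yet
    last_y  = cummax(if_else(tag == "Y", row_number(), 0L)),
    # same, but only counting Y rows where poped is 1
    # (%in% treats a missing poped as "not 1")
    last_y1 = cummax(if_else(tag == "Y" & poped %in% 1, row_number(), 0L)),
    CHECK = case_when(
      tag == "Y" ~ 2,
      poped[na_if(last_y, 0L)] == 1 ~ 1,
      poped[na_if(last_y, 0L)] == 0 ~ 0,
      TRUE ~ NA_real_
    ),
    # indexing with NA yields NA, so rows with no earlier match stay NA
    POS = if_else(tag == "Y", NA_integer_, order[na_if(last_y1, 0L)])
  ) %>%
  ungroup() %>%
  select(-last_y, -last_y1)
```

Because the lookups happen inside a grouped mutate(), poped and order refer to the current group's values only, so nothing leaks from one name to the next.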