I have been using which() to define a new variable inside mutate in the following way.
I have two data frames:
df1 <- data.frame(
  telephone = c("1231231234", "2342342345", "3453453456", "1231231234"),
  email = c("a@email.com", "b@email.com", "c@email.com", "d@mail.com")
)
df2 <- data.frame(
  phone = c("1231231234", "2342342345", "3453453456")
)
What I am trying to do is add a new variable to df2 that stores, for each value of phone, the row numbers of df1 at which that value occurs in df1$telephone. I have attempted the following:
library(dplyr)
library(stringr)
df2 <- df2 %>%
  mutate(
    phone_ind = str_flatten(which(df1$telephone == phone), collapse = ", ")
  )
This yields the warning
Warning message:
Problem while computing `phone_ind = str_flatten(which(df1$telephone == phone), collapse = ", ")`.
ℹ longer object length is not a multiple of shorter object length
as well as the following obviously incorrect results:
df2 <- data.frame(
phone = c("1231231234", "2342342345", "3453453456"),
phone_ind = c("1, 2, 3, 4", "1, 2, 3, 4", "1, 2, 3, 4")
)
Any ideas? Thank you in advance for any insight.
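library(dplyr) # v1.1.0
df2 %>%
  # attach a row number to each df1 row and rename telephone so it joins on phone
  left_join(transmute(df1, phone = telephone, row = row_number()), multiple = "all") %>%
  # collapse all matching row numbers into one string per phone
  summarize(row = paste(row, collapse = ", "), .by = phone)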
Result
Joining with `by = join_by(phone)`
       phone  row
1 1231231234 1, 4
2 2342342345    2
3 3453453456    3
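As for why the original attempt misbehaves: as far as I can tell, mutate() evaluates the expression once on the whole phone column, so df1$telephone == phone compares a length-4 vector with a length-3 vector. R recycles the shorter one (hence the warning), every comparison happens to be TRUE, and which() returns 1, 2, 3, 4 for all rows. Below is a minimal base R sketch of a per-value lookup that reuses the data from the question. The vapply() call and the anonymous function are my own illustration, not part of the join approach above, and it assumes the columns are character vectors (the data.frame() default in R 4.0 and later).

```r
df1 <- data.frame(
  telephone = c("1231231234", "2342342345", "3453453456", "1231231234"),
  email = c("a@email.com", "b@email.com", "c@email.com", "d@mail.com")
)
df2 <- data.frame(
  phone = c("1231231234", "2342342345", "3453453456")
)

# Run which() once per phone value instead of once on the whole column,
# then collapse the matching row numbers of df1 into a single string.
df2$phone_ind <- vapply(
  df2$phone,
  function(p) paste(which(df1$telephone == p), collapse = ", "),
  character(1),
  USE.NAMES = FALSE
)

df2$phone_ind
# "1, 4" "2" "3"
```

The same vapply() call should also work inside mutate(), with phone in place of df2$phone. A phone with no match in df1 would get an empty string here, whereas the join version would produce "NA".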