My apologies if the above isn't a good wording, or if this has already been covered - I have tried to search more than once.
I want each element of a new dataframe column, b, to be that row's value of another column (e.g. col1 or col2) which is named in yet another column, a. I'd love to have code for this that is reasonably readable. I have a very un-tidyverse, unreadable for-loop solution, and an example failed attempt of how I imagine this should look.
Here is my example input:
library(tidyverse)
df1 = tibble(a=c("col1","col1","col2"),
col1=c(1,3,5),
col2=c(2,4,6))
df1
## A tibble: 3 × 3
# a col1 col2
# <chr> <dbl> <dbl>
#1 col1 1 2
#2 col1 3 4
#3 col2 5 6
and the goal output:
b=c(1,3,6)
df2=cbind(df1,b)
df2
# a col1 col2 b
#1 col1 1 2 1
#2 col1 3 4 3
#3 col2 5 6 6
I'm aware that I could use e.g. mutate and case_when for this version with only two possibilities col1 and col2, but I need this to generalise to many of these "value columns" without listing each possible column name explicitly.
Here is the working-but-horrible solution I have:
for (i in c(1:nrow(df1))) {
df1$b[i] = df1[[{{df1$a[i]}}]][i]
}
df1
# for the ith row: extracts the text of a which is a column name, embraces this to treat as tidyselect
# extracts that entire column of the dataframe as a vector with [[]], finds the ith value of the vector, assigns as value of b for that row.
This feels like it can't possibly be the best way. Here are my failed attempts at a cleaner version. This runs and returns a copy of a in b:
df3 = mutate(df1, "b" = {{a}})
df3
And this fails with the below error:
df3 = mutate(df1, "b" = [[{{a}}]])
df3
Error: unexpected '[[' in "df3 = mutate(df1, "b" = [["
Any solutions or advice (ideally in tidyverse) will be gratefully received. Thank you!
Edit: I did look through the suggestions here, and forgot to go back and specify that I did not find them "readable" - the focus of my question, since I already had a working-but-inscrutable solution. Jon Spring's answer, for me, fits that bill.
I think a join to a longer version of the data might be simplest. First, we can add a row number:
df2 <- df1 |> rowid_to_column()
Now, let's make a longer version of the table, with one row per rowid-column combination.
df3 <- df2 |> select(-a) |> pivot_longer(col1:col2, values_to = "b")
# assumes tidyr (or tidyverse meta-package, which includes it) is loaded
## A tibble: 6 × 3
# rowid name b
# <int> <chr> <dbl>
#1 1 col1 1
#2 1 col2 2
#3 2 col1 3
#4 2 col2 4
#5 3 col1 5
#6 3 col2 6
Then we join the original table to the long table, matching a
to name
, plus rowid
.
df2 |> left_join(df3, join_by(a == name, rowid))
# A tibble: 3 × 5
rowid a col1 col2 b
<int> <chr> <dbl> <dbl> <dbl>
1 1 col1 1 2 1
2 2 col1 3 4 3
3 3 col2 5 6 6