I am trying to assign labels to the columns of multiple data frames. I have more than 10 data frames to manipulate, but here are three examples:
library(tibble)
df1 = tribble(
~a_age, ~a01edu, ~other_vars,
35, 17, 1,
41, 14, 2,
28, 12, 3,
68, 99, 4
)
df2 = tribble(
~b_age, ~b01edu, ~some_vars,
25, 10, 2,
52, 8, 1,
31, 20, 5
)
df3 = tribble(
~c_age, ~c01edu,
55, 16,
47, 11,
68, 16,
36, 6,
29, 16
)
Each data frame has columns with similar names, such as a...some_name, b...some_name and so on. I tried using labelled::set_variable_labels() to create column labels for one data frame, and it worked fine.
df1 = df1 |> labelled::set_variable_labels(
.labels = list("a_age" = "Age",
"a01edu" = "Highest education completed")
)
Then I tried using purrr::pmap() to assign column labels to all data frames at once, but it gave me an error.
df_list = list(df1, df2, df3) |> setNames(c("a", "b", "c"))
params = tribble(
~x, ~y, ~z,
"a", "a_age", "a01edu",
"b", "b_age", "b01edu",
"c", "c_age", "c01edu"
)
pmap(params,
function(x, y, z) {
df_list[[x]] |> labelled::set_variable_labels(
.labels = list(y = "Age",
z = "Highest education completed")
)
}
)
The error message:
<error/rlang_error>
Error in `pmap()`:
ℹ In index: 1.
Caused by error in `var_label<-.data.frame`:
! some variables not found in x:y, z
---
Backtrace:
1. purrr::pmap(...)
2. purrr:::pmap_("list", .l, .f, ..., .progress = .progress)
5. global .f(x = .l[[1L]][[i]], y = .l[[2L]][[i]], z = .l[[3L]][[i]], ...)
6. labelled::set_variable_labels(...)
8. labelled:::`var_label<-.data.frame`(`*tmp*`, value = .labels)
9. base::stop("some variables not found in x:", missing_names)
Why am I getting this error? I thought I set up the params object correctly so that the column names in df_list match the ones I'm feeding into function(x, y, z). I'm pretty sure there are better ways to achieve what I'm trying to do. Any help would be very much appreciated. Thank you!
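The problem is that = inside list() does not evaluate its left-hand side: list(y = "Age") creates an element literally named y, not one named after the value stored in y, so set_variable_labels() looks for columns called y and z and fails. We may use := together with !! inside dplyr::lst(), which does evaluate the names: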
library(dplyr)
library(purrr)
df_list2 <- pmap(params, ~ df_list[[..1]] |>
labelled::set_variable_labels(
.labels = lst(!!..2 := "Age",
!! ..3 := "Highest education completed")
)
)
Output:
[[1]]
# A tibble: 4 × 3
a_age a01edu other_vars
<dbl> <dbl> <dbl>
1 35 17 1
2 41 14 2
3 28 12 3
4 68 99 4
[[2]]
# A tibble: 3 × 3
b_age b01edu some_vars
<dbl> <dbl> <dbl>
1 25 10 2
2 52 8 1
3 31 20 5
[[3]]
# A tibble: 5 × 2
c_age c01edu
<dbl> <dbl>
1 55 16
2 47 11
3 68 16
4 36 6
5 29 16
> str(df_list2)
List of 3
$ : tibble [4 × 3] (S3: tbl_df/tbl/data.frame)
..$ a_age : num [1:4] 35 41 28 68
.. ..- attr(*, "label")= chr "Age"
..$ a01edu : num [1:4] 17 14 12 99
.. ..- attr(*, "label")= chr "Highest education completed"
..$ other_vars: num [1:4] 1 2 3 4
$ : tibble [3 × 3] (S3: tbl_df/tbl/data.frame)
..$ b_age : num [1:3] 25 52 31
.. ..- attr(*, "label")= chr "Age"
..$ b01edu : num [1:3] 10 8 20
.. ..- attr(*, "label")= chr "Highest education completed"
..$ some_vars: num [1:3] 2 1 5
$ : tibble [5 × 2] (S3: tbl_df/tbl/data.frame)
..$ c_age : num [1:5] 55 47 68 36 29
.. ..- attr(*, "label")= chr "Age"
..$ c01edu: num [1:5] 16 11 16 6 16
.. ..- attr(*, "label")= chr "Highest education completed"
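As a rough illustration of why the original list(y = ..., z = ...) call failed, here is a minimal base R sketch. It is not part of the answer above: y, z and the small data frame df are stand-ins for one row of params and one element of df_list, and the final loop only mimics the "label" attribute visible in the str() output rather than calling labelled itself. It also shows setNames() as one possible alternative to := for building a dynamically named list.

```r
# Stand-ins for one row of `params` (hypothetical, for illustration only).
y <- "a_age"
z <- "a01edu"

# `=` inside list() takes the argument name literally,
# so the elements are named "y" and "z", not "a_age" and "a01edu".
literal <- list(y = "Age", z = "Highest education completed")
names(literal)

# setNames() assigns names from the *values* of y and z instead.
dynamic <- setNames(list("Age", "Highest education completed"), c(y, z))
names(dynamic)

# Small stand-in data frame (assumed; mirrors the first rows of df1).
df <- data.frame(a_age = c(35, 41), a01edu = c(17, 14))

# Attach each label as a "label" attribute on the matching column.
for (nm in names(dynamic)) {
  attr(df[[nm]], "label") <- dynamic[[nm]]
}
attr(df$a_age, "label")
```

A named list built this way should also be usable as the .labels argument of labelled::set_variable_labels() inside the pmap() function, assuming that argument accepts any named list, as the single-data-frame call in the question suggests.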