I am looking to standardize cleaning exports from Survey Monkey at my organization, and want to rename the column name to (Column name + first row name) if the first row is not NA.
Edit: This would ideally be implemented in a function/loop so that it would work on data frames of varying sizes, without having to edit any parameters.
Reprex:
df <- tribble(
~`Which of these choices do you like`, ~`...1`, ~`...2`, ~`...3`, ~`Respondent ID`, ~`Different Text`, ~`...4`,
'Fruit', 'Drink', 'Dessert', 'Snack', NA, 'Pizza Topping', 'Pizza Style',
'Apple', 'Water', 'Pie', 'Oreos', 1234, 'Mushroom', 'Deep Dish',
'Apple', 'Coffee', 'Cake', 'Granola', 1235, 'Onion', 'NY Style',
'Banana', 'Coffee', 'Pie', 'Oreos', 1236, 'Mushroom', 'NY Style',
'Pear', 'Vodka', 'Pie', 'Granola', 1237, 'Onion', 'Deep Dish'
)
After the columns are renamed, I would delete the first row and carry on with my life.
Ideally, my df would look like this:
Thank you for any guidance!
In base R
, we can use paste
and then remove the first row
names(df)[1:4] <- paste0(names(df)[1], unlist(df[1, 1:4]))
df <- df[-1, ]
Or using sprintf
names(df)[1:4] <- sprintf("%s (%s)", names(df)[1], unlist(df[1, 1:4]))
df <- df[-1,]
If we want to do this by checking the NA elements
library(dplyr)
library(tidyr)
library(purrr)
library(stringr)
keydat <- df %>%
slice(1) %>%
select_if(negate(is.na)) %>%
pivot_longer(everything()) %>%
group_by(grp = cumsum(!startsWith(name, "..."))) %>%
mutate(value = sprintf("%s (%s)", first(name), value)) %>%
ungroup %>%
select(-grp)
df <- df %>%
rename_at(vars(keydat$name), ~ keydat$value) %>%
slice(-1)
df
# A tibble: 4 x 7
# `Which of these… `Which of these… `Which of these… `Which of these… `Respondent ID`
# <chr> <chr> <chr> <chr> <dbl>
#1 Apple Water Pie Oreos 1234
#2 Apple Coffee Cake Granola 1235
#3 Banana Coffee Pie Oreos 1236
#4 Pear Vodka Pie Granola 1237
# … with 2 more variables: `Different Text (Pizza Topping)` <chr>, `Different Text (Pizza
# Style)` <chr>
names(df)
#[1] "Which of these choices do you like (Fruit)" "Which of these choices do you like (Drink)"
#[3] "Which of these choices do you like (Dessert)" "Which of these choices do you like (Snack)"
#[5] "Respondent ID" "Different Text (Pizza Topping)"
#[7] "Different Text (Pizza Style)"