I've got data from an experiment with 2 conditions where each participant got tested in both conditions for various properties. The dataset is in wide format, the variable name pattern is like this: variables for condition A start with MM7M, for condition B they start with MMsf7M, then the abbreviation for the property is appended, e.g. MM7Mabc and MMsf7Mabc. I want to reshape the dataset to long format, so each participant gets 2 rows, one for condition A and one for condition B; there should be only one column for the condition, having the values A and B and for each property a column, named after the property, e.g. abc. Because there hundreds of properties, I would like to do the pivoting automatically for all variable pairs that follow the pattern condition+someproperty. Any help much appreciated.
EDIT for clarification: I meant creating a new variable for each property, i.e. abc, abcde, xyz, total_c, so there are only 2 rows for each participant.
library(tidyverse)
data <- tibble(
id = c(1, 2, 3, 4),
MM7Mabcde = c(2, 4, 2, 2),
MMsf7Mabcde = c(4, 6, 3, 4),
MM7Mxyz = c(2, 4, 4, 2),
MMsf7Mxyz = c(6, 8, 9, 9),
MM7Mtotal_c = c(6, 5, 4, 1),
MMsf7Mtotal_c = c(12, 18, 22, 32),
someVar = c(5, 6, 9, 9)
)
As long as your condition names have some regular expression in common, you can use names_pattern
, with the special name .value
in names_to
pivot_longer(data, contains("7M"),
names_pattern = "^(.*7M)(.*)$",
names_to = c("Condition", ".value"))
#> # A tibble: 8 x 6
#> id someVar Condition abcde xyz total_c
#> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 1 5 MM7M 2 2 6
#> 2 1 5 MMsf7M 4 6 12
#> 3 1 6 MM7M 4 4 5
#> 4 1 6 MMsf7M 6 8 18
#> 5 2 9 MM7M 2 4 4
#> 6 2 9 MMsf7M 3 9 22
#> 7 3 9 MM7M 2 2 1
#> 8 3 9 MMsf7M 4 9 32