I'm working on the following dataframe:
subject group_type session_code exposure beta_early_X beta_early_Y beta_early_Z gamma_peak_X
1 X1 GRP1 A LOW-CNT -6.4 -3.5 10.2 -7.7
2 X1 GRP1 A LOW-EXP -5.9 -3.8 11.4 -5.1
3 X1 GRP1 A HIGH-EXP -2.1 1.1 12.8 -4.3
4 X2 GRP1 A LOW-CNT 1.2 3.9 14.1 -2.5
5 X2 GRP1 A LOW-EXP 1.7 5.2 14.5 -5.9
6 X2 GRP1 A HIGH-EXP 5.6 8.1 15.9 -1.7
7 X3 GRP1 A LOW-CNT 0.2 -0.9 2.9 3.9
8 X3 GRP1 A LOW-EXP -2.9 -2.5 0.5 2.8
9 X3 GRP1 A HIGH-EXP -2.3 -1.8 3.0 2.3
10 X4 GRP1 A LOW-CNT -3.1 3.6 12.7 -1.0
gamma_peak_Y gamma_peak_Z
1 -3.8 7.2
2 -3.0 8.6
3 -1.9 8.9
4 -1.3 4.4
5 -2.7 4.6
6 0.4 8.4
7 2.8 6.7
8 1.9 4.9
9 2.5 4.4
10 4.7 12.1
I would like to split the suffix (X, Y, ...) off the variable names where it appears and store it in a new, separately named variable. What should I do?
Thanks in advance
Here is the dataset:
structure(list(
subject = c("X1", "X1", "X1", "X2", "X2", "X2", "X3", "X3", "X3", "X4"),
group_type = c("GRP1", "GRP1", "GRP1", "GRP1", "GRP1", "GRP1", "GRP1", "GRP1", "GRP1", "GRP1"),
session_code = c("A", "A", "A", "A", "A", "A", "A", "A", "A", "A"),
exposure = c("LOW-CNT", "LOW-EXP", "HIGH-EXP", "LOW-CNT", "LOW-EXP", "HIGH-EXP", "LOW-CNT", "LOW-EXP", "HIGH-EXP", "LOW-CNT"),
beta_early_X = c(-6.4, -5.9, -2.1, 1.2, 1.7, 5.6, 0.2, -2.9, -2.3, -3.1),
beta_early_Y = c(-3.5, -3.8, 1.1, 3.9, 5.2, 8.1, -0.9, -2.5, -1.8, 3.6),
beta_early_Z = c(10.2, 11.4, 12.8, 14.1, 14.5, 15.9, 2.9, 0.5, 3.0, 12.7),
gamma_peak_X = c(-7.7, -5.1, -4.3, -2.5, -5.9, -1.7, 3.9, 2.8, 2.3, -1.0),
gamma_peak_Y = c(-3.8, -3.0, -1.9, -1.3, -2.7, 0.4, 2.8, 1.9, 2.5, 4.7),
gamma_peak_Z = c(7.2, 8.6, 8.9, 4.4, 4.6, 8.4, 6.7, 4.9, 4.4, 12.1)
), class = "data.frame", row.names = c(NA, -10L))
We may need pivot_longer to reshape from 'wide' to 'long'. Specify the cols with the column names that have a ( followed by digits (\\d+), and split the names at the delimiter (.) by specifying names_sep. With .value in names_to, the part of each column name before the . becomes the name of a value column, and the suffix that follows the . in the column name goes into the new column 'electrode'.
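library(dplyr)
library(tidyr)

# 'data' is the wide data frame; select every column whose name contains "(digits-digits"
data %>%
  pivot_longer(cols = matches("\\(\\d+-\\d+"),
               names_to = c(".value", "electrode"),
               names_sep = "\\.")  # split each column name at the literal "."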
-output
# A tibble: 24 × 9
ID GR SES COND electrode `P3(400-450)` `LPPearly(500-700)` `LPP1(500-1000)` `LPP2(1000-1500)`
<chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 01 RP V NEG-CTR FCz -11.6 -11.8 -5.67 -0.199
2 01 RP V NEG-CTR Cz -5.17 -5.96 -0.774 2.96
3 01 RP V NEG-CTR Pz 11.9 8.24 9.99 6.28
4 01 RP V NEG-CTR POz NA NA NA 7.91
5 01 RP V NEG-NOC FCz -11.1 -9.15 -4.39 -3.16
6 01 RP V NEG-NOC Cz -5.53 -5.11 -0.650 -2.13
7 01 RP V NEG-NOC Pz 12.1 9.51 11.1 5.25
8 01 RP V NEG-NOC POz NA NA NA 9.95
9 01 RP V NEU-NOC FCz -4.00 -7.58 -2.97 0.896
10 01 RP V NEU-NOC Cz 0.622 -2.82 1.14 2.95
# … with 14 more rows
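The call above targets column names that contain a parenthesised range and a . before the suffix, as its regex and output suggest. The columns of the dataset posted in the question instead end in _X, _Y or _Z, so the same idea would need a different selector and split rule. The following is only a sketch under that assumption: it uses the first three rows of the posted data, splits at the last underscore with names_pattern, and the names df, long and axis are placeholders chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# First three rows of the dataset posted in the question
df <- data.frame(
  subject      = c("X1", "X1", "X1"),
  group_type   = c("GRP1", "GRP1", "GRP1"),
  session_code = c("A", "A", "A"),
  exposure     = c("LOW-CNT", "LOW-EXP", "HIGH-EXP"),
  beta_early_X = c(-6.4, -5.9, -2.1),
  beta_early_Y = c(-3.5, -3.8, 1.1),
  beta_early_Z = c(10.2, 11.4, 12.8),
  gamma_peak_X = c(-7.7, -5.1, -4.3),
  gamma_peak_Y = c(-3.8, -3.0, -1.9),
  gamma_peak_Z = c(7.2, 8.6, 8.9)
)

long <- df %>%
  pivot_longer(
    # only the measurement columns, i.e. names ending in _X, _Y or _Z
    cols = matches("_[XYZ]$", ignore.case = FALSE),
    # text before the last "_" names the value column;
    # text after it goes into the new column "axis" (hypothetical name)
    names_to = c(".value", "axis"),
    names_pattern = "(.+)_([^_]+)$"
  )

long
```

With these three input rows, long should contain one row per original row and suffix (9 rows), with the identifier columns followed by axis, beta_early and gamma_peak. If the suffixes in the full data are not limited to X, Y and Z, the selector in cols would have to be adjusted accordingly.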