
Reorganizing columns by two column combination


I am currently learning tidyr and dplyr and ran into the following issue, which I am not sure how to approach properly.

Imagine the following dataset:

Factor 1    Factor 2        Year    value
A            green          2016     1.2
A            green          2017     1.9
B            yellow         2017      3
B            yellow         2018      8

I am trying to obtain:

Factor 1    Factor 2     Year.2016   Year.2017  Year.2018
A            green          1.2          1.9        NA           
B            yellow         NA            3          8

I have only basic R knowledge in this area and tried several options using base R functions, but without success.


Solution

  • library(dplyr)
    library(tidyr)
    
    # example data
    dt = read.table(text = "
    Factor1    Factor2    Year    value
    A            green       2016    1.2
    A            green       2017    1.9
    B            yellow      2017    3
    B            yellow      2018    8
    ", header = TRUE)
    
    # turn each Year into its own column; sep = "." gives names such as Year.2016
    dt %>% spread(Year, value, sep = ".")
    
    #   Factor1 Factor2 Year.2016 Year.2017 Year.2018
    # 1       A   green       1.2       1.9        NA
    # 2       B  yellow        NA       3.0         8
    
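In newer tidyr releases (1.0.0 and later) `spread()` is marked as superseded in favour of `pivot_wider()`. The following is a minimal sketch of the same reshape with that function, assuming such a tidyr version is installed; the object name `wide` is only illustrative.

```r
library(tidyr)

# same example data as in the answer
dt <- read.table(text = "
Factor1    Factor2    Year    value
A            green       2016    1.2
A            green       2017    1.9
B            yellow      2017    3
B            yellow      2018    8
", header = TRUE)

# one column per Year; names_prefix reproduces the "Year." prefix from spread(sep = ".")
wide <- pivot_wider(dt,
                    names_from   = Year,
                    values_from  = value,
                    names_prefix = "Year.")
wide
```

Combinations that do not occur in the data (for example A in 2018) are filled with `NA`, as in the `spread()` result.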

    If you have two or more value columns, you can use this approach, which involves a little more reshaping:

    library(dplyr)
    library(tidyr)
    
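    # example data (three value columns, matching the output below)
    dt = read.table(text = "
    Factor1    Factor2    Year    value  value2  value3
    A            green       2016    1.2   5       9
    A            green       2017    1.9   5       9
    B            yellow      2017    3     5       9
    B            yellow      2018    8     5       9
    ", header = TRUE)
    
    dt %>%
      gather(v, value, -Factor1, -Factor2, -Year) %>%  # stack all value columns into key/value pairs
      unite(Year, Year, v) %>%                         # build keys such as 2016_value
      spread(Year, value, sep = ".")                   # one column per Year/variable combination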
    
    #   Factor1 Factor2 Year.2016_value Year.2016_value2 Year.2016_value3 Year.2017_value
    # 1       A   green             1.2                5                9             1.9
    # 2       B  yellow              NA               NA               NA             3.0
    #   Year.2017_value2 Year.2017_value3 Year.2018_value Year.2018_value2 Year.2018_value3
    # 1                5                9              NA               NA               NA
    # 2                5                9               8                5                9
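The gather/unite/spread chain can probably be collapsed into a single `pivot_wider()` call, which accepts several columns in `values_from`. This is a sketch only, assuming tidyr 1.1.0 or later (for the `names_glue` argument) and using a version of the example data with two value columns; `wide2` is an illustrative name.

```r
library(tidyr)

# example data with two value columns
dt <- read.table(text = "
Factor1    Factor2    Year    value  value2
A            green       2016    1.2   5
A            green       2017    1.9   5
B            yellow      2017    3     5
B            yellow      2018    8     5
", header = TRUE)

# widen both value columns at once; names_glue builds names like Year.2016_value
wide2 <- pivot_wider(dt,
                     names_from  = Year,
                     values_from = c(value, value2),
                     names_glue  = "Year.{Year}_{.value}")
wide2
```

The column order may differ from the `spread()` output, since `pivot_wider()` groups the new columns by value column by default, so select or reorder the columns afterwards if a specific order matters.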