rreshapetransposereshape2dataformat

In R, how to transpose row data to coulmn per variable?


I have some data like below

dataA=data.frame(structure(list(Season = c(2021, 2021, 2021, 2021, 2021, 2022, 
                          2022, 2022, 2022, 2022, 2023, 2023, 2023, 2023, 2023), 
               ID = c("A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "A", "B", "C", "D", "E"), 
               S = c(44, 64, 65, 68, 58, 46, 48, 48, 62, 42, 49, 63, 75, 90, 55), 
               Na = c(115, 131, 153, 118, 140, 127, 108, 119, 122, 139, 136, 134, 170, 139, 178), 
               Ca = c(1.58, 2.41, 2.49, 2.25, 2.1, 1.21, 2.07, 2.66, 2.26, 1.9, 1.36, 2.18, 2.53, 2.22, 2.12)), 
               class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -15L)))

   Season ID  S  Na   Ca
1    2021  A 44 115 1.58
2    2021  B 64 131 2.41
3    2021  C 65 153 2.49
4    2021  D 68 118 2.25
5    2021  E 58 140 2.10
6    2022  A 46 127 1.21
7    2022  B 48 108 2.07
8    2022  C 48 119 2.66
9    2022  D 62 122 2.26
10   2022  E 42 139 1.90
11   2023  A 49 136 1.36
12   2023  B 63 134 2.18
13   2023  C 75 170 2.53
14   2023  D 90 139 2.22
15   2023  E 55 178 2.12

Now I'm using a certain program and it's required to re-arrange and upload data like below. I have huge datasets, and it seems not practical to copy and paste one by one. I believe there are some codes to transpose data format from rows to columns.

I used reshape::cast() but it doesn't work for multiple variables.

library (reshape)
dataB= reshape::cast(dataA, ID ~ Season, value=c("S","Na", "Ca"))
**
Error in data.frame(data[, c(variables), drop = FALSE], result = data$value) : 
  arguments imply differing number of rows: 15, 0
In addition: Warning message:
In names(data) == value :
  longer object length is not a multiple of shorter object length

Could you please let me know how to transpose data?

Thanks,

enter image description here


Solution

  • You can use tidyr::pivot_wider() with the names_glue argument:

    library(tidyr)
    
    dataA |>
      pivot_wider(names_from = Season, values_from = c(S, Na, Ca),
                  names_glue = "{Season}_{.value}")
    

    Output:

    # A tibble: 5 × 10
      ID    `2021_S` `2022_S` `2023_S` `2021_Na` `2022_Na` `2023_Na` `2021_Ca` `2022_Ca` `2023_Ca`
      <chr>    <dbl>    <dbl>    <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
    1 A           44       46       49       115       127       136      1.58      1.21      1.36
    2 B           64       48       63       131       108       134      2.41      2.07      2.18
    3 C           65       48       75       153       119       170      2.49      2.66      2.53
    4 D           68       62       90       118       122       139      2.25      2.26      2.22
    5 E           58       42       55       140       139       178      2.1       1.9       2.12