rreshape

How to reshape a PSPP table into another form of table


I Have a Dataframe that is Mean Tables by PSPP. I would like to reshape it in order to manipulate it easier for plots in calc.

What I want to do?

  1. This Table contains descriptive statistics such as Mean, SD, N.
  2. The levels of Categorical variables are populated vertically.
    V1 V1_levelA, V1_levelB, | V2 V2_levelA, V2_levelB ... etc.
  3. Descriptive statistics are vertically displayed

I would like the first variable to be populated horizontally, and the next one vertically. Please see, the attached image for more information.

The result must take into consideration that Table may missing entire factor levels - cause they may have not "values", and therefore not included in the input table in the form of csv.

I hope this huge editing be clearer now what I am asking.

Dput sample df similar to that of the posted image:

    df <-   structure(list(structure(c(2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 
1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "v1", "v2"), class = "factor"), 
    varA = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 1L, 1L, 
    2L, 3L, 3L, 4L, 4L), .Label = c("k1", "k2", "k3", "k4", "varA"
    ), class = "factor"), Age = structure(c(1L, 2L, 1L, 2L, 1L, 
    2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L), .Label = c("a1", 
    "a2", "Age"), class = "factor"), Mean = structure(1:15, .Label = c("10", 
    "11", "12", "13", "14", "15", "16", "17", "18", "19", "21", 
    "22", "23", "24", "25", "Mean"), class = "factor"), N = structure(c(1L, 
    8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 2L, 3L, 4L, 5L, 6L, 
    7L), .Label = c("1", "10", "12", "13", "14", "15", "16", 
    "2", "3", "4", "5", "6", "7", "8", "9", "N"), class = "factor")), row.names = 2:16, class = "data.frame")

*Update** Check that: My input and the desired Output: https://postimg.cc/N2GTZd09 enter image description here


Solution

  • I am still unclear about your expected output as your input data and expected output don't match.

    That aside, perhaps this is what you're after?

    library(tidyverse)
    df %>%
        rename(group = 1) %>%                # Name first column
        mutate_at(1, na_if, "") %>%          # Replace "" with NA
        fill(group) %>%                      # Fill first column with missing values
        group_by(group) %>%
        nest() %>%                           # Nest data by group
        mutate(data = map(data, ~.x %>%
            gather(k, v, -varA, -Age) %>%    # Wide to long
            unite(k, varA, k) %>%            # Unite varA with variable column
            spread(k, v))) %>%               # Spread from long to wide
        unnest()                             # Unnest
    ## A tibble: 4 x 10
    #  group Age   k1_Mean k1_N  k2_Mean k2_N  k3_Mean k3_N  k4_Mean k4_N
    #  <fct> <fct> <chr>   <chr> <chr>   <chr> <chr>   <chr> <chr>   <chr>
    #1 v1    a1    10      1     12      3     14      5     16      7
    #2 v1    a2    11      2     13      4     15      6     17      8
    #3 v2    a1    18      9     NA      NA    22      13    24      15
    #4 v2    a2    19      10    21      12    23      14    25      16