
Creating edges for a riverplot


I am hoping to use the riverplot package to create a flow diagram. The package needs 'edges', which are the flows between levels, and I want to build that edges data structure from a data frame. By way of example, here is some code to create my input data:

rp.df<-structure(list(ID = 1:20, X1 = structure(c(1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "A1", class = "factor"), 
X2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A2", 
"B2"), class = "factor"), X3 = structure(c(1L, 1L, 2L, 2L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
3L), .Label = c("A3", "B3", "C3"), class = "factor")), class = "data.frame", row.names = c(NA, 
-20L))
table(rp.df$X1,rp.df$X2)
table(rp.df$X2,rp.df$X3)

This gives the following output:

> table(rp.df$X1,rp.df$X2)

     A2 B2
  A1 12  8
> table(rp.df$X2,rp.df$X3)

     A3 B3 C3
  A2  2  2  8
  B2  5  2  1

What I need is a data frame with the flows identified in the tables, e.g.:

N1 N2 Value
A1 A2    12
A1 B2     8
A2 A3     2
A2 B3     2
A2 C3     8
B2 A3     5
B2 B3     2
B2 C3     1

In reality I have 10 columns of edges and 16k in flows. I have tried using reshape2 to do this, but struggled.


Solution

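  • Here's a base R solution, generalized for however many columns you have. It assumes the first column is the ID and every later column is a level, so it cross-tabulates each pair of adjacent columns starting at column 2. Within each pair, N1 varies fastest, so the row order differs slightly from the listing in the question, but the flows are the same.

    # Cross-tabulate each adjacent pair of level columns (column 1 is the ID).
    out <- lapply(2:(ncol(rp.df) - 1), function(i) {
      as.data.frame(table(rp.df[, i], rp.df[, i + 1]))
    })
    # Stack the per-pair tables and rename the default Var1/Var2/Freq columns.
    setNames(do.call(rbind, out), c("N1", "N2", "Value"))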
    #   N1 N2 Value
    # 1 A1 A2    12
    # 2 A1 B2     8
    # 3 A2 A3     2
    # 4 B2 A3     5
    # 5 A2 B3     2
    # 6 B2 B3     2
    # 7 A2 C3     8
    # 8 B2 C3     1
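
With 10 level columns, some pairs of levels may never occur together, and `table()` still reports them with a count of 0. The sketch below wraps the same approach in a function and drops those rows, on the assumption that zero-count pairs are not wanted as edges. The names `make_edges`, `level_cols` and `toy` are hypothetical and only for illustration; the toy data is made up and is not the data from the question.

```r
# Hypothetical helper: build an edge list from adjacent level columns.
# df         - a data frame with one row per ID
# level_cols - names of the level columns, in flow order
make_edges <- function(df, level_cols) {
  stopifnot(length(level_cols) >= 2)
  out <- lapply(seq_len(length(level_cols) - 1), function(i) {
    # Count how many rows move from level i to level i + 1.
    tab <- table(df[[level_cols[i]]], df[[level_cols[i + 1]]])
    as.data.frame(tab, stringsAsFactors = FALSE)
  })
  edges <- setNames(do.call(rbind, out), c("N1", "N2", "Value"))
  # Assumption: pairs that never occur (count 0) are not needed as edges.
  edges <- edges[edges$Value > 0, , drop = FALSE]
  rownames(edges) <- NULL
  edges
}

# Made-up toy data in which the B2 -> A3 flow never occurs.
toy <- data.frame(
  ID = 1:6,
  X1 = c("A1", "A1", "A1", "A1", "A1", "A1"),
  X2 = c("A2", "A2", "A2", "B2", "B2", "B2"),
  X3 = c("A3", "A3", "B3", "B3", "B3", "B3"),
  stringsAsFactors = TRUE
)

edges <- make_edges(toy, c("X1", "X2", "X3"))
print(edges)
```

Passing the column names explicitly avoids relying on the ID being in column 1, and `stringsAsFactors = FALSE` keeps N1 and N2 as plain character columns, which sidesteps merging factor levels when the tables are stacked.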