r sankey-diagram networkd3

How to plot Sankey Graph with R networkD3 values and percentage below each node


Good afternoon. With the code below I can produce a Sankey diagram, but it does not show the underlying values. I tried to adapt the code in this thread, without success. I have never used JavaScript.

What I need is a diagram that also shows the value and the percentage under each node, as in the following picture.

enter image description here

Thanks

library(dplyr)
library(networkD3)
library(tidyverse)
library(readxl)
library(RColorBrewer)

df = data.frame(Source = c("ABC","CDE","MNB","PCI","UCD"),
                 Destination = c("Me","You","Him","Her","Her"),
                 Value = c(200,350,456,450,100))


## Reshape dataframe to long
df2 = pivot_longer(df, c(Destination, Source))

## make unique list for destination and source
dest = unique(as.character(df$Destination))
sources = unique(as.character(df$Source))

## Assign nodes number to each element of the chart
nodes2 = data.frame(node = append(dest,sources), nodeid = c(0:8))
res = merge(df,nodes2, by.x="Source", by.y = "node")
res = merge(res,nodes2, by.x="Destination", by.y = "node")

## Make links
links2 = res[, c("nodeid.x","nodeid.y","Value")]
colnames(links2) <- c("source", "target", "value")

## Add a 'group' column to each connection:
links2$group = as.factor(c("type_a","type_b","type_c","type_d","type_e"))


## defining nodes
nodes2["groups"] = nodes2$node
nodes2$groups = as.factor(nodes2$groups)



# Give a color for each group:
my_color <- 'd3.scaleOrdinal() .domain(["type_a","type_b","type_c","type_d","type_e","Me","You","Him","Her","Her"]) .range(["rgb(165,0,38,0.4)",    "rgb(215,48,39, 0.4)",  "rgb(244,109,67,0.4)",  "rgb(253,174,97,0.4)",  "rgb(254,224,139,0.4)",
"rgb(255,255,191,0.4)", "rgb(217,239,139,0.4)", "rgb(166,217,106,0.4)", 
                            "rgb(102,189,99,0.4)","rgb(26,152,80,0.4)"])'


# plot graph
networkD3::sankeyNetwork(Links = links2, Nodes = nodes2, 
                         Source = 'source', 
                         Target = 'target', 
                         Value = 'value', 
                         NodeID = 'node',
                         units = 'Amount',
                         colourScale=my_color,
                         LinkGroup="group", 
                         NodeGroup="groups", 
                         fontFamily = "arial", 
                         fontSize = 8,
                         nodeWidth = 8)

Solution

  • See the update below the original content; it is a fully developed solution to your original request.

    I'm still working on rendering the label on multiple lines instead of on one line, which is proving quite difficult with SVG text. In the meantime, here is a method that gets all of the desired information onto your diagram, even if it isn't styled exactly as you wished.

    First I created the data to add to the plot. This has to be added to the widget after it's created. (It will just get stripped if you try to add it beforehand.)

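    This computes each node's percentage of the overall total, for the source nodes and for the destination nodes, plus the aggregated sums where a node appears in more than one row (here, 'Her').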

    # for this tiny data frame some of this grouping is redundant---
    # however, this method could be used on a much larger scale
    df3 <- df %>%
      group_by(Source) %>%
      mutate(sPerc = paste0(round(sum(Value) / sum(df$Value) * 100, 2), "%")) %>% 
      group_by(Destination) %>% 
      mutate(dPerc = paste0(round(sum(Value) / sum(df$Value) * 100, 2), "%")) %>% 
      pivot_longer(c(Destination, Source)) %>% 
      mutate(Perc = ifelse(name == "Destination",
                           dPerc, sPerc)) %>%  # determine which % to retain
      select(Value, value, Perc) %>%           # only fields to add to widget
      group_by(value, Perc) %>% 
      summarise(Value = sum(Value)) # get the sum for 'Her'
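
    To make the numbers easy to check, here is a minimal sketch that restates the same arithmetic in plain JavaScript. It is not part of the solution and is not needed for the plot. The helper name nodeSummary is hypothetical, and the sketch assumes that no name is used as both a source and a destination, which holds for the question's data.

```javascript
// The five rows of the question's data frame.
const links = [
  { source: "ABC", target: "Me",  value: 200 },
  { source: "CDE", target: "You", value: 350 },
  { source: "MNB", target: "Him", value: 456 },
  { source: "PCI", target: "Her", value: 450 },
  { source: "UCD", target: "Her", value: 100 }
];

// Hypothetical helper: total per node name and its share of the grand total.
// Assumes a name never appears as both source and target.
function nodeSummary(rows) {
  const grandTotal = rows.reduce((sum, r) => sum + r.value, 0);
  const totals = new Map();
  for (const r of rows) {
    for (const name of [r.source, r.target]) {
      totals.set(name, (totals.get(name) || 0) + r.value);
    }
  }
  // Round the percentage to two decimals, like round(x, 2) in the R code.
  return Array.from(totals, ([name, Value]) => ({
    name: name,
    Value: Value,
    Perc: Math.round((Value / grandTotal) * 10000) / 100 + "%"
  }));
}

const summary = nodeSummary(links);
console.log(summary.find(n => n.name === "Her"));
// { name: 'Her', Value: 550, Perc: '35.35%' }
```

    'Her' receives 450 + 100 = 550 out of a total of 1556, which is where the 35.35% comes from.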
    

    I saved the Sankey diagram under the object name plt, that is, I assigned the result of the sankeyNetwork() call from the question to plt. This next part adds the new data to the widget plt.

    plt$x$nodes <- right_join(plt$x$nodes, df3, by = c("name" = "value"))
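
    Why does adding columns to plt$x$nodes help? As far as I understand it, htmlwidgets sends a data frame to the browser column by column, and the widget turns it into one object per row before D3 binds the data, so every extra column ends up as a property on the node datum d. The sketch below imitates that conversion in plain JavaScript. The helper columnsToRows is hypothetical and written only for illustration, and the node data is trimmed to two nodes and three columns.

```javascript
// Hypothetical helper, for illustration only: turn a column-wise data frame
// (one array per column) into an array of row objects.
function columnsToRows(columns) {
  const names = Object.keys(columns);
  const n = names.length ? columns[names[0]].length : 0;
  const rows = [];
  for (let i = 0; i < n; i++) {
    const row = {};
    for (const name of names) {
      row[name] = columns[name][i];
    }
    rows.push(row);
  }
  return rows;
}

// Two of the nodes, as they might look after the join added Perc and Value.
const nodes = columnsToRows({
  name: ["Me", "Her"],
  Perc: ["12.85%", "35.35%"],
  Value: [200, 550]
});

console.log(nodes[1]); // { name: 'Her', Perc: '35.35%', Value: 550 }
```

    This is why the callbacks further down can read d.Perc and d.Value directly.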
    

    This final element adds the value and the percentages to the source and destination node labels.

    htmlwidgets::onRender(plt, '
                          function(el, x) {
                            d3.select(el).selectAll(".node text")
                              .text(d => d.name + " " + d.Perc + " " + d.Value)
                          }')
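
    If you have never used JavaScript, the arrow function passed to .text() is the only part that matters here: it receives the data bound to one node and returns the string to display. A minimal sketch outside the browser, using a hand-written node object (the values for 'Her' are assumed from the data above):

```javascript
// The same arrow function that is passed to .text() in the onRender call.
const label = d => d.name + " " + d.Perc + " " + d.Value;

// A node as it looks once Perc and Value have been joined onto plt$x$nodes.
const her = { name: "Her", Perc: "35.35%", Value: 550 };
console.log(label(her)); // Her 35.35% 550

// A node without the joined fields: the missing properties read as undefined.
const bare = { name: "Her" };
console.log(label(bare)); // Her undefined undefined
```

    Labels that show "undefined" are therefore a sign that the join onto plt$x$nodes did not take effect, for example because the node names did not match.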
    

    enter image description here



    Update: Multi-line labels

    I guess I just needed to sleep on it. This update will get you multi-line text.

    You also asked for resources on how you would go about doing this yourself. There are a few things at play here: JavaScript, SVG text, D3, and the package htmlwidgets. When you use onRender, it's important to know the script file that connects the package's R code to htmlwidgets. I would suggest starting by learning about htmlwidgets, for example how to create your own widget.

    Alright, back to answering the original question. This uses all of the content I originally provided, except that the following call to onRender replaces the earlier one.

    htmlwidgets::onRender(plt, '
                          function(el, x) {
                            d3.select(el).selectAll(".node text").each(function(d){
                              var arr, val, anc;
                              // leading space: the first piece is empty, so the name keeps the first line
                              arr = " " + d.Perc + " " + d.Value;
                              arr = arr.split(" ");
                              val = d3.select(this).attr("x");
                              anc = d3.select(this).attr("text-anchor"); 
                              for(var i = 0; i < arr.length; i++) {
                                d3.select(this).append("tspan")
                                    .text(arr[i])
                                    .attr("dy", i ? "1.2em" : 0)
                                    .attr("x", val)
                                    .attr("text-anchor", anc)
                                    .attr("class", "tspan" + i);
                              }
                            })
                          }')
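
    The part that is easy to miss is what the split produces. The sketch below repeats only that step in plain JavaScript, without D3. The helper name extraLines is hypothetical, and the node values are assumed from the data above.

```javascript
// Hypothetical helper: the pieces that the loop turns into <tspan> elements,
// together with the dy offset each one receives.
function extraLines(d) {
  const parts = (" " + d.Perc + " " + d.Value).split(" ");
  return parts.map((text, i) => ({ text: text, dy: i ? "1.2em" : 0 }));
}

console.log(extraLines({ name: "Her", Perc: "35.35%", Value: 550 }));
// [ { text: '', dy: 0 },
//   { text: '35.35%', dy: '1.2em' },
//   { text: '550', dy: '1.2em' } ]
```

    The leading space yields an empty first piece, so the first tspan adds nothing and the node name keeps the first line. Every later tspan resets x to the label's original position and moves down by 1.2em, which is what stacks the percentage and the value under the name. Because the string is split on spaces, a value that itself contains a space would be spread over several lines.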
    

    enter image description here