Tags: javascript, r, d3.js, r2d3

Reading in external data in d3 JavaScript - an R r2d3 Use Case


EDIT: link for all data/code used in example: https://drive.google.com/open?id=16MpDptwV7m4nOkoT3ImlKffl4rYqc5ms

Hello friends and roasters alike,

I'm about as novice as can be with D3 visualization. My background is all in Plotly and integrated R platform plots. I have written very very light js/css for Shiny apps, but I'm trying to branch out into more custom and free visual methods.

So I've been diving through the r2d3 package for d3 integration in R. I've searched through all of the examples and pored through whatever documentation I could find in the master repo and overview pages here: https://rstudio.github.io/r2d3/articles/gallery/calendar/

But for the life of me, I simply can't wrap my head around how the JS is actually pulling in the data.

An example here: the visual, followed by the script that produces it, and finally the CSV provided as the data source to visualize.

Visual: Visual Constructed

calendar.js script:

// !preview r2d3 data = read.csv("dji-latest.csv"), d3_version = 4, container = "div", options = list(start = 2006, end = 2011)

// Based on https://bl.ocks.org/mbostock/4063318

var height = height / (options.end - options.start),
    cellSize = height / 8;

var formatPercent = d3.format(".1%");

var color = d3.scaleQuantize()
    .domain([-0.05, 0.05])
    .range(["#a50026", "#d73027", "#f46d43", "#fdae61", "#fee08b", "#ffffbf", "#d9ef8b", "#a6d96a", "#66bd63", "#1a9850", "#006837"]);

var svg = div
  .style("line-height", "0")
  .selectAll("svg")
  .data(d3.range(options.start, options.end))
  .enter().append("svg")
    .attr("width", width)
    .attr("height", height)
  .append("g")
    .attr("transform", "translate(" + cellSize * 3.5 + "," + (height - cellSize * 7 - 1) + ")");

svg.append("text")
    .attr("transform", "translate(-" + (6 * height / 60) + "," + cellSize * 3.5 + ")rotate(-90)")
    .attr("font-family", "sans-serif")
    .attr("font-size", 2 + 8 * height / 60)
    .attr("text-anchor", "middle")
    .text(function(d) { return d; });

var rect = svg.append("g")
    .attr("fill", "none")
    .attr("stroke", "#ccc")
    .attr("stroke-width", "0.25")
  .selectAll("rect")
  .data(function(d) { return d3.timeDays(new Date(d, 0, 1), new Date(d + 1, 0, 1)); })
  .enter().append("rect")
    .attr("width", cellSize)
    .attr("height", cellSize)
    .attr("x", function(d) { return d3.timeWeek.count(d3.timeYear(d), d) * cellSize; })
    .attr("y", function(d) { return d.getDay() * cellSize; })
    .datum(d3.timeFormat("%Y-%m-%d"));

svg.append("g")
    .attr("fill", "none")
    .attr("stroke", "#000")
    .attr("stroke-width", "0.25")
  .selectAll("path")
  .data(function(d) { return d3.timeMonths(new Date(d, 0, 1), new Date(d + 1, 0, 1)); })
  .enter().append("path")
    .attr("d", pathMonth);

r2d3.onRender(function(csv, div, width, height, options) {
  var data = d3.nest()
      .key(function(d) { return d.Date; })
      .rollup(function(d) { return (d[0].Close - d[0].Open) / d[0].Open; })
    .object(csv);

  rect.filter(function(d) { return d in data; })
      .attr("fill", function(d) { return color(data[d]); })
    .append("title")
      .text(function(d) { return d + ": " + formatPercent(data[d]); });
});

function pathMonth(t0) {
  var t1 = new Date(t0.getFullYear(), t0.getMonth() + 1, 0),
      d0 = t0.getDay(), w0 = d3.timeWeek.count(d3.timeYear(t0), t0),
      d1 = t1.getDay(), w1 = d3.timeWeek.count(d3.timeYear(t1), t1);
  return "M" + (w0 + 1) * cellSize + "," + d0 * cellSize
      + "H" + w0 * cellSize + "V" + 7 * cellSize
      + "H" + w1 * cellSize + "V" + (d1 + 1) * cellSize
      + "H" + (w1 + 1) * cellSize + "V" + 0
      + "H" + (w0 + 1) * cellSize + "Z";
}

And this is the .csv fed in

screenshot of .csv pulled from github

I know this comes down to my own limited understanding of JS function calls and data handling, but it is stumping me to no end. I can see some .data calls and function calls in there, but nowhere do I find any indication of what this visualization is supposed to pick up. How does it know which of the columns denotes the dates? Where is the variable specified to actually visualize?

Any inkling of help here would be immensely appreciated. I've gotten some d3 tutorials on my horizon, but any pointers can at least get me playing with the sandboxes those smarter than I have built :)

Thank you!


Solution

  • I know it is an old post... but I ended up here and I think it may be a good idea to write something for further reference.

    How does it know which of the columns denotes the dates? Where is the variable specified to actually visualize?

    This example is a bit tricky (or at least misleading) for beginners, but the piece of code specifying the variables/dates is this one:

    var data = d3.nest()
          .key(function(d) { return d.Date; })
          .rollup(function(d) { return (d[0].Close - d[0].Open) / d[0].Open; })
        .object(csv);
    

    You can see what exactly d3.nest does here. In a nutshell, R passes the data argument (named csv on the JS side) to JavaScript by translating the table in dji-latest.csv into a JS-friendly object, like (in R syntax):

    data <- list(
      list(Date = "2010-10-01", Open = 10789.72, High = ...),
      list(Date = "2010-09-30", Open = 10789.72, High = ...)
    )
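
    On the JS side, the same structure would look roughly like the sketch below. This is an illustration only: the numbers are made up rather than taken from dji-latest.csv, and the array-of-row-objects shape is inferred from the way the script reads d.Date, d.Open and d.Close.

```javascript
// Hypothetical rows standing in for dji-latest.csv; the numbers are invented
// for illustration and are not taken from the real file.
const csv = [
  { Date: "2010-10-01", Open: 100, Close: 110 },
  { Date: "2010-09-30", Open: 200, Close: 190 }
];

// Each row is a plain object, so a column is read by its header name.
const firstRow = csv[0];
const dates = csv.map(function (d) { return d.Date; });

console.log(firstRow.Date);
console.log(dates);
```

    The column names in the CSV header become the property names, which is why the script can refer to d.Date without declaring it anywhere.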
    

    The specific variables are then selected via d.Date, d.Close and d.Open in the key and rollup function definitions.
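
    To make the key/rollup step less magical, here is a plain-JavaScript sketch of what that d3.nest chain computes. It is a stand-in written without d3, the function name dailyChangeByDate is hypothetical, and the rows are invented.

```javascript
// Hypothetical rows; values are invented for illustration.
const csv = [
  { Date: "2010-10-01", Open: 100, Close: 110 },
  { Date: "2010-09-30", Open: 200, Close: 190 }
];

// Plain-JavaScript stand-in for
// d3.nest().key(...).rollup(...).object(csv) in the script above.
function dailyChangeByDate(rows) {
  const grouped = {};
  rows.forEach(function (d) {
    // key(): group the rows by their Date column
    if (!(d.Date in grouped)) grouped[d.Date] = [];
    grouped[d.Date].push(d);
  });

  const result = {};
  Object.keys(grouped).forEach(function (date) {
    const group = grouped[date];
    // rollup(): reduce each group to a single number, the relative change
    result[date] = (group[0].Close - group[0].Open) / group[0].Open;
  });
  return result; // object(): a plain object keyed by date string
}

const data = dailyChangeByDate(csv);
console.log(data);
console.log("2010-10-01" in data);
```

    The result is an object keyed by date string. In the script, each rect carries a date formatted as "%Y-%m-%d", so rect.filter(function(d) { return d in data; }) keeps only the days that have a row, and color(data[d]) picks the fill.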

    Note that csv in the function above refers to the data passed from R because it is the first argument of the function given to r2d3.onRender; the name itself is arbitrary, and that may be the source of confusion. On the JS side, csv is fed to nest to produce the nested data object required for this specific visualization.
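
    The point about the argument name can be shown with a small mock. The fakeR2d3 object below is a hypothetical stand-in, not the real r2d3 source; it only assumes what the paragraph above states, namely that the callback receives the data as its first argument.

```javascript
// Hypothetical stand-in for r2d3: it stores a callback and later calls it
// with the data as the first argument. This is NOT the real r2d3 code.
const fakeR2d3 = {
  callback: null,
  onRender: function (fn) { this.callback = fn; },
  render: function (data, div, width, height, options) {
    return this.callback(data, div, width, height, options);
  }
};

// Invented row, standing in for what R would send.
const rowsFromR = [{ Date: "2010-10-01", Open: 100, Close: 110 }];

// The first parameter could be called csv, data, or anything else;
// only its position matters.
let seenDates = [];
fakeR2d3.onRender(function (anyNameYouLike, div, width, height, options) {
  seenDates = anyNameYouLike.map(function (d) { return d.Date; });
});

fakeR2d3.render(rowsFromR, null, 600, 400, { start: 2006, end: 2011 });
console.log(seenDates);
```

    Renaming csv to data in calendar.js would therefore change nothing, as long as the body of the callback uses the same name.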

    As others have said, it is hard to give a better explanation than reading the docs and this example is pretty straightforward.