javascript css d3.js

D3 Stacked Bar Chart Example Mechanism Specifics


I'm currently trying to make a stacked bar chart in D3. I've found the following links with examples: https://d3js.org/d3-shape/stack#_stack , https://observablehq.com/@d3/stacked-bar-chart/2 , https://d3-graph-gallery.com/graph/barplot_stacked_basicWide.html .

While I think I can just copy paste my data into this format and have it work, I'm not sure on the specifics of several things & would love to know the actual inner mechanisms, and Google isn't able to understand my question.

  1. The simple example creates the stack with the following code:
var stackedData = d3.stack()
        .keys(subgroups)
        (data)

I'm trying to figure out what that (data) is doing at the end: as in, it's not parameters being passed into any method's parentheses, because all the methods' parentheses have already been opened and closed, so where is it getting passed to if its parentheses are not attached to a method? If there's just a link to the documentation on this or something that would be great.

  2. The example of rectangle generation that the official docs give is:
svg.append("g")
  .selectAll("g")
  .data(series)
  .join("g")
    .attr("fill", d => color(d.key))
  .selectAll("rect")
  .data(D => D)
  .join("rect")
    .attr("x", d => x(d.data[0]))
    .attr("y", d => y(d[1]))
    .attr("height", d => y(d[0]) - y(d[1]))
    .attr("width", x.bandwidth());

2a) This isn't that consequential but I'm just curious - why is it convention to use capital D in .data(D => D) instead of the usual lowercase d?

2b) In x(d.data[0]), what is "data"? I've used Inspect Element on the Observable example and found that it does contain a data array inside its __data__, but I'm not sure how it got there. Is it because of how the index() function used before it works? As in, usually, there is no array called "data" inside d, right?

2c) Why is height calculated by y(d[0]) - y(d[1]), not the other way around? d[1] is bigger than d[0] for each item, so that makes it negative. Is it something to do with the y axis increasing as it goes down on a webpage? But how can height be negative, regardless?

Thank you!!


Solution

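  • 1) What is (data) doing at the end?

    (data) calls the function returned by d3.stack().keys(subgroups). d3.stack() returns a "stack generator", which is itself a function. .keys(subgroups) configures that generator and returns the same generator again, so the trailing (data) simply invokes it with data as its argument. This is often loosely called "currying", but strictly speaking it is just a function that returns a function.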

    It's equivalent to:

    var stackFunction = d3.stack().keys(subgroups);
    var stackedData = stackFunction(data);
    
    // stackedData result will be something like: 
    // [[ [0, 10], [0, 15] ], // subgroups[0]
    // [ [10, 30], [15, 40] ]] // subgroups[1]
    

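    I couldn't find explicit documentation for this pattern on MDN. The call itself is documented as stack(data) on the D3 page you linked (https://d3js.org/d3-shape/stack#_stack). For general background on functions that return functions, this article on currying, a related technique, is a reasonable starting point: https://builtin.com/software-engineering-perspectives/currying-javascript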
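
    The chained call can be mimicked in plain JavaScript. The sketch below is a simplified stand-in, not D3's actual implementation: makeStack and the sample rows are hypothetical names and values chosen for illustration. It only shows why a trailing (data) works when the thing returned by .keys(...) is itself a function.

```javascript
// Simplified stand-in for a D3-style generator (hypothetical, NOT D3's code).
function makeStack() {
  let keys = [];

  // The generator is a plain function: calling it does the stacking.
  function stack(data) {
    const totals = data.map(() => 0); // running total per input row
    return keys.map((key) =>
      data.map((row, i) => {
        const lower = totals[i];
        const upper = lower + row[key];
        totals[i] = upper;
        return [lower, upper];
      })
    );
  }

  // Configuration methods are properties of that function and return it,
  // which is what allows chaining and then calling the result.
  stack.keys = function (newKeys) {
    keys = newKeys;
    return stack;
  };

  return stack;
}

// Hypothetical sample rows.
const sampleRows = [
  { a: 10, b: 20 },
  { a: 15, b: 25 },
];

// Chained form, as in the question.
const chained = makeStack().keys(["a", "b"])(sampleRows);

// Equivalent two-step form.
const generator = makeStack().keys(["a", "b"]);
const twoStep = generator(sampleRows);
```

    Both forms produce the same nested arrays, because .keys(...) hands back the generator function in each case.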

    2a) Why capital D instead of lowercase d?

    You are correct that this is not consequential for the code. I believe it is purely a naming preference, used to distinguish the series array D (an array of points) from the individual point d used in the .attr calls below it.

    2b) What is 'data' in x(d.data[0])?

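    You are partly correct. The .data property is attached by the stack generator itself, but its shape (an array whose first element is the date) comes from the d3.index() call whose result is passed to the generator.

    To understand this, you'll need to refer to the earlier code in the documentation you linked (https://d3js.org/d3-shape/stack#_stack). The 'data' here is a property of the point d (i.e. d.data), not a separate variable.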

    First, the series is created by:

    const series = d3.stack()
        .keys(d3.union(data.map(d => d.fruit))) // apples, bananas, cherries, …
        .value(([, group], key) => group.get(key).sales)
      (d3.index(data, d => d.date, d => d.fruit));
    

    Then each series is bound to a g element, and the points of that series are bound to the rect elements by:

    .data(D => D)
    

    Each d is a two-element array: the lower bound of the bar segment, d[0], and the upper bound, d[1]. The stack generator also attaches a .data property to each point, which refers to the corresponding element of the input. Because the input here is the Map returned by d3.index (keyed by date), that element is a [date, Map] entry. For the first apples segment, d looks like this:

    [0, // bottom of the stack for the apples bar segment
    3840, // top of the stack for the apples bar segment
    data: [new Date("2015-01-01"), Map {"apples" => {date: ..., fruit: "apples", sales: 3840}, "bananas" => {...}, "cherries" => {...}, "durians" => {...}}]
    ]
    

    Therefore, d.data[0] is the date key of that entry, which is the value the x scale expects.

    Here is the reference to the index method data structure: https://www.geeksforgeeks.org/d3-js-index-method/
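
    To see why d.data[0] is the date, it helps to rebuild the structure by hand. The sketch below uses plain JavaScript only. The byDate Map is a hand-built approximation of what a two-level index returns, the bananas row is a made-up value, and dates are plain strings to keep the example simple, so treat it as an illustration under those assumptions rather than D3's own code.

```javascript
// Hypothetical rows shaped like the docs example (dates as strings for simplicity).
const rows = [
  { date: "2015-01-01", fruit: "apples", sales: 3840 },
  { date: "2015-01-01", fruit: "bananas", sales: 1920 }, // made-up value
];

// Hand-built two-level index: date -> (fruit -> row).
const byDate = new Map();
for (const row of rows) {
  if (!byDate.has(row.date)) byDate.set(row.date, new Map());
  byDate.get(row.date).set(row.fruit, row);
}

// Iterating a Map yields [key, value] pairs, so each input "element"
// seen by a stack generator is a [date, Map] entry.
const [firstEntry] = byDate;

// A stacked point is a two-element array with an extra property attached.
const point = [0, 3840];
point.data = firstEntry;

const dateForX = point.data[0];                     // the date key
const applesRow = point.data[1].get("apples");      // the original row
```

    Arrays are objects in JavaScript, so attaching a named property such as data does not change their length or their numeric elements.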

    2c) Why is the height calculated by y(d[0]) - y(d[1])?

    This is because D3 draws into SVG, whose coordinate system is similar to that of canvas: the top-left corner is (0,0), as shown here: (https://developer.mozilla.org/en-US/docs/Web/SVG/Tutorials/SVG_from_scratch/Positions). Larger y values are lower on the screen and smaller y values are higher, so the order of the subtraction has to be inverted.

    This also depends on the D3 y scale setting. For example, we can set:

    y = d3.scaleLinear().domain([0, 100]).range([400, 0])
    // y = d3.scaleLinear().domain([dataMin,dataMax]).range([chartHeight, 0])
    // this domain and range index can be flipped/inverted as desired.
    

    The data value 0 maps to a screen (canvas) y of 400px (the bottom of the SVG).
    The data value 100 maps to a screen (canvas) y of 0px (the top of the SVG).

    If we have:

    d[0] = 20
    d[1] = 50
    

    Passed to the y function, it will be:

    y(d[0]) - y(d[1]) == 120 //true
    //equivalent to y(20) - y(50)
    //equivalent to (400 - (20/100 * 400))-(400 - (50/100 * 400))
    //subtraction from 400 is required since our range goes down from 400 to 0
    

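    It is correct that d[0] < d[1], but y(d[0]) > y(d[1]), because the range is inverted by y = d3.scaleLinear().domain([0, 100]).range([400, 0]). The height is therefore positive, not negative, and y(d[1]), the smaller pixel value, is the top edge of the rectangle, which is why it is used for the y attribute.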
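
    The same arithmetic can be checked without D3. The linearScale function below is a minimal hypothetical stand-in for d3.scaleLinear() (it only does the linear mapping, with no clamping or ticks), using the same domain and range as above.

```javascript
// Minimal linear scale: a stand-in for d3.scaleLinear(), not its implementation.
function linearScale([d0, d1], [r0, r1]) {
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Domain 0..100 maps onto pixels 400..0 (inverted, because SVG y grows downward).
const y = linearScale([0, 100], [400, 0]);

const d = [20, 50];                        // [lower bound, upper bound] in data units
const rectTop = y(d[1]);                   // pixel y of the segment's top edge
const rectBottom = y(d[0]);                // pixel y of the segment's bottom edge
const rectHeight = rectBottom - rectTop;   // y(d[0]) - y(d[1])

console.log(rectTop, rectBottom, rectHeight); // 200 320 120
```

    Swapping the operands would give -120, which is why the example subtracts in this order.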