d3.js, dc.js, crossfilter

Group by multiple dimensions recursively in dc.js?


dc.js has been great, and now I'm trying to understand how to use it for data with multiple dimensions.

I have time series data (csv), which contains the number of people that fit a certain attribute on a given day - e.g. the number of brown-haired people age 65+. A simplified version of it looks like this (There are 5 options for hair color, 5 for age group, and about 200 dates):

Date, Hair Color, 0-18, 19-39, 40-64, 65+
1/1/21, Brown, 5, 3, 10, 2
1/1/21, Blonde, 15, 2, 4, 1
1/2/21, Brown, 2, 8, 0, 2
1/2/21, Blonde, 11, 6, 7, 4
...

I'd like to be able to plot the cumulative counts over time for each sub-population. The complication is that I'd like to show

  1. A plot aggregated by hair color (so summing over all age groups):

     [Image: all hair colors plot mockup]

     which can then be toggled (ideally by clicking on one of the lines) to show:

  2. A plot for a given hair color, disaggregated by age group:

     [Image: individual hair color plot mockup]

(Note that in the mockups, I'm normalizing counts to show it as a cumulative percentage. I've been doing that calculation straightforwardly with valueAccessors.)

My question is: how do I create the dimensions and groups to create these plots?
I'd prefer not to create individual variables for each age group (I'd like it to be generic enough to expand to finer categories). But I'm having trouble understanding how to use reduce and filters to achieve my desired outcome.

Also, should I be doing it all as linecharts in a compositeChart, or in a series chart? There is the added wrinkle that I plan to then annotate the chart with extra trendlines added in from d3.

Thanks!


Solution

  • The series chart is a convenience class that generates a composite chart underneath.

    It allows you to specify your data using a 2D key, where one component is the key to be used for the X values in the chart, and one component is another key to be used for splitting the data into multiple layers - lines, in your case. You also give it the "prototype" of the layer chart, in the form of a function that returns a partially-initialized chart.

    It sounds like you are on the right track, so I won't attempt to give a complete answer, just a few hints. Please feel free to follow up in the comments, and I will edit this answer to fill in details.

    Flattening the data

    You will probably want to flatten your data so that there is only one value per row, i.e. structure it with an Age column and a Value column. This is a general best practice for working with crossfilter.
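
    As a rough illustration of that reshaping, here is a minimal sketch in plain JavaScript. The helper name flattenRows and the output column names Age and Value are assumptions made for this example; the id columns are taken from the sample CSV in the question, and every other column is treated as an age group, so adding finer categories should not require code changes.

```javascript
// Hypothetical helper: turn each wide row (one column per age group) into
// several long rows with explicit Age and Value columns.
function flattenRows(wideRows, idColumns = ['Date', 'Hair Color']) {
  return wideRows.flatMap(row => {
    // Copy the identifying columns (date, hair color) into every output row.
    const base = Object.fromEntries(idColumns.map(col => [col, row[col]]));
    return Object.keys(row)
      .filter(col => !idColumns.includes(col)) // everything else is an age group
      .map(col => ({
        ...base,
        Age: col,
        Value: +row[col], // CSV parsers usually return strings, so coerce
      }));
  });
}

// Two rows from the sample data in the question, as a CSV parser might return them.
const wide = [
  {Date: '1/1/21', 'Hair Color': 'Brown', '0-18': '5', '19-39': '3', '40-64': '10', '65+': '2'},
  {Date: '1/1/21', 'Hair Color': 'Blonde', '0-18': '15', '19-39': '2', '40-64': '4', '65+': '1'},
];
const flat = flattenRows(wide);
```

    The flattened array is what you would then pass to crossfilter, and a dimension on Age becomes possible without naming any age group in code.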

    It's possible to work with the data as you have it, but

    Using multikeys and series chart

    Following the series chart example, you might define your dimension as

    const colorDateDimension = cf.dimension(d => [d['Hair Color'], d.Date]);
    

    Now any group on this dimension will aggregate by both hair color and date.
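
    Since the question asks for cumulative counts, one possible approach is a "fake group": an object with an all() method that wraps the real group and turns each series' values into running totals. The sketch below is only an illustration under some assumptions: the wrapper name accumulateBySeries is made up, keys are assumed to be [series, date] as in the dimension above, and the dates are assumed to have been parsed into Date objects (strings like '1/1/21' would not sort chronologically). The stub group stands in for something like colorDateDimension.group().reduceSum(d => d.Value) on flattened data.

```javascript
// Wrap a crossfilter-style group (anything with .all()) so that values become
// running totals within each series. Keys are assumed to be [series, date].
function accumulateBySeries(group) {
  return {
    all() {
      const totals = new Map(); // series -> running total so far
      return group.all()
        .slice() // copy, so the group's own array is not reordered
        .sort((a, b) => a.key[1] - b.key[1]) // chronological; dates assumed to be Date objects
        .map(({key, value}) => {
          const total = (totals.get(key[0]) || 0) + value;
          totals.set(key[0], total);
          return {key, value: total};
        });
    },
  };
}

// Stub group holding the per-day sums from the sample data in the question.
const stubGroup = {
  all: () => [
    {key: ['Blonde', new Date(2021, 0, 1)], value: 22},
    {key: ['Blonde', new Date(2021, 0, 2)], value: 28},
    {key: ['Brown', new Date(2021, 0, 1)], value: 20},
    {key: ['Brown', new Date(2021, 0, 2)], value: 12},
  ],
};
const cumulative = accumulateBySeries(stubGroup).all();
```

    The wrapped object could be passed to the chart's .group() in place of the real group; because all() is recomputed on every call, the totals should follow whatever filters are currently applied.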

    Now if you're using the series chart, you can extract the components with

    chart
      .seriesAccessor(({key}) => key[0])
      .keyAccessor(({key}) => key[1])
    

    You could use the third parameter of the function you pass to the series chart's .chart() method, which is the series key, to determine the color or dash style of the layer, e.g. when the series key is the age group:

    const dashStyles = {
      '0-18': [3,1],
      '19-39': [4,1,1,1],
      // ...
    };

    chart.chart(function(c, _, subkey) {
      // subkey is the series key returned by seriesAccessor
      return new dc.LineChart(c).dashStyle(dashStyles[subkey]);
    });
    

    Interaction

    dc.js does not natively support the kind of drill-down you are describing. It would be easier to have one chart which is by hair color and another chart which is by age. Then when no hair color is selected, the age chart will show all hair colors, and when no age is selected, the hair color chart will show all ages.

    If you want drill-down as you describe, you will have to write custom code to apply the filter and swap the chart definition when a hair color is clicked. It's not terribly complicated, but please ask a follow-up question if you can't figure it out; it's better to keep Stack Overflow questions on a single topic.

    Annotating with D3

    This part is pretty simple no matter how you implement the charts.

    You will implement a pretransition handler and use chart.selectAll to add the content you need. There are many examples here on SO, so I won't go into it here.
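
    For orientation, a minimal sketch of that pattern might look like the following. The function names trendlinePath and attachTrendline, the CSS class trendline, and the shape of the points array are all assumptions made for this example. It also assumes the chart exposes x() and y() scales and contains a g.chart-body element, as coordinate-grid charts do; with a composite or series chart you may need to pick a different container.

```javascript
// Build an SVG path string from data-space points using the chart's scales.
function trendlinePath(points, xScale, yScale) {
  return points
    .map((p, i) => `${i === 0 ? 'M' : 'L'}${xScale(p.x)},${yScale(p.y)}`)
    .join('');
}

// Register a pretransition handler that draws (or updates) one extra path.
// `points` is hypothetical trendline data: [{x: <x value>, y: <y value>}, ...].
function attachTrendline(chart, points) {
  chart.on('pretransition', c => {
    const path = c.select('g.chart-body')
      .selectAll('path.trendline')
      .data([points]); // a single datum, so at most one path element
    path.enter()
      .append('path')
        .attr('class', 'trendline')
        .attr('fill', 'none')
        .attr('stroke', 'black')
      .merge(path)
        // Recompute on every redraw so the line follows the current scales.
        .attr('d', pts => trendlinePath(pts, c.x(), c.y()));
  });
  return chart;
}
```

    Because the handler runs before each transition, the annotation is redrawn with the current scales whenever the chart is rendered or redrawn.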

    Conclusion

    I hope this gets you started. I've answered your specific question and given some hints about other assumptions or implicit questions within your question. It will be some work to get the results you want, but it is definitely possible.