dc.js crossfilter

Getting Top n Of Complex Reduce


I think I'm missing something obvious. Using DC.JS's datatable implementation:

  var salesExpenseByCompany = dc.dataTable("#salesExpenseByCompany");

  var salesExpenseDim = facts.dimension(function (d) {
    return d.client;
  });

  function reduceAdd(i, d) {
    i.sales = i.sales + d.sales;
    i.expenses = i.expenses + d.expenses;
    return i;
  }
  function reduceRemove(i, d) {
    i.sales = i.sales - d.sales;
    i.expenses = i.expenses - d.expenses;
    return i;
  }

  function reduceInitial() {
    return {
      sales: 0,
      expenses: 0,
    };
  }

This, of course, returns a new object, summing up each property in the data.
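
To make the shape of that output concrete, here is a minimal sketch in plain JavaScript that mimics what a grouped reduce does, without using crossfilter itself. The `rows` array and the `groupReduce` helper are hypothetical stand-ins for illustration only; a real crossfilter group maintains these bins incrementally as filters change.

```javascript
// Hypothetical sample records (not the data from the pen).
const rows = [
  { client: "A", sales: 100, expenses: 40 },
  { client: "B", sales: 50, expenses: 10 },
  { client: "A", sales: 25, expenses: 5 },
];

// Same reducers as in the question.
function reduceAdd(i, d) {
  i.sales = i.sales + d.sales;
  i.expenses = i.expenses + d.expenses;
  return i;
}
function reduceInitial() {
  return { sales: 0, expenses: 0 };
}

// Illustrative helper (not a crossfilter API): fold records into
// { key, value } bins, one bin per distinct key.
function groupReduce(records, keyFn, add, initial) {
  const bins = new Map();
  for (const d of records) {
    const key = keyFn(d);
    if (!bins.has(key)) bins.set(key, initial());
    bins.set(key, add(bins.get(key), d));
  }
  return [...bins].map(([key, value]) => ({ key, value }));
}

const bins = groupReduce(rows, (d) => d.client, reduceAdd, reduceInitial);
console.log(JSON.stringify(bins));
```

Each bin's `value` is an object rather than a single number, which matters for how the bins can be ordered later.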

  [
    {'key':'B','value':{'sales':1544,'expenses':478}},
    {'key':'C','value':{'sales':2781,'expenses':1354}},
    {'key':'D','value':{'sales':1196,'expenses':987}},
    {'key':'E','value':{'sales':1156,'expenses':622}},
    {'key':'F','value':{'sales':1778,'expenses':1208}},
    {'key':'G','value':{'sales':666,'expenses':55}},
    {'key':'A','value':{'sales':2318,'expenses':801}}
  ]

When I load it into the data table, I want the top 10, 25, or 50 companies by sales (set via buttons), sorted in descending order, but I'm not sure how to get that. Here's what I have so far:

  salesExpenseByCompany
    .dimension(salesExpenseGr)
    .size(4)
    .order(d3.descending)
    .sortBy(function (d) {
      return d.value.sales;
    })
    .columns([
      {
        label: "Company Name",
        format: function (d) {
          return d.key;
        },
      },
      {
        label: "Sales",
        className: "text-right",
        format: function (d) {
          return numFormat(d.value.sales);
        },
      },
      {
        label: "Expenses",
        className: "text-right",
        format: function (d) {
          return numFormat(d.value.expenses);
        },
      },
    ]);

The .size() option seems to take the first four entries of the dimension and then sort those four by the chosen column.

With .size(4):

[screenshot of the table]

Without .size():

[screenshot of the table]

Company F should be in the table when the size is limited to 4.

So what am I missing? I suspect this has to do with the .top() option, but I'm not sure where to put it.

https://codepen.io/jlbmagic/pen/WNxGWXZ

See the attached Codepen.

THANKS!


Solution

  • The group ordering should usually agree with the sorting of the data table:

      var salesExpenseGr = salesExpenseDim
        // ...
        .order(({sales}) => sales);
    

    If you used a simple reduction (a single number per bin), I think the group would be sorted as you expect. But crossfilter's default group ordering compares the reduced values directly, and it doesn't know how to compare objects, so no sorting happens and the bins are left in alphabetical order by key.
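
    As a rough illustration of why objects can't be ordered directly, here is a plain JavaScript sketch that does not use crossfilter at all. The values are taken from bins F and D in the question; `bySales` is just a local name for the same kind of accessor passed to `.order()` above.

```javascript
// Two reduced values shaped like the bins in the question.
const f = { sales: 1778, expenses: 1208 };
const d = { sales: 1196, expenses: 987 };

// Comparing the objects directly: both are coerced to the string
// "[object Object]", so neither is less than or greater than the other.
const objectsComparable = f < d || f > d;

// An accessor that picks out a number gives values that do compare.
const bySales = ({ sales }) => sales;
const accessorComparable = bySales(f) > bySales(d);

console.log(objectsComparable, accessorComparable); // false true
```

    With nothing to tell the two objects apart, any ordering based on comparing them has no information to work with, which is why the group needs an explicit accessor.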

    If the sorts agree, F is included instead of D:

    [screenshot: table with F]

    (The value of F differs between your pen and your screenshots.)

    Fork of your codepen.
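
    The difference between the two outcomes can be modelled in plain JavaScript. This is only a sketch of the behaviour described above, not how dc.js is actually implemented; it assumes the bins reach the table in key order and uses the values listed in the question.

```javascript
// Bins from the question, listed in key order (an assumption, see above).
const bins = [
  { key: "A", value: { sales: 2318, expenses: 801 } },
  { key: "B", value: { sales: 1544, expenses: 478 } },
  { key: "C", value: { sales: 2781, expenses: 1354 } },
  { key: "D", value: { sales: 1196, expenses: 987 } },
  { key: "E", value: { sales: 1156, expenses: 622 } },
  { key: "F", value: { sales: 1778, expenses: 1208 } },
  { key: "G", value: { sales: 666, expenses: 55 } },
];

const bySalesDesc = (a, b) => b.value.sales - a.value.sales;
const keys = (rows) => rows.map((r) => r.key).join("");

// Limit first, then sort: models the case where the group ordering
// does not agree with the table's sortBy.
const limitThenSort = bins.slice(0, 4).sort(bySalesDesc);

// Sort first, then limit: models the case where both orderings agree.
const sortThenLimit = [...bins].sort(bySalesDesc).slice(0, 4);

console.log(keys(limitThenSort)); // CABD
console.log(keys(sortThenLimit)); // CAFB
```

    In the first case D makes the cut only because it comes early in key order; in the second, F takes its place.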

    Historical note

    It would be reasonable to ask why dc.js has footguns like this one. It has to do with the evolution of the library. Early on, dc.js tried to use functionality from crossfilter whenever possible, so .size() uses dimension.top() (which is group.top() here). But later on, people found they wanted to sort data in ways not possible using just crossfilter objects, so .sortBy() was added to the data table (and .ordering() for other charts).

    Capped charts such as the row and pie charts have stopped using .top() for this reason, but the data table has never been cleaned up.