javascript arrays aggregate

Customized data aggregation using vanilla JavaScript


Could you please help me with the following data aggregation problem?

Here's my initial array:

const data = [
    { date: '2023-11-01', type: 'A', casualties: 10, state: 'NY', country: 'USA', site: 'hash1' },
    { date: '2023-11-01', type: 'B', casualties: 5, state: 'NY', country: 'USA', site: 'hash2' },
    { date: '2023-11-02', type: 'A', casualties: 15, state: 'NY', country: 'USA', site: 'hash3' },
    { date: '2023-11-01', type: 'C', casualties: 20, state: 'NY', country: 'USA', site: 'hash4' },
    { date: '2023-11-02', type: 'B', casualties: 8, state: 'NY', country: 'USA', site: 'hash5' },
    { date: '2023-11-03', type: 'A', casualties: 25, state: 'NY', country: 'USA', site: 'hash6' },
    { date: '2023-11-01', type: 'D', casualties: 12, state: 'NY', country: 'USA', site: 'hash7' },
];

I would like to aggregate some of the rows under the type other, depending on the casualties value.

If a given type has casualties equal to or higher than the threshold on at least one date, it should be part of the results array.

If a given type has casualties below the threshold for all dates, but it is the only such type, it should also be part of the results array.

If two or more types have casualties below the threshold for all dates, they should not appear individually in the results array. In that case, their casualties are summed for each date and the type value is other.

The otherCasualties array should contain all the types that were aggregated into other this way.
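The classification step described by these rules could be sketched as follows. This is only a minimal sketch: the helper name `findLowTypes` and the trimmed-down `sample` rows are illustrative assumptions, and it assumes (as the examples suggest) that a type is kept whenever at least one of its rows reaches the threshold.

```javascript
// Hypothetical helper, not part of the original attempt.
// Returns the types that would be folded into 'other'.
function findLowTypes(rows, threshold) {
  // Track the highest casualties value seen for each type.
  const maxByType = new Map();
  for (const { type, casualties } of rows) {
    const seen = maxByType.has(type) ? maxByType.get(type) : -Infinity;
    maxByType.set(type, Math.max(seen, casualties));
  }

  // A type is "low" when even its highest value stays below the threshold.
  const low = [...maxByType]
    .filter(([, max]) => max < threshold)
    .map(([type]) => type);

  // A single low type is kept as is, so nothing is aggregated in that case.
  return low.length >= 2 ? low : [];
}

// Same dates, types and casualties as the data above, other fields omitted.
const sample = [
  { date: '2023-11-01', type: 'A', casualties: 10 },
  { date: '2023-11-01', type: 'B', casualties: 5 },
  { date: '2023-11-02', type: 'A', casualties: 15 },
  { date: '2023-11-01', type: 'C', casualties: 20 },
  { date: '2023-11-02', type: 'B', casualties: 8 },
  { date: '2023-11-03', type: 'A', casualties: 25 },
  { date: '2023-11-01', type: 'D', casualties: 12 },
];

console.log(findLowTypes(sample, 9));  // []
console.log(findLowTypes(sample, 13)); // [ 'B', 'D' ]
```

With a threshold of 9 only B stays below it on every date, so it is left alone; with 13 both B and D do, so they are the ones to merge.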

Examples:

If the casualtiesThreshold is 9, the results array should be:

results = [
    { date: '2023-11-01', type: 'A', casualties: 10, state: 'NY', country: 'USA' },
    { date: '2023-11-01', type: 'B', casualties: 5, state: 'NY', country: 'USA' },
    { date: '2023-11-02', type: 'A', casualties: 15, state: 'NY', country: 'USA' },
    { date: '2023-11-01', type: 'C', casualties: 20, state: 'NY', country: 'USA' },
    { date: '2023-11-02', type: 'B', casualties: 8, state: 'NY', country: 'USA' },
    { date: '2023-11-03', type: 'A', casualties: 25, state: 'NY', country: 'USA' },
    { date: '2023-11-01', type: 'D', casualties: 12, state: 'NY', country: 'USA' },
];

and the otherCasualties array should be [].

If the casualtiesThreshold is 13, the results array should be:

results = [
    { date: '2023-11-01', type: 'A', casualties: 10, state: 'NY', country: 'USA' },
    { date: '2023-11-01', type: 'other', casualties: 17, state: 'NY', country: 'USA' },
    { date: '2023-11-02', type: 'A', casualties: 15, state: 'NY', country: 'USA' },
    { date: '2023-11-01', type: 'C', casualties: 20, state: 'NY', country: 'USA' },
    { date: '2023-11-02', type: 'other', casualties: 8, state: 'NY', country: 'USA' },
    { date: '2023-11-03', type: 'A', casualties: 25, state: 'NY', country: 'USA' },
];

and the otherCasualties array should be ['B', 'D'].

This is what I tried:

function processCasualties(data, casualtiesThreshold) {
  const results = [];
  const otherCasualties = [];

  const groupedData = data.reduce((acc, item) => {
    const key = item.type;
    if (!acc[key]) {
      acc[key] = [];
    }
    acc[key].push(item);
    return acc;
  }, {});

  const types = Object.keys(groupedData);

  types.forEach(type => {
    const typeData = groupedData[type];
    const totalCasualties = typeData.reduce((acc, item) => acc + item.casualties, 0);
    const allDatesBelowThreshold = typeData.every(item => item.casualties < casualtiesThreshold);

    if (totalCasualties >= casualtiesThreshold || (allDatesBelowThreshold && types.length === 1)) {
      results.push(...typeData.map(item => ({
        date: item.date,
        type: item.type,
        casualties: item.casualties,
        state: item.state,
        country: item.country,
      })));
    } else if (allDatesBelowThreshold) {
      otherCasualties.push(type);
    } else {
      results.push(...typeData.map(item => ({
        date: item.date,
        type: 'other',
        casualties: totalCasualties,
        state: item.state,
        country: item.country,
      })));
    }
  });

  return { results, otherCasualties };
}

const casualtiesThreshold = 13;

const { results, otherCasualties } = processCasualties(data, casualtiesThreshold);

console.log(results);
console.log(otherCasualties);


Solution

  • You could take two sets and, in a first iteration, collect the types that reach the threshold in one set and all other types in the other set. Then delete the matching types from the other set, because these types belong in the result.

    Finally, check whether only one type is left in the other set; if so, add this type to the matching set as well.

    All remaining non-matching types get 'other' as their type and are grouped by date.

    const
        fn = (data, threshold) => {
            const
                result = [],
                other = new Set,
                match = new Set,
                others = {};
            
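            // First pass: each row files its type under match or other (a type can land in both).
            for (const { type, casualties } of data) [other, match][+(casualties >= threshold)].add(type);
            
            // Remove every matched type from other.
            match.forEach(Set.prototype.delete, other);
            
            // A single remaining type is kept as is instead of becoming 'other'.
            if (other.size === 1) {
                match.add(...other);
                other.clear();
            }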
            
            for (const { date, type, casualties, state, country } of data) {
                if (match.has(type)) {
                    result.push({ date, type, casualties, state, country });
                    continue;
                }
    
                if (others[date]) others[date].casualties += casualties;
                else result.push(others[date] = { date, type: 'other', casualties, state, country });
            }
    
            return { result, otherCasualties: [...other] };
        },
        data = [{ date: '2023-11-01', type: 'A', casualties: 10, state: 'NY', country: 'USA', site: 'hash1' }, { date: '2023-11-01', type: 'B', casualties: 5, state: 'NY', country: 'USA', site: 'hash2' }, { date: '2023-11-02', type: 'A', casualties: 15, state: 'NY', country: 'USA', site: 'hash3' }, { date: '2023-11-01', type: 'C', casualties: 20, state: 'NY', country: 'USA', site: 'hash4' }, { date: '2023-11-02', type: 'B', casualties: 8, state: 'NY', country: 'USA', site: 'hash5' }, { date: '2023-11-03', type: 'A', casualties: 25, state: 'NY', country: 'USA', site: 'hash6' }, { date: '2023-11-01', type: 'D', casualties: 12, state: 'NY', country: 'USA', site: 'hash7' }];
        
    console.log(fn(data, 9));
    console.log(fn(data, 13));
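The two compact expressions in the snippet above may be easier to follow when unpacked. The sketch below spells them out with plain `if`/`else` and a `for...of` loop; the `rows` array is a trimmed-down copy of the data (an assumption made for brevity), and the threshold of 13 is taken from the second example.

```javascript
const threshold = 13;

// Only the fields needed for the classification step.
const rows = [
  { type: 'A', casualties: 10 },
  { type: 'B', casualties: 5 },
  { type: 'A', casualties: 15 },
  { type: 'C', casualties: 20 },
  { type: 'B', casualties: 8 },
  { type: 'A', casualties: 25 },
  { type: 'D', casualties: 12 },
];

const other = new Set();
const match = new Set();

for (const { type, casualties } of rows) {
  // [other, match][+(casualties >= threshold)] picks a set by index:
  // +(false) is 0 (other) and +(true) is 1 (match).
  if (casualties >= threshold) match.add(type);
  else other.add(type);
}

// At this point A sits in both sets, because it has rows on either side.

// match.forEach(Set.prototype.delete, other) passes `other` as `this`,
// so it effectively runs other.delete(type) for every matched type.
for (const type of match) other.delete(type);

console.log([...match]); // [ 'A', 'C' ]
console.log([...other]); // [ 'B', 'D' ]
```

The compact form saves a few lines, while the explicit form makes it clearer that a type only stays in `other` when none of its rows reaches the threshold.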