json, jq

jq group by both outer and inner value within arrays


My data is in the following simplified format:

[
  {
    "a": "foo",
    "b": [
      {
        "x": 1,
        "y": true
      }
    ]
  },
  {
    "a": "foo",
    "b": [
      {
        "x": 1,
        "y": true
      },
      {
        "x": 99,
        "y": false
      }
    ]
  },
  {
    "a": "bar",
    "b": []
  }
]

I am trying to get the count of all entries in b for each unique a. I first tried grouping by a with jq '. | group_by(.a)[]', which at least gets me the "unique a" part. However, I can't figure out how to get the count of all b entries within the group_by result. Simply grouping by b as well, as in jq '. | group_by(.a)[] | group_by(.b)[]', doesn't work.

Alternatively, I tried jq '[ .[] | {a, n:(.b|length)} ] | group_by(.a)', which gives me the length of b per entry, but I'm still stuck on how to sum those counts across the entries in each group.

Any suggestions? The expected result for this example would be 3 b entries for foo and 0 b entries for bar.


Solution

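  • Here's a reduce-based approach that iterates over all items while adding up the corresponding array lengths. The pattern {$a,$b} destructures each item into the variables $a and $b, and the accumulator starts as an empty object, so the first += for a given key adds to null, which jq treats as the identity for addition: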

    reduce .[] as {$a,$b} ({}; .[$a] += ($b | length))

    Output for the sample data:

    {
      "foo": 3,
      "bar": 0
    }
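
  • For comparison, the group_by attempt from the question can also be carried through. The following is a minimal sketch rather than part of the original answer: it groups by .a, sums the lengths of .b within each group, and rebuilds an object with from_entries. The shell variable names input and result are only illustrative. Because group_by sorts the groups by key, the keys come out in sorted order (bar before foo) instead of input order.

```shell
# Sample input from the question, held in a shell variable (illustrative name).
input='[
  {"a": "foo", "b": [{"x": 1, "y": true}]},
  {"a": "foo", "b": [{"x": 1, "y": true}, {"x": 99, "y": false}]},
  {"a": "bar", "b": []}
]'

# 1. group_by(.a) yields one array of items per distinct value of .a.
# 2. For each group, take the shared .a as the key and the sum of the
#    .b lengths as the value.
# 3. from_entries turns the key/value pairs into a single object.
result=$(printf '%s' "$input" | jq -c '
  group_by(.a)
  | map({key: .[0].a, value: (map(.b | length) | add)})
  | from_entries
')

echo "$result"
# prints {"bar":0,"foo":3}
```

    The reduce version above does the same work in a single pass without sorting, so it is likely the better fit when the key order of the input should be preserved.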