jq

Using jq to merge arrays inside merged objects


There are multiple questions on this topic already, and they all look closely related. But even with those related examples, it's not clear how to modify the reduce statement from them to do what I want.

Problem: We've got a file containing multiple JSON object-blobs. Each top-level object has a single key, with an array as its value. Essentially it's

{
    "SomeCategory": [
        {
            "Key": "value1"    # very first entry
        },
        {
            "Key": "value2"
        },
        { ... },
        {
            "Key": "valueA"
        }
    ]
}
{
    "SomeCategory": [
        {
            "Key": "valueA+1"
        },
        {
            "Key": "valueA+2"
        },
        { ... },
        {
            "Key": "valueA+B"    # very last entry
        }
    ]
}
{
    ... repeat ad nauseam with different values ...
}

Goal: Here's what I'm hoping to end up with:

{
    "SomeCategory": [
        {
            "Key": "value1"    # very first entry
        },
        {
            "Key": "value2"
        },
        { ... },
        {
            "Key": "valueA+B"    # very last entry
        }
    ]
}

That is, all the top-level blobs have been merged into a single top-level blob, and the arrays contained in the blobs' single key are all merged.

Attempts: The linked examples all recommend things along the lines of

jq -n 'reduce inputs as $foo ({}; . *= $foo)'

which runs into the problem of "if that key refers to a scalar or array, then the later objects in the input will overwrite the value":

{
    "SomeCategory": [
        {
            "Key": "valueY+1"    
        },
        {
            "Key": "valueY+2"
        },
        { ... },
        {
            "Key": "valueY+Z"    # very last entry
        }
    ]
}

i.e., only the last top-level blob survives.
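
A minimal reproduction of the overwrite, assuming jq 1.5 or later and two made-up one-element objects (the values are illustrative, not taken from the real file):

```shell
# Recursive merge (*) only recurses into nested objects.
# When both sides hold an array under the same key, the right-hand array replaces the left-hand one.
result=$(jq -c -n '{"SomeCategory":[{"Key":"value1"}]} * {"SomeCategory":[{"Key":"value2"}]}')
echo "$result"
# prints {"SomeCategory":[{"Key":"value2"}]}
```

The entry with "value1" is gone, which is the same behaviour seen with the full file.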

Other things that might matter:

I'm guessing that some nested merge in the "UPDATE" clause in reduce EXP as $var (INIT; UPDATE) will do this, but I cannot figure out from the jq man page what that syntax is supposed to look like (all its examples are extremely contrived and simplistic). The instances of reduce that Google can find don't use any nested update pipelines, so perhaps it's not even supported and that idea was dumb. Normally the right answer is some form of "pipelined expressions in a single jq invocation", but we haven't figured out how to avoid replacing the array value on each update.


Solution

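  • You could use to_entries to destructure each input into its object key and the array value, then use regular addition. Since jq runs here without -n, the first object is read as . and serves as the initial value, while inputs yields the remaining objects: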

    jq 'reduce (inputs | to_entries)[] as {$key, $value} (.; .[$key] += $value)'
    

    Or, with just a single key, use the first item of the keys array:

    jq 'reduce inputs as $input (.; .[$input | keys[0]] += $input[])'
    

    Output:

    {
      "SomeCategory": [
        {
          "Key": "value1"
        },
        {
          "Key": "value2"
        },
        {
          "Key": "valueA"
        },
        {
          "Key": "valueA+1"
        },
        {
          "Key": "valueA+2"
        },
        {
          "Key": "valueA+B"
        }
      ]
    }
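
For a reproducible end-to-end check, here is a minimal sketch under a few assumptions: jq 1.5 or later, a temporary file holding three made-up objects (the values are hypothetical), and the -n flag with {} as the initial value instead of relying on the first input as the starting point. With -n, every object arrives through inputs, and the filter still produces {} rather than failing if the file is empty.

```shell
# Hypothetical sample data: three concatenated top-level objects in one file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{"SomeCategory": [{"Key": "value1"}, {"Key": "value2"}]}
{"SomeCategory": [{"Key": "value3"}]}
{"SomeCategory": [{"Key": "value4"}, {"Key": "value5"}]}
EOF

# -n stops jq from consuming the first object as `.`, so all objects come from `inputs`.
# Starting from {}, `.[$key] += $value` appends because null + array yields that array.
merged=$(jq -c -n 'reduce (inputs | to_entries[]) as {$key, $value} ({}; .[$key] += $value)' "$tmp")
echo "$merged"
rm -f "$tmp"
```

Because the update works per key, the same filter should also cope with objects that carry more than one top-level key, as long as every value is an array.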