jq, jsonlines, ndjson

Need to modify an existing jq filter for a GitHub CLI GraphQL response


I am creating a GitHub Actions workflow that sends a GraphQL request through the GitHub CLI. The gh api graphql call uses --paginate, so the response comes back as JSON Lines (ndjson).

I created the GraphQL and jq queries, and I am close to the desired output; however, my jq query needs to be modified and I can't figure out what to change.

First, here is the desired output format I want to achieve. Notice the single object that holds all the key-value lineage information.

[
  {
    KEY: VALUE,
    KEY: VALUE,
    ...
  }
]

And here is the format of the output that I am actually getting. Notice that every key-value pair is wrapped in its own object.

[
  {
    KEY: VALUE,
  },
  {
    KEY: VALUE,
  },
  ...
]

Here is my current jq query filter along with a snippet of the GraphQL response in jq play. It contains a snippet of 2 JSON Lines (jsonl, ndjson) entries (pretty printed). Search for data to see each individual response.

I need to --slurp/-s my jq query due to the paginated results.
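To illustrate why slurping matters for paginated output, here is a minimal sketch. The two input documents are hypothetical stand-ins for paginated responses, not real API data.

```shell
# Two hypothetical one-line documents, standing in for two paginated responses
pages='{"data":{"page":1}}
{"data":{"page":2}}'

# Without -s, the filter runs once per input document
per_page=$(printf '%s\n' "$pages" | jq -c '.data.page')

# With -s, all documents are first collected into one array
slurped=$(printf '%s\n' "$pages" | jq -c -s 'map(.data.page)')

echo "$per_page"
echo "$slurped"
```

With -s the filter sees a single array, which is why the query below starts with .[] to iterate over the pages.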

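I want to include only milestones which:

  • are 100% complete (progressPercentage == 100),
  • do not contain "withdrawn" in the title,
  • have at least one issue.

Also, if the milestone title contains either a space or a comma followed by a space (", "), then I need to split the title. Each part will be its own key, and all parts of one title share identical values.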
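The title splitting can be tried in isolation with the same gsub/split combination that the query below uses. This is only a sketch: the helper name title_keys is hypothetical, and the titles are taken from the sample data further down.

```shell
# Hypothetical helper: turn a milestone title into the list of keys it should produce
title_keys() {
  jq -c '.title | gsub(", "; " ") | split(" ")'
}

keys=$(printf '%s' '{"title":"C.1, EXAMPLE_SPLIT"}' | title_keys)
single=$(printf '%s' '{"title":"B.1.429"}' | title_keys)

echo "$keys"
echo "$single"
```

A title without any separator stays a one-element list, so every milestone yields at least one key.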

Here is my jq query that needs to be modified:

.[] | .data.repository as {
    nameWithOwner: $name, 
    milestones: { 
        nodes: $milestones
    }
}
| [
    foreach $milestones[] as $milestone (
        null; $milestone ; 
        $milestone
        | select($milestone.progressPercentage == 100)
        | select($milestone.title | contains("withdrawn") | not)
        | select($milestone.issues.nodes[])
        |
        {
            (($milestone.title | gsub(", "; " ") | split(" "))[]) : 
            [
                foreach $milestone.issues.nodes[] as $issue (
                    {}; . + { $issue };
                    $issue as $issue | $issue
                    | (reduce $issue.labels.nodes[] as $item ([]; . + [$item.name])) as $labels
                    |
                    {
                        repo: $name,
                        issue: $issue.number,
                        milestone: $milestone.number,
                        labels: $labels
                    }
                    
                )
            ]
        }
    )
]
| .

Here is a small JSON snippet which needs to be filtered by jq. It has 2 milestones but should produce 3 key-value pairs (keys: C.1, EXAMPLE_SPLIT, and B.1.429):

{
  "data": {
    "repository": {
      "nameWithOwner": "cov-lineages/pango-designation",
      "milestones": {
        "pageInfo": {
          "hasNextPage": true,
          "endCursor": "Y3Vyc29yOnYyOpHOAGviZA=="
        },
        "nodes": [
          {
            "number": 1,
            "title": "C.1, EXAMPLE_SPLIT",
            "progressPercentage": 100,
            "issues": {
              "nodes": [
                {
                  "number": 2,
                  "labels": {
                    "nodes": [
                      {
                        "name": "proposed"
                      },
                      {
                        "name": "designated"
                      }
                    ]
                  }
                }
              ]
            }
          },
          {
            "number": 2,
            "title": "B.1.429",
            "progressPercentage": 100,
            "issues": {
              "nodes": [
                {
                  "number": 3,
                  "labels": {
                    "nodes": [
                      {
                        "name": "proposed"
                      },
                      {
                        "name": "designated"
                      }
                    ]
                  }
                }
              ]
            }
          }
        ]
      }
    }
  }
}

Solution

  • Something like this?

    .data.repository
    | .nameWithOwner as $repo
    | .milestones.nodes
    | map( # create a new array containing the milestones
      # keep only the interesting milestone nodes
      select(.progressPercentage == 100)
      | select(.title | contains("withdrawn") | not)
      | select(.issues.nodes | length > 0)
      | {
        (.title | splits(",? ")): [ # one object per title part, each object containing an array
            { $repo, milestone: .number } # base milestone data, plus …
            + (.issues.nodes[] | {
              issue: .number,
              labels: [.labels.nodes[].name] # collect all label names in an array
            })
        ]
      })
    | add # merge all objects of the array into a single object
    

    This might not be as efficient as a reduce-based approach (it creates intermediate arrays), but it is easy to follow and divides into "logical" parts.
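For comparison, here is a minimal sketch of both styles on a trimmed, hypothetical milestone list (only title and number are kept). It assumes that a reduce which merges each single-key object into an accumulator is what "reduce-based" refers to.

```shell
# Hypothetical, trimmed milestone list: only the fields needed for the keys
milestones='[{"title":"C.1, EXAMPLE_SPLIT","number":1},{"title":"B.1.429","number":2}]'

# map + add: build an intermediate array of single-key objects, then merge it
via_add=$(printf '%s' "$milestones" |
  jq -c 'map({ (.title | splits(",? ")): .number }) | add')

# reduce: merge each single-key object into the accumulator as it is produced
via_reduce=$(printf '%s' "$milestones" |
  jq -c 'reduce (.[] | { (.title | splits(",? ")): .number }) as $o ({}; . + $o)')

echo "$via_add"
echo "$via_reduce"
```

Both variants rely on the key expression being a generator: a title with two parts produces two objects, one per part.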

    Run it with plain jq (no slurping, since the example has only a single top-level element).

    Output with the example data from the question:

    {
      "C.1": [
        {
          "repo": "cov-lineages/pango-designation",
          "milestone": 1,
          "issue": 2,
          "labels": [
            "proposed",
            "designated"
          ]
        }
      ],
      "EXAMPLE_SPLIT": [
        {
          "repo": "cov-lineages/pango-designation",
          "milestone": 1,
          "issue": 2,
          "labels": [
            "proposed",
            "designated"
          ]
        }
      ],
      "B.1.429": [
        {
          "repo": "cov-lineages/pango-designation",
          "milestone": 2,
          "issue": 3,
          "labels": [
            "proposed",
            "designated"
          ]
        }
      ]
    }
    

    If your input contains multiple objects and you want the final output to be a single object, use -s (--slurp) in combination with map(…). Each page then yields its own array of objects, so merge once per page and once more across pages:

    jq -s 'map( # -s reads everything as one big array, `map` transforms the elements of this array
      .data.repository
      | .nameWithOwner as $repo
      | .milestones.nodes
      | map( # create a new array containing the milestones
        # keep only the interesting milestone nodes
        select(.progressPercentage == 100)
        | select(.title | contains("withdrawn") | not)
        | select(.issues.nodes | length > 0)
        | {
          (.title | splits(",? ")): [ # one object per title part containing an array
              { $repo, milestone: .number } # base milestone data, plus …
              + (.issues.nodes[] | {
                issue: .number,
                labels: [.labels.nodes[].name] # collect all label names in an array
              })
          ]
        }
      )
      | add # merge the objects of this page into a single object
    )
    | add # merge the per-page objects into a single object
    '
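A quick way to check how several pages combine under -s is to feed jq two small documents. The sketch below uses hypothetical, heavily trimmed pages (repository name "o/r", no issues or filters) and a variant that streams the milestones with .[] inside the outer map, so that the outer map collects a flat array of single-key objects for a single add to merge.

```shell
# Two hypothetical, heavily trimmed pages in JSON Lines form (one document per line)
pages='{"data":{"repository":{"nameWithOwner":"o/r","milestones":{"nodes":[{"number":1,"title":"C.1, EXAMPLE_SPLIT"}]}}}}
{"data":{"repository":{"nameWithOwner":"o/r","milestones":{"nodes":[{"number":2,"title":"B.1.429"}]}}}}'

combined=$(printf '%s\n' "$pages" | jq -c -s '
  map(
    .data.repository
    | .nameWithOwner as $repo
    | .milestones.nodes[]        # stream the milestones of this page
    | { (.title | splits(",? ")): [ { $repo, milestone: .number } ] }
  )
  | add                          # merge all single-key objects into one object
')

echo "$combined"
```

The result is one object with the keys C.1, EXAMPLE_SPLIT and B.1.429, regardless of which page a milestone came from.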