json jq

Remove Duplicate Array Values


I am using the following jq query to extract the AWS ARN and its associated protocols. However, I only need the ARN to be listed once, followed by the ports and protocols.

My code is jq -r '.Listeners[] | (.LoadBalancerArn), (.Protocol)' and the results are:

"arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde"
"HTTP"
"arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde"
"HTTP"
"arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde"
"HTTPS"

I have tried everything, including unique, first, unique_by, select, contains, etc., and the result is always an error such as "Cannot iterate over string" or "Cannot iterate over number".

Desired results

"arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde"
"HTTP"
"HTTP"
"HTTPS"

Sample JSON

{
    "Listeners": [
        {        
            "LoadBalancerArn": "arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde",
            "Port": 9090,
            "Protocol": "HTTP"
        },
        {        
            "LoadBalancerArn": "arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde",
            "Port": 80,
            "Protocol": "HTTP"
        },
        {       
            "LoadBalancerArn": "arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde",
            "Port": 443,
            "Protocol": "HTTPS"
        }
    ]
}

Solution

  • Group the listeners by the common field with group_by and iterate over the resulting groups. For each group, output the common field from its first element (it is the same for the whole group), then iterate over the group again to output the other fields:

    jq -r '.Listeners | group_by(.LoadBalancerArn)[]
      | .[0].LoadBalancerArn, .[].Protocol'
    
    arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde
    HTTP
    HTTP
    HTTPS
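
The question also asks for the ports. As a minimal sketch of the same group_by approach, the second half of the filter can build one string per listener with jq string interpolation. The variable names json and out, and the "port protocol" line layout, are assumptions made for this illustration and are not part of the original answer:

```shell
# Sample JSON from the question, kept in a shell variable (hypothetical name).
json='{
  "Listeners": [
    {"LoadBalancerArn": "arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde", "Port": 9090, "Protocol": "HTTP"},
    {"LoadBalancerArn": "arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde", "Port": 80, "Protocol": "HTTP"},
    {"LoadBalancerArn": "arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde", "Port": 443, "Protocol": "HTTPS"}
  ]
}'

# Group by ARN, print the ARN once per group, then one "port protocol"
# line for every listener in that group.
out=$(printf '%s' "$json" | jq -r '.Listeners | group_by(.LoadBalancerArn)[]
  | .[0].LoadBalancerArn, (.[] | "\(.Port) \(.Protocol)")')

printf '%s\n' "$out"
```

With several load balancers in the input, each ARN would be printed once, followed by its own listeners, because group_by yields one array per distinct LoadBalancerArn.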
    
