json jq

Transform a tree-like JSON so that a single "item" in a list is moved up (from the leaf) and empty items are deleted, using jq


I have a JSON document that represents a tree of files and folders, like this:

{
    "item": [
        {
            "name": "objects",
            "description": "root f",
            "item": [
                {
                    "name": "external",
                    "item": [
                        {
                            "name": "buuu",
                            "keep1": {}
                        },
                        {
                            "name": "biii",
                            "keep1": {}
                        }
                    ],
                    "description": "Whatever comment."
                },
                {
                    "name": "afolder",
                    "item": [
                        {
                            "name": "methods",
                            "item": [
                                {
                                    "name": "blaaa",
                                    "keep1": {}
                                },
                                {
                                    "name": "empty",
                                    "description": "Whatever2.",
                                    "item": []
                                },
                                {
                                    "name": "partner1",
                                    "item": [
                                        {
                                            "name": "operand",
                                            "item": [
                                                {
                                                    "name": "whetever4",
                                                    "beep1": { },
                                                    "beep2": []
                                                }
                                            ]
                                        }
                                    ]
                                },
                                {
                                    "name": "empty2",
                                    "description": "Whatever3."
                                },
                                {
                                    "name": "partner2",
                                    "item": [
                                        {
                                            "name": "operandx",
                                            "beep2": []
                                        },
                                        {
                                            "name": "operand2",
                                            "item": [
                                                {
                                                    "name": "whatever3",
                                                    "beep1": { },
                                                    "beep2": []
                                                }
                                            ]
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}

The tree is formed by optional .item[] arrays together with .name and (optionally) .description fields. If an item has any other fields, it must not be deleted!

So there are items like

objects/   --> with the "root f" comment
  external --> an item that could be deleted because it has no extra fields
  external/buuu and external/biii  --> non-deletable items; do not move them up into the "objects" folder, as there are two items under "external". Only a single item in a folder is moved up!
objects/afolder
objects/afolder/methods
  empty and empty2 are items to be deleted (leaves that have only name, description, or item fields, so they are practically empty).
objects/afolder/methods/partner1
  operand  --> a folder to delete: "operand/whetever4" moves up under objects/afolder/methods/partner1, which leaves "operand" empty.
objects/afolder/methods/partner2
  operand2 --> must be deleted after its single leaf "whatever3" is moved up under objects/afolder/methods/partner2, ideally next to "operandx"

Therefore the jq filter must produce the reorganized JSON, so the output must look like this:

{
    "item": [
        {
            "name": "objects",
            "description": "root f",
            "item": [
                {
                    "name": "external",
                    "item": [
                        {
                            "name": "buuu",
                            "keep1": {}
                        },
                        {
                            "name": "biii",
                            "keep1": {}
                        }
                    ],
                    "description": "Whatever comment."
                },
                {
                    "name": "afolder",
                    "item": [
                        {
                            "name": "methods",
                            "item": [
                                {
                                    "name": "blaaa",
                                    "keep1": {}
                                },
                                {
                                    "name": "partner1",
                                    "item": [
                                        {
                                            "name": "whetever4",
                                            "beep1": { },
                                            "beep2": []
                                        }
                                    ]
                                },
                                {
                                    "name": "partner2",
                                    "item": [
                                        {
                                            "name": "operandx",
                                            "beep2": []
                                        },
                                        {
                                            "name": "whatever3",
                                            "beep1": { },
                                            "beep2": []
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}

Solution

  • Condense bottom-up by first recursively descending into each subordinate .item array, and then deleting and moving items from the already condensed .item arrays while going back up.

    candidate(m) succeeds if the context is an item of interest, i.e. it has no unknown keys and is the parent of exactly m child items. A missing .item key evaluates to null, which conveniently also has a length of 0. If the .item key is an array, condense is applied recursively to its items, and all candidates with no child items are then removed from the result. Finally, if the context is a candidate with exactly one child item, it is replaced with that child item.

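    # Emit the input only if it has no keys besides .item, .name, and
    # .description, and exactly m child items (a null .item has length 0).
    def candidate(m):
      select([del(.item, .name, .description), .item | length] == [0,m]);
    
    # Condense the children first, drop empty candidates, then replace a
    # single-child candidate with its only child.
    def condense:
      .item |= (arrays | map(condense) | . - map(candidate(0)))
      | candidate(1) = .item[0];
    
    condense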
    
    {
      "name": "objects",
      "description": "root f",
      "item": [
        {
          "name": "external",
          "item": [
            {
              "name": "buuu",
              "keep1": {}
            },
            {
              "name": "biii",
              "keep1": {}
            }
          ],
          "description": "Whatever comment."
        },
        {
          "name": "methods",
          "item": [
            {
              "name": "blaaa",
              "keep1": {}
            },
            {
              "name": "whetever4",
              "beep1": {},
              "beep2": []
            },
            {
              "name": "partner2",
              "item": [
                {
                  "name": "operandx",
                  "beep2": []
                },
                {
                  "name": "whatever3",
                  "beep1": {},
                  "beep2": []
                }
              ]
            }
          ]
        }
      ]
    }
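
    To see what candidate(m) tests in isolation, here is a minimal sketch that can be run with jq -n. The four sample items are made up for illustration (loosely modelled on the question's data), and the output keys delete and lift_child are hypothetical names, not part of the filter above.

    ```jq
    def candidate(m):
      select([del(.item, .name, .description), .item | length] == [0,m]);

    [
      {"name": "empty2", "description": "Whatever3."},   # no .item: null has length 0
      {"name": "empty", "item": []},                      # empty .item array
      {"name": "operand", "item": [{"name": "x", "beep2": []}]},  # exactly one child
      {"name": "blaaa", "keep1": {}}                      # extra key, never a candidate
    ]
    | map({
        name,
        delete:     ([candidate(0)] | length == 1),   # would be removed as an empty item
        lift_child: ([candidate(1)] | length == 1)    # would be replaced by its only child
      })

    # Result (compact form):
    # [{"name":"empty2","delete":true,"lift_child":false},
    #  {"name":"empty","delete":true,"lift_child":false},
    #  {"name":"operand","delete":false,"lift_child":true},
    #  {"name":"blaaa","delete":false,"lift_child":false}]
    ```

    Because del(.item, .name, .description) leaves only the unknown keys, an item such as blaaa with "keep1" yields a length of 1 in the first position and therefore matches for no value of m.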
    

    Note: In contrast to your expected output, this also condenses the paths / (the root), /objects/afolder, and /objects/afolder/methods/partner1, as each of them has (per your rules) only one item and no other keys. To prevent this from happening to the root as a special case, start by processing the root's items directly, i.e. .item[] |= condense.
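
    The root special case can be sketched on a miniature tree as follows. The input is a made-up, shortened variant of the question's data, and the sketch assumes a jq version in which .item |= (arrays | ...) leaves an object without an .item key unchanged, as the output above implies. Run it with jq -n.

    ```jq
    def candidate(m):
      select([del(.item, .name, .description), .item | length] == [0,m]);

    def condense:
      .item |= (arrays | map(condense) | . - map(candidate(0)))
      | candidate(1) = .item[0];

    # Hypothetical miniature tree: the root holds a single folder "objects".
    {"item": [
      {"name": "objects", "description": "root f", "item": [
        {"name": "blaaa", "keep1": {}},
        {"name": "empty2", "description": "Whatever3."},
        {"name": "operand2", "item": [
          {"name": "whatever3", "beep2": []}
        ]}
      ]}
    ]}
    # Condense each top-level item, but never the root object itself.
    | .item[] |= condense

    # Result (compact form):
    # {"item":[{"name":"objects","description":"root f","item":[
    #   {"name":"blaaa","keep1":{}},{"name":"whatever3","beep2":[]}]}]}
    ```

    Here empty2 is dropped, operand2 is replaced by its only child whatever3, and the root wrapper survives. With a plain condense as the last step, the root would instead be replaced by the "objects" item, because the root has exactly one child and no other keys.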