bash awk

Print some values based on conditions from all json files


I have several JSON files in a folder and its subfolders, of which I want to print only 3 fields. I use a for loop over all the JSON files, but to keep it simple here, the input below represents 3 JSON files. For each file, I want to print the "Filename" and the "Value" of every block where "Dmo" is "Path". When a file doesn't contain any such "Dmo": "Path" block, only the filename should be printed.

{
  "Filename": "File_213",
  "Date": "2024-4-30",
  "Blocks": [
    {
      "Dmo": "WW",
      "Value": "23",
      "String": ""
    },
    {
      "Dmo": "Path",
      "Value": "/Files/2024/abd",
      "String": ""
    },
    {
      "Dmo": "Path",
      "Value": "/Files/2024/Ndew",
      "String": ""
    }
  ]
}

{
  "Filename": "File_4",
  "Date": "2024-4-30",
  "Blocks": [
    {
      "Dmo": "WW",
      "Value": "45",
      "String": ""
    }
  ]
}

{
  "Filename": "File_43",
  "Date": "2024-4-30",
  "Blocks": [
    {
      "Dmo": "Path",
      "Value": "/Files/2023/Roi2",
      "String": ""
    }
  ]
}

My current code and its output are shown below:

$ awk '/"Filename":/{fnm=$2}
    /Dmo/{dmo=$2}
        /Value/ {

        val=$2;

    if (dmo != "")
        print fnm,val
    else
        print fnm

    fnm=""; dmo="";val=""}' input

"File_213", "23",
 "/Files/2024/abd",
 "/Files/2024/Ndew",
"File_4", "45",
"File_43", "/Files/2023/Roi2",

My expected output is:

File_213, /Files/2024/abd
File_213, /Files/2024/Ndew
File_4
File_43, /Files/2023/Roi2

Solution

  • For structured data like JSON, you'd be better off using tools that are aware of that structure, such as the JSON processor jq. It can process multiple inputs in a single invocation, e.g. with this straightforward approach:

    jq -r '
      if any(.Blocks[]; .Dmo == "Path")
      then .Filename + (.Blocks[] | ", " + select(.Dmo == "Path").Value)
      else .Filename end
    ' input
    

    This produces

    File_213, /Files/2024/abd
    File_213, /Files/2024/Ndew
    File_4
    File_43, /Files/2023/Roi2
    

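Since the question mentions files spread over a folder and its subfolders, the same filter can be handed every file in one go instead of looping. A minimal sketch, assuming the files end in .json and that find and jq are installed; print_paths is a hypothetical helper name, not something from the question:

```shell
# Hypothetical helper: apply the filter to every *.json file below a directory.
# The directory defaults to the current one when no argument is given.
print_paths() {
  # "-exec ... {} +" passes many files to a single jq invocation
  find "${1:-.}" -type f -name '*.json' -exec jq -r '
    if any(.Blocks[]; .Dmo == "Path")
    then .Filename + (.Blocks[] | ", " + select(.Dmo == "Path").Value)
    else .Filename end
  ' {} +
}
```

Called as print_paths /some/folder, it prints the lines in the order find visits the files, which is not necessarily alphabetical; pipe the result through sort if a stable order matters.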

    As an alternative, one could deduplicate some of the code by first transforming the .Blocks array into an array of pre-generated result values, binding that array to a variable, and then testing the variable for emptiness:

    jq -r '
      [.Blocks[] | ", " + select(.Dmo == "Path").Value] as $a
      | .Filename + if $a == [] then "" else $a[] end
    ' input
    


    Another option is to use the // operator, which supplies an alternative value when iterating over that pre-generated array produces no outputs. (This relies on the apparent secondary constraint that all values of .Value are actually strings, i.e. not false or null.)

    jq -r '.Filename + (.Blocks | map(", " + select(.Dmo == "Path").Value)[] // "")'
    

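To see why the // variant falls back to an empty string for files without a matching block, it may help to look at the operator in isolation. The sketch below uses made-up values rather than the question's data, and alt_demo is a hypothetical helper name:

```shell
# Hypothetical helper: show the three cases of jq's alternative operator.
alt_demo() {
  # -n: run without input, -c: compact one-line output per result
  jq -c -n '
    [empty // "fallback"],          # left side yields nothing  -> fallback
    [("a", "b") // "fallback"],     # left side yields strings  -> kept as they are
    [(null, false) // "fallback"]   # only null/false outputs   -> fallback
  '
}
# alt_demo prints:
# ["fallback"]
# ["a","b"]
# ["fallback"]
```

The first case corresponds to a file with no "Path" block: the iteration over the empty array yields nothing, so "" is appended to the filename.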