I have several JSON files in a folder and its subfolders, and I want to print only certain fields from them. I use a for loop over all the JSON files, but to keep it simple, the input below represents 3 JSON files. For each one, I want to print "Filename" and "Value" for every block where "Dmo" is "Path". When a file doesn't contain any "Dmo" = "Path" block, only the filename should be printed.
{
  "Filename": "File_213",
  "Date": "2024-4-30",
  "Blocks": [
    {
      "Dmo": "WW",
      "Value": "23",
      "String": ""
    },
    {
      "Dmo": "Path",
      "Value": "/Files/2024/abd",
      "String": ""
    },
    {
      "Dmo": "Path",
      "Value": "/Files/2024/Ndew",
      "String": ""
    }
  ]
}
{
  "Filename": "File_4",
  "Date": "2024-4-30",
  "Blocks": [
    {
      "Dmo": "WW",
      "Value": "45",
      "String": ""
    }
  ]
}
{
  "Filename": "File_43",
  "Date": "2024-4-30",
  "Blocks": [
    {
      "Dmo": "Path",
      "Value": "/Files/2023/Roi2",
      "String": ""
    }
  ]
}
My current code and its output are shown below:
$ awk '/"Filename":/{fnm=$2}
/Dmo/{dmo=$2}
/Value/ {
val=$2;
if (dmo != "")
print fnm,val
else
print fnm
fnm=""; dmo="";val=""}' input
"File_213", "23",
"/Files/2024/abd",
"/Files/2024/Ndew",
"File_4", "45",
"File_43", "/Files/2023/Roi2",
My expected output is:
File_213, /Files/2024/abd
File_213, /Files/2024/Ndew
File_4
File_43, /Files/2023/Roi2
For structured data like JSON, you'd be better off using tools that are aware of that structure, such as the JSON processor jq. It can process multiple inputs in a single invocation, e.g. with this straightforward approach:
jq -r '
if any(.Blocks[]; .Dmo == "Path")
then .Filename + (.Blocks[] | ", " + select(.Dmo == "Path").Value)
else .Filename end
' input
- .Filename, .Blocks, .Dmo, and .Value access the respective field values; with [] attached, the expression iterates over the values of that array
- any produces true if at least one of the values provided matches a given condition, while select filters its input values according to a given condition
- The -r flag decodes the JSON string results into raw text (essentially, stripping the double quotes)

This produces:
File_213, /Files/2024/abd
File_213, /Files/2024/Ndew
File_4
File_43, /Files/2023/Roi2
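For cross-checking the filter's logic without jq, here is a minimal Python sketch of the same any/select idea. It assumes the input is a run of concatenated documents, each strictly valid JSON (for example, no trailing commas). The helper names iter_documents and format_lines and the shortened sample data are illustrative only, not part of any library or of the approach above.

```python
import json

def iter_documents(text):
    """Yield each value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # raw_decode does not skip leading whitespace itself
            continue
        doc, pos = decoder.raw_decode(text, pos)
        yield doc

def format_lines(doc):
    """Return 'Filename, Value' per Path block, or just the filename if none."""
    blocks = doc["Blocks"]
    # Mirrors: if any(.Blocks[]; .Dmo == "Path") then ... else .Filename end
    if any(b.get("Dmo") == "Path" for b in blocks):
        return [doc["Filename"] + ", " + b["Value"]
                for b in blocks if b.get("Dmo") == "Path"]
    return [doc["Filename"]]

# Shortened, assumed-valid version of the three sample documents.
sample = """
{"Filename": "File_213", "Blocks": [
  {"Dmo": "WW", "Value": "23"},
  {"Dmo": "Path", "Value": "/Files/2024/abd"},
  {"Dmo": "Path", "Value": "/Files/2024/Ndew"}]}
{"Filename": "File_4", "Blocks": [{"Dmo": "WW", "Value": "45"}]}
{"Filename": "File_43", "Blocks": [{"Dmo": "Path", "Value": "/Files/2023/Roi2"}]}
"""

for doc in iter_documents(sample):
    for line in format_lines(doc):
        print(line)
```

Run as is, this prints the same four lines as the jq filter does for the sample data.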
As an alternative, one could deduplicate some of the code by transforming the .Blocks array into one with pre-generated result values, and then test for its emptiness, using a variable for reference:
jq -r '
[.Blocks[] | ", " + select(.Dmo == "Path").Value] as $a
| .Filename + if $a == [] then "" else $a[] end
' input
Another option is to use the // operator, which supplies an alternative value when iterating over that pre-generated array produces no outputs. (This relies on the apparent secondary constraint that all values of .Value are actually strings, i.e. not false or null.)
jq -r '.Filename + (.Blocks | map(", " + select(.Dmo == "Path").Value)[] // "")' input
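To make the behaviour of // more concrete, here is a small Python sketch of how the operator treats the outputs of its left-hand side. It is a simplified model under stated assumptions: the function name alternative is hypothetical, None stands in for JSON null, and only a single fallback value is handled.

```python
def alternative(left_outputs, fallback):
    """Simplified model of jq's `a // b` with a single fallback value."""
    # Keep only the outputs that are neither false nor null (None here).
    kept = [v for v in left_outputs if v is not None and v is not False]
    # If nothing survives, the right-hand side takes over.
    return kept if kept else [fallback]

# Path blocks present: the pre-generated suffixes pass through unchanged.
print(alternative([", /Files/2024/abd", ", /Files/2024/Ndew"], ""))

# No Path blocks: iterating the empty array yields nothing, so "" is used.
print(alternative([], ""))

# Outputs that are false or null are discarded just like missing ones.
print(alternative([None, False], ""))
```

In this model, a false or null output on the left is indistinguishable from no output at all, which is why the one-liner is only safe when every selected value is a string.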