jsonlinuxcsvjq

Flatten multi-level JSON to CSV output


I am working on a jq utility on Linux to process JSON file and convert it into a CSV format. I have the following JSON with multi-level hierarchy which I want to flatten into CSV:

{
  "totalJobCount": 13,
  "jobActivityLog": [
    {
      "taskId": "gNngdUm4p0WfBDKHJqQ2S8",
      "jobs": [
        {
          "runId": 63750,
          "startedBy": "svc-ctmagent",
          "startTime": "2025-05-04T19:03:48Z",
          "endTime": "2025-05-04T19:03:50Z",
          "status": "SUCCESS",
          "logLocation": "26.1.1/../data/taskLogs/2025-05-04/1000007899913.log",
          "messageText": "Job completed normally",
          "successFiles": 1,
          "failedFiles": 0,
          "fileDetails": [
            {
              "path": "/in/finance/ALL_GL_AHCS/2025/05/04/XlaTransaction_ALL_20250504T1902314697.zip",
              "size": 17172,
              "lastModified": "2025-05-04T19:02:32Z",
              "startTime": "2025-05-04T19:03:49Z",
              "status": "SUCCESS",
              "duration": 390,
              "transferDirection": "UPLOAD"
            }
          ]
        }
        ]
    }
    ]
}

I would like to see the CSV in the following output:

taskId, jobs.runId, jobs.startTime, jobs.endTime, jobs.status, jobs.fileDetails.path, jobs.fileDetails.transferDirection

How is it possible?


Solution

  • Descend into the respective parts of the document, and create an array of values, which can then be passed to @csv. Use the -r flag to strip the JSON encoding.

    .jobActivityLog[] | . as {$taskId} | .jobs[] | .fileDetails[] as $fd | [
      $taskId, .runId, .startTime, .endTime, .status,
      $fd.path, $fd.transferDirection
    ] | @csv
    
    "gNngdUm4p0WfBDKHJqQ2S8",63750,"2025-05-04T19:03:48Z","2025-05-04T19:03:50Z","SUCCESS","/in/finance/ALL_GL_AHCS/2025/05/04/XlaTransaction_ALL_20250504T1902314697.zip","UPLOAD"
    

    Demo