Import JSON from CSV, grouping by multiple fields


I would like to create a JSON array of nested objects from a CSV file, grouped by several fields. This is the CSV; I would like to group it by sid, year, and quarter (the first three fields):

S4446B3,2020,202001,2,345.45
S4446B3,2020,202001,4,24.44
S4446B3,2021,202102,5,314.55
S6506LK,2020,202002,3,376.55
S6506LK,2020,202003,3,76.23

After splitting the CSV with the following filter, I get one flat object per record:

split("\n") 
   | map(split(",")) 
   | .[0:] 
   | map({"sid" : .[0], "year" : .[1], "quarter" : .[2], "customer_type" : .[3], "obj" : .[4]})

But for each sid, I would like to get an array of nested objects like this:

[
    {
        "sid" : "S4446B3",
        "years" : [
            {
                "year" : 2020,
                "quarters" : [
                    {
                        "quarter" : 202001,
                        "customer_type" : [
                            {
                                "type" : 2,
                                "obj" : "345.45"
                            },
                            {
                                "type" : 4,
                                "obj" : "24.44"
                            }
                        ]
                    }
                ]
            },
            {
                "year" : 2021,
                "quarters" : [
                    {
                        "quarter" : 202102,
                        "customer_type" : [
                            {
                                "type" : 5,
                                "obj" : "314.55"
                            }
                        ]
                    }
                ]
            }
        ]
    },
    {
        "sid" : "S6506LK",
        "years" : [
            {
                "year" : 2020,
                "quarters" : [
                    {
                        "quarter" : 202002,
                        "customer_type" : [
                            {
                                "type" : 3,
                                "obj" : "376.55"
                            }
                        ]
                    },
                    {
                        "quarter" : 202003,
                        "customer_type" : [
                            {
                                "type" : 3,
                                "obj" : "76.23"
                            }
                        ]
                    }
                ]
            }
        ]
    }
]

Solution

  • It would be more intuitive if sid, year, quarter, etc. were key names. With the -R/--raw-input and -n/--null-input options on the command line, the following builds that structure:

    reduce (inputs / ",")
      as [$sid, $year, $quarter, $type, $obj]
    (.; .[$sid][$year][$quarter] += [{$type, $obj}])
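
To see what the reduce step produces on its own, here is a minimal sketch that runs it from a shell against the five sample rows. It assumes jq 1.5 or later is on the PATH; the shell variable names csv and nested are only for illustration.

```shell
# Sample rows from the question (variable names here are illustrative).
csv='S4446B3,2020,202001,2,345.45
S4446B3,2020,202001,4,24.44
S4446B3,2021,202102,5,314.55
S6506LK,2020,202002,3,376.55
S6506LK,2020,202003,3,76.23'

# -R reads each line as a raw string; -n stops jq from consuming the first
# line itself, so that `inputs` sees every row.
nested=$(printf '%s\n' "$csv" | jq -nR '
  reduce (inputs / ",") as [$sid, $year, $quarter, $type, $obj]
    (.; .[$sid][$year][$quarter] += [{$type, $obj}])
')

# List the sid/year/quarter key paths that the reduce created.
printf '%s\n' "$nested" | jq -r 'paths | select(length == 3) | join("/")'
```

Each printed line is one sid/year/quarter combination (for example S4446B3/2020/202001), and the value stored under it is the array of {type, obj} objects for that group.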
    

    To get the shape of your expected output, append these lines to the program above (year, quarter, and type stay strings unless you convert them with tonumber):

    | .[][] |= (to_entries | map({quarter: .key, customer_type: .value}))
    | .[]   |= (to_entries | map({year:    .key, quarters:      .value}))
    | .     |= (to_entries | map({sid:     .key, years:         .value}))
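
Putting both parts together, and assuming you want year, quarter, and type as numbers as in the expected output, the whole pipeline might look like the sketch below. The tonumber conversions and the shell variable names are additions for illustration and are not part of the program above; obj is left as a string to match the question. It assumes jq 1.5 or later.

```shell
# Sample rows from the question (variable names here are illustrative).
csv='S4446B3,2020,202001,2,345.45
S4446B3,2020,202001,4,24.44
S4446B3,2021,202102,5,314.55
S6506LK,2020,202002,3,376.55
S6506LK,2020,202003,3,76.23'

result=$(printf '%s\n' "$csv" | jq -nR '
  # Step 1: group rows into nested objects keyed by sid, year and quarter.
  reduce (inputs / ",") as [$sid, $year, $quarter, $type, $obj]
    (.; .[$sid][$year][$quarter] += [{type: ($type | tonumber), $obj}])
  # Step 2: turn each level of keys into an array of objects, innermost first.
  | .[][] |= (to_entries | map({quarter: (.key | tonumber), customer_type: .value}))
  | .[]   |= (to_entries | map({year:    (.key | tonumber), quarters:      .value}))
  | to_entries | map({sid: .key, years: .value})
')

printf '%s\n' "$result"
```

The conversion has to happen level by level because object keys in JSON are always strings; the numbers only reappear once each key has been moved into a value with to_entries.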