jsoncsvmiller

How can I clean up empty fields when converting CSV to JSON with Miller?


I have several CSV files of item data for a game I'm messing around with that I need to convert to JSON for consumption. The data can be quite irregular with several empty fields per record, which makes for sort of ugly JSON output.

Example with dummy values:

Id,Name,Value,Type,Properties/1,Properties/2,Properties/3,Properties/4
01:Foo:13,Foo,13,ACME,CanExplode,IsRocket,,
02:Bar:42,Bar,42,,IsRocket,,,
03:Baz:37,Baz,37,BlackMesa,CanExplode,IsAlive,IsHungry,

Converted output:

[
{
  "Id": "01:Foo:13",
  "Name": "Foo",
  "Value": 13,
  "Type": "ACME",
  "Properties": ["CanExplode", "IsRocket", ""]
},
{
  "Id": "02:Bar:42",
  "Name": "Bar",
  "Value": 42,
  "Type": "",
  "Properties": ["IsRocket", "", ""]
},
{
  "Id": "03:Baz:37",
  "Name": "Baz",
  "Value": 37,
  "Type": "BlackMesa",
  "Properties": ["CanExplode", "IsAlive", "IsHungry"]
}
]

So far I've been quite successful with using Miller. I've managed to remove completely empty columns from the CSV as well as aggregate the Properties/X columns into a single array.

But now I'd like to do two more things to improve the output format to make consuming the JSON easier:

Desired output:

[
{
  "Id": "01:Foo:13",
  "Name": "Foo",
  "Value": 13,
  "Type": "ACME",
  "Properties": ["CanExplode", "IsRocket"]
},
{
  "Id": "02:Bar:42",
  "Name": "Bar",
  "Value": 42,
  "Type": null,
  "Properties": ["IsRocket"]
},
{
  "Id": "03:Baz:37",
  "Name": "Baz",
  "Value": 37,
  "Type": "BlackMesa",
  "Properties": ["CanExplode", "IsAlive", "IsHungry"]
}
]

Is there a way to achieve that with Miller?

My current commands are:


Solution

  • It's not probably the way you want to do it. I use also jq.

    Running

    mlr --c2j  --jflatsep '/' --jlistwrap remove-empty-columns then cat input.csv | \
    jq '.[].Properties|=map(select(length > 0))' | \
    jq '.[].Type|=(if . == "" then null else . end)'
    

    you will have

    [
      {
        "Id": "01:Foo:13",
        "Name": "Foo",
        "Value": 13,
        "Type": "ACME",
        "Properties": [
          "CanExplode",
          "IsRocket"
        ]
      },
      {
        "Id": "02:Bar:42",
        "Name": "Bar",
        "Value": 42,
        "Type": null,
        "Properties": [
          "IsRocket"
        ]
      },
      {
        "Id": "03:Baz:37",
        "Name": "Baz",
        "Value": 37,
        "Type": "BlackMesa",
        "Properties": [
          "CanExplode",
          "IsAlive",
          "IsHungry"
        ]
      }
    ]