jsonpowershellcsvflattenobject-graph

Convert nested JSON array into separate columns in CSV file


I have a JSON file that looks like this:

{
    "id": 10011,
    "title": "Test procedure",
    "slug": "slug",
    "url": "http://test.test",
    "email": "test@test.com",
    "link": "http://test.er",
    "subject": "testing",
    "level": 1,
    "disciplines": [
      "discipline_a",
      "discipline_b",
      "discipline_c"
    ],
    "areas": [
      "area_a",
      "area_b"
    ]
  },

I was trying to use the following command to convert that into the CSV file:

(Get-Content "PATH_TO\test.json" -Raw | ConvertFrom-Json)| Convertto-CSV -NoTypeInformation | Set-Content "PATH_TO\test.csv"

However, for disciplines and areas I am getting System.Object[] in the resulting CSV file.

Is there a way to put all those nested values as a separate columns in CSV file like area_1, area_2 etc. And the same for disciplines.


Solution

  • The CSV conversion/export cmdlets have no way of "flattening" an object, and I may be missing something, but I know of no way to do this with a built-in cmdlet or feature. If you can guarantee that disciplines and areas will always have the same number of elements, you can trivialize it by using Select-Object with derived properties to do this:

    $properties=@('id','title','slug','url','email','link','subject','level',
        @{Name='discipline_1';Expression={$_.disciplines[0]}}
        @{Name='discipline_2';Expression={$_.disciplines[1]}}
        @{Name='discipline_3';Expression={$_.disciplines[2]}}
        @{Name='area_1';Expression={$_.areas[0]}}
        @{Name='area_2';Expression={$_.areas[1]}}
    )
    (Get-Content 'PATH_TO\test.json' -Raw | ConvertFrom-Json)| Select-Object -Property $properties | Export-CSV -NoTypeInformation -Path 'PATH_TO\test.csv'
    

    However, I am assuming that disciplines and areas will be variable length for each record. In that case, you will have to loop over the input and pull the highest count value for both disciplines and areas, then build the properties array dynamically:

    $inputData = Get-Content 'PATH_TO\test.json' -Raw | ConvertFrom-Json
    $counts = $inputData | Select-Object -Property     @{Name='disciplineCount';Expression={$_.disciplines.Count}},@{Name='areaCount';Expression={$_.areas.count}}
    $maxDisciplines = $counts | Measure-Object -Maximum -Property disciplineCount | Select-Object -ExpandProperty     Maximum
    $maxAreas = $counts | Measure-Object -Maximum -Property areaCount | Select-Object -ExpandProperty Maximum
    
    $properties=@('id','title','slug','url','email','link','subject','level')
    
    1..$maxDisciplines | % {
      $properties += @{Name="discipline_$_";Expression=[scriptblock]::create("`$_.disciplines[$($_ - 1)]")}
    }
    
    1..$maxAreas | % {
      $properties += @{Name="area_$_";Expression=[scriptblock]::create("`$_.areas[$($_ - 1)]")}
    }
    
    $inputData | Select-Object -Property $properties | Export-CSV -NoTypeInformation -Path 'PATH_TO\test.csv'
    

    This code hasn't been fully tested, so it may need some tweaking to work 100%, but I believe the ideas are solid =)