I have a JSON file with several sub-levels that I need to parse with PDI. The tricky bit that is throwing me for a loop is the one to many relationship in the nested array.
Here is a sample of my JSON:
{
"@gdata.count": "139111",
"value": [
{
"InvStatus": {
"Description": "Active"
},
"DeviceAddresses": [],
"People": [],
"Id": "11",
"InvHostName": "NYCRMR-TWX9000",
"InvDomain": "blah.masked.com",
"InvSerialNumber": "86753091",
"InvDevType": "Workstation",
"InvRegion": "NA",
"InvLocation": "114AP2",
"InvNetwork": "WMG",
"InvBrand": "RMR",
"InvProtectionStatus": null,
"InvConversionStatus": null,
"InvDeviceDob": "2023-10-23"
}, {
"InvStatus": {
"Description": "Disconnected"
},
"DeviceAddresses": [],
"People": [],
"Id": "37",
"InvHostName": "NYCRMR-TWX9002",
"InvDomain": "blah.masked.com",
"InvSerialNumber": "86753092",
"InvDevType": "Workstation",
"InvRegion": "NA",
"InvLocation": "114AP4",
"InvNetwork": "WMG",
"InvBrand": "RMR",
"InvProtectionStatus": null,
"InvConversionStatus": null,
"InvDeviceDob": "2023-10-23"
}, {
"InvStatus": {
"Description": "Disconnected"
},
"DeviceAddresses": [],
"People": [{
"Role": {
"Description": "Asset Manager"
},
"Id": "65571",
"Email": "amy.sample@example.com"
}, {
"Role": {
"Description": "Primary Technical Contact"
},
"Id": "65477",
"Email": "barry.sample@example.com"
}, {
"Role": {
"Description": "Asset User"
},
"Id": "65478",
"Email": "cass.sample@example.com"
}, {
"Role": {
"Description": "Secondary Technical Contact"
},
"Id": "65479",
"Email": "david.sample@example.com"
}, {
"Role": {
"Description": "Secondary Technical Contact"
},
"Id": "65475",
"Email": "eric.sample@example.com"
}
],
"Id": "44",
"InvHostName": "-0cc05ac548317d30c",
"InvDomain": "WORKGROUP",
"InvSerialNumber": "n/a",
"InvDevType": "Server",
"InvRegion": "NA",
"InvLocation": "FREM45",
"InvNetwork": "WMB",
"InvBrand": "OKS",
"InvProtectionStatus": null,
"InvConversionStatus": null,
"InvDeviceDob": "2021-05-03"
}
]
}
I have attempted doing it all in one step and that only returned to me the first item in the nested results.
Next I attempted to do it in multiple steps where the first step retrieved the first level of fields, the 2nd the second level, and so on.
My first step I am retrieving these fields: PDI Screenshot 1
My second step I am retrieving these fields: PDI Screenshot 2
What ends up happening is the People fields from the 2nd step duplicate to all of the rows retrieved in the 1st step rather than just causing a duplication to the 3rd record from the 1st step.
Above JSON is just a small sample, my actual records that I'll be pulling back is over 100k and I only want to have the People records associated with the Devices.
I was able to figure this out and wanted to let anyone else looking for the solution know how I solved it.
The second methodology was the correct path, where I would parse out the first level of fields in the first step and then in the second step I would parse out the second level of fields.
The added twist I needed to do was pull out the first level's array of the secondary array into an additional field, in my example I called it PeopleData. This put just that small JSON array into its own field that was directly tied to that row's data and not every other row.
See screenshot:
Then in my next step I simply referenced that new field JSON and parsed that data out using [*] and because this was unique to that row only it kept the relationships to the top level intact.
See screenshot:
I hope this helps someone.