If you have an invoice header with several values (invoice #, date, location) and an unknown amount of invoice lines with several values (product, price, tax), is there a way to flatten this data to one row that extends in cases where the quantity of invoice lines varies by invoice?
Input Example-
{"InvoiceRecords": [{
"InvoiceDate": "8/9/2017 12:00:00 AM",
"InvoiceLocation": "002",
"InvoiceNumber": "2004085",
"InvoiceRecordHeaderDetails": [{
"InvNum": "2004085",
"Location": "002",
"InvDate": "8/9/2017 12:00:00 AM"
}],
"InvoiceRecordLineItemDetails": [{
"UniqueID": "3939934",
"InvNum": "2004085",
"LINEITEM": "1",
"CUSTID": "PREAA",
"DEPTID": "320306",
"PRODID": "088856",
"ProdDesc": "STATE UST",
"Unitprice": "0.003",
"QuantShare": "237.5",
"TaxRate": "7.25",
"taxamount": "0.05"
}],
"InvoiceTaxCodeDetails": [{
"InvNum": "2004085",
"LineItem": "1",
"UniqueID": "34",
"taxCode": "SALES TAX",
"taxrate": "7.25",
"maxtax": "0"
}]
}]}
I need all the items on the same row (allowing for there to be more than one line item and/or more than one Tax Code items on a given Invoice Record.
Output Example (note: "_n" below in reference to undetermined amount of invoice lines and tax rows possible):
{"InvoiceRecords": [{
"InvoiceDate": "8/9/2017 12:00:00 AM",
"InvoiceLocation": "002",
"InvoiceNumber": "2004085",
"InvoiceRecordHeaderDetailsInvNum": "2004085",
"InvoiceRecordHeaderDetailsInvNumLocation": "002",
"InvoiceRecordHeaderDetailsInvNumInvDate": "8/9/2017 12:00:00 AM",
"InvoiceRecordLineItemDetailsUniqueID_1": "3939934",
"InvoiceRecordLineItemDetailsInvNum_1": "2004085",
"InvoiceRecordLineItemDetailsLINEITEM_1": "1",
"InvoiceRecordLineItemDetailsCUSTID_1": "PREAA",
"InvoiceRecordLineItemDetailsDEPTID_1": "320306",
"InvoiceRecordLineItemDetailsPRODID_1": "088856",
"InvoiceRecordLineItemDetailsProdDesc_1": "STATE UST",
"InvoiceRecordLineItemDetailsUnitprice_1": "0.003",
"InvoiceRecordLineItemDetailsQuantShare_1": "237.5",
"InvoiceRecordLineItemDetailsTaxRate_1": "7.25",
"InvoiceRecordLineItemDetailstaxamount_1": "0.05",
"InvoiceTaxCodeDetailsInvNum_1": "2004085",
"InvoiceTaxCodeDetailsLineItem_1": "1",
"InvoiceTaxCodeDetailsUniqueID_1": "34",
"InvoiceTaxCodeDetailstaxCode_1": "SALES TAX",
"InvoiceTaxCodeDetailstaxrate_1": "7.25",
"InvoiceTaxCodeDetailsmaxtax_1": "0",
"InvoiceRecordLineItemDetailsUniqueID_n": "3939934",
"InvoiceRecordLineItemDetailsInvNum_n": "2004085",
"InvoiceRecordLineItemDetailsLINEITEM_n": "1",
"InvoiceRecordLineItemDetailsCUSTID_n": "PREAA",
"InvoiceRecordLineItemDetailsDEPTID_n": "320306",
"InvoiceRecordLineItemDetailsPRODID_n": "088856",
"InvoiceRecordLineItemDetailsProdDesc_n": "STATE UST",
"InvoiceRecordLineItemDetailsUnitprice_n": "0.003",
"InvoiceRecordLineItemDetailsQuantShare_n": "237.5",
"InvoiceRecordLineItemDetailsTaxRate_n": "7.25",
"InvoiceRecordLineItemDetailstaxamount_n": "0.05",
"InvoiceTaxCodeDetailsInvNum_n": "2004085",
"InvoiceTaxCodeDetailsLineItem_n": "1",
"InvoiceTaxCodeDetailsUniqueID_n": "34",
"InvoiceTaxCodeDetailstaxCode_n": "SALES TAX",
"InvoiceTaxCodeDetailstaxrate_n": "7.25",
"InvoiceTaxCodeDetailsmaxtax_n": "0"
}]}
Thanks!
You have an example of a similar question in the samples
directory which sits nearby you spoon.bat. Have a look at the samples/transformation/XML Add
and survive the first choc: they do something much more complex, just to show all what is possible.
In your case, split with a Switch/Case
, the input stream in header, items and manage to keep the InvoiceNumber on each (more on this later). Convert the three stream into JSON (with JSON Output or, maybe easier, with a Javascript
). Then you Group by
the items by InvoiceNumber. Join the three flows by InvoiceNumber, for which I suggest a lookup stream
in the header stream then an other lookup stream
in the footer stream. With an other javascript
and treating the data as string, you can build the JSON row in the format { header, [item], footer}, which you can Group by
with a concatenation to have only one row.
Some work, but rather standard, except for the tricky part of the get the InvoiceNumber on the items and footer as they have disappeared from the flow. For that you can use fact that the javascript
preserve the values unless redefined. Add a new start script [right click on the Script1 on the tab top, add a copy, right click on the Script1_0 just created, and define it as Start script].
On this start script:
var PrevInvoiceNumber = -1;
On the main script:
if(InvoiceNumber && PrevInvoiceNumber!=InvoiceNumber)
PrevInvoiceNumber = InvoiceNumber
Then you should see the data with on each line the PrevInvoiceNumber which is equal to the expected InvoiceNumber of the invoice.