json talend ndjson

How to parse NDJSON


I'm working with Talend and need to convert NDJSON to other file formats. To do that I first have to read the NDJSON data, but I'm having trouble with that step. The NDJSON content looks like this:

{"text":"some test that should be parsed in operation down the line"}
{"text":"some other test that should be parsed in operation down the line"}

I tried the tFileInputJSON component, setting a loop node on the root of the file such as $[*] and then accessing text in the field definition, but nothing gets parsed.

How can I do this?


Solution

  • There are two issues with what you are trying to do. First, tFileInputJSON reads the file as a single JSON document, whereas NDJSON is made up of multiple JSON objects separated by a line break (\n). Second, given the example you've posted, your JSONPath expression is wrong: $[*] loops over the elements of an array at the root, but each line of your file is an object with text directly at its root.

    If you try it with tFileInputJSON, it will only read the first object (the first row).

    To read the file correctly, use tFileInputFullRow (or another component that splits the input on a line separator) so that each line arrives as one row, then pass those rows to a tExtractJSONFields to parse the JSON object on each line.

    With that setup you can read every JSON object in your NDJSON file.

    Hope this helps
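
For readers who want to see the same two-step idea outside Talend, here is a minimal Python sketch: split the input on line breaks first, then parse each line as its own JSON document. The function names parse_ndjson and parse_as_single_document are hypothetical, chosen only for illustration, and the sample data is the two lines from the question. Note one difference: Python's json module rejects the whole file outright when it is treated as one document, whereas tFileInputJSON is described above as reading only the first object.

```python
import json

# The two NDJSON lines from the question, joined by "\n" as they would be on disk.
SAMPLE = (
    '{"text":"some test that should be parsed in operation down the line"}\n'
    '{"text":"some other test that should be parsed in operation down the line"}\n'
)


def parse_ndjson(text):
    """Parse NDJSON: one independent JSON document per non-empty line."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Skip blank lines, such as the one after a trailing newline.
            continue
        records.append(json.loads(line))
    return records


def parse_as_single_document(text):
    """Try to parse the whole input as ONE JSON document.

    Returns None if the input is not a single valid document, which is
    the case for NDJSON with more than one line.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


if __name__ == "__main__":
    for record in parse_ndjson(SAMPLE):
        # Each record is a plain dict, so "text" is read directly from its root.
        print(record["text"])
```

The line split plays the role of tFileInputFullRow and the per-line json.loads call plays the role of tExtractJSONFields; in both cases the field text is looked up at the root of each line's object, not inside an array.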