From the Shopify API, I receive a link to a large JSONL file. Using Node.js, I need to read this data line by line, since loading it all at once would use too much memory. When I open the JSONL URL in a web browser, it automatically downloads the file to my downloads folder.
Example of JSONL:
{"id":"gid:\/\/shopify\/Customer\/6478758936817","firstName":"Joe"}
{"id":"gid:\/\/shopify\/Order\/5044232028401","name":"#1001","createdAt":"2022-09-16T16:30:50Z","__parentId":"gid:\/\/shopify\/Customer\/6478758936817"}
{"id":"gid:\/\/shopify\/Order\/5044244480241","name":"#1003","createdAt":"2022-09-16T16:37:27Z","__parentId":"gid:\/\/shopify\/Customer\/6478758936817"}
{"id":"gid:\/\/shopify\/Order\/5057425703153","name":"#1006","createdAt":"2022-09-27T17:24:39Z","__parentId":"gid:\/\/shopify\/Customer\/6478758936817"}
{"id":"gid:\/\/shopify\/Customer\/6478771093745","firstName":"John"}
{"id":"gid:\/\/shopify\/Customer\/6478771126513","firstName":"Jane"}
I'm unsure how to process this data in Node.js. Do I need to hit the URL, download all of the data into a temporary file, and then process it line by line? Or can I read the data line by line directly from the response (via some sort of stream?) and process it without storing a temporary file on the server?
(The JSONL comes from https://storage.googleapis.com/ if that helps.)
Thanks.
Using axios, you can set the response type to a stream, and then, with the built-in readline module, process your data line by line.
import axios from 'axios'
import { createInterface } from 'node:readline'

// Request the body as a stream instead of buffering the whole file in memory
const response = await axios.get('https://raw.githubusercontent.com/zaibacu/thesaurus/master/en_thesaurus.jsonl', {
  responseType: 'stream'
})

// readline splits the incoming stream into lines as chunks arrive
const rl = createInterface({
  input: response.data
})

for await (const line of rl) {
  // do something with the current line
  const { word, synonyms } = JSON.parse(line)
  console.log('word, synonyms: ', word, synonyms)
}
Testing this, there is barely any memory usage.
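If you want to try the line-by-line mechanics without making a network request, the same readline loop works over any Readable stream. Here's a minimal sketch using part of the sample JSONL from your question, with `Readable.from` standing in for the streamed HTTP response body (the customer/order split on `__parentId` is just an illustration of how you might process Shopify's parent/child rows):

```javascript
import { Readable } from 'node:stream'
import { createInterface } from 'node:readline'

// Sample JSONL from the question; Readable.from stands in here
// for the streamed HTTP response body (response.data with axios).
const jsonl = [
  '{"id":"gid://shopify/Customer/6478758936817","firstName":"Joe"}',
  '{"id":"gid://shopify/Order/5044232028401","name":"#1001","createdAt":"2022-09-16T16:30:50Z","__parentId":"gid://shopify/Customer/6478758936817"}',
  '{"id":"gid://shopify/Customer/6478771093745","firstName":"John"}'
].join('\n')

const rl = createInterface({ input: Readable.from(jsonl) })

const customers = []
const orders = []
for await (const line of rl) {
  const obj = JSON.parse(line)
  // In Shopify bulk-operation JSONL, child rows carry __parentId
  if (obj.__parentId) {
    orders.push(obj)
  } else {
    customers.push(obj)
  }
}

console.log(customers.length, orders.length) // 2 1
```

Only one line is ever held in memory at a time, which is why the memory footprint stays flat even for very large exports.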