node.js json moleculer oboe.js moleculer-web

What is the best approach to stream JSON from a REST API to an Express app?


I have a moleculer-based microservice with an endpoint that outputs a large JSON array (tens of thousands of objects).

The JSON is structured, and I know beforehand what it is going to look like:

[ // ... tens of thousands of these
  {
    "fileSize": 1155624,
    "name": "Gyo v1-001.jpg",
    "path": "./userdata/expanded/Gyo v01 (2003)"
  },
  {
    "fileSize": 308145,
    "name": "Gyo v1-002.jpg",
    "path": "./userdata/expanded/Gyo v01 (2003) (Digital)"
  }
  // ... tens of thousands of these
]

I researched JSON streaming and made some headway: I know how to consume a Node.js ReadableStream client-side, and I know I can use oboe to parse the JSON stream.

To that end, this is the code in my Express-based app:


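router.route("/getComicCovers").post(async (req: Request, res: Response) => {
  // Fall back to an empty object when no valid options are supplied.
  const extractionOptions =
    typeof req.body.extractionOptions === "object"
      ? req.body.extractionOptions
      : {};
  oboe({
    url: "http://localhost:3000/api/import/getComicCovers",
    method: "POST",
    body: {
      extractionOptions,
      walkedFolders: req.body.walkedFolders,
    },
  })
    .on("node", ".*", (data) => {
      console.log(data);
      // Forward each parsed node to the client as soon as it arrives.
      res.write(JSON.stringify(data));
    })
    // Close the response once the upstream JSON is fully parsed, or on failure.
    .on("done", () => res.end())
    .on("fail", () => res.end());
});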

This is the endpoint in moleculer:

getComicCovers: {
    rest: "POST /getComicCovers",
    params: {
        extractionOptions: "object",
        walkedFolders: "array",
    },
    async handler(
        ctx: Context<{
            extractionOptions: IExtractionOptions;
            walkedFolders: IFolderData[];
        }>
    ) {
        const comicBooksForImport = await getCovers(
            ctx.params.extractionOptions,
            ctx.params.walkedFolders
        );

        // comicBooksForImport is the aforementioned array of objects.
        // How do I stream it from here to the Express app object-by-object?
    },
},

My question is: How do I stream this gigantic JSON from the REST endpoint to the Express app so I can parse it on the client end?
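For illustration, one possible shape for the missing piece, sketched with only Node's stream module: serialize each element as one line of newline-delimited JSON and wrap the generator in a Readable. This assumes the handler may return a Readable that the gateway then pipes to the HTTP response, which should be checked against the moleculer-web documentation for the version in use. The names CoverInfo, toNdjsonLines and toNdjsonStream are hypothetical and not part of the code above.

```typescript
import { Readable } from "node:stream";

// Shape of one element, taken from the sample JSON in the question.
interface CoverInfo {
  fileSize: number;
  name: string;
  path: string;
}

// Hypothetical helper: yields one JSON document per line (NDJSON),
// so a consumer can parse each line independently as it arrives.
function* toNdjsonLines(items: Iterable<CoverInfo>): Generator<string> {
  for (const item of items) {
    yield JSON.stringify(item) + "\n";
  }
}

// Hypothetical helper: wraps the generator in a Readable that produces
// the lines lazily instead of building one huge string up front.
function toNdjsonStream(items: Iterable<CoverInfo>): Readable {
  return Readable.from(toNdjsonLines(items), { objectMode: false });
}
```

Under that assumption, the handler would end with something like `return toNdjsonStream(comicBooksForImport);`, and the consumer would split the body on newlines and call JSON.parse on each line.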

UPDATE

I went with a socket.io implementation per @JuanCaicedo's suggestion. I have it set up on both the server and the client.

However, I am having trouble with this piece of code:

map(walkedFolders, async (folder, idx) => {
    let foo = await extractArchive(extractionOptions, folder);

    let fo = new JsonStreamStringify({ foo });

    fo.pipe(res);
    if (+idx === walkedFolders.length - 1) {
        res.end();
    }
});

I get an Error [ERR_STREAM_WRITE_AFTER_END]: write after end. I understand that this happens because the response is terminated before the next iteration attempts to pipe the stream built from the updated value of foo into the response.

How do I get around this?
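A possible way around it, as a sketch using only Node's stream module: pipe() ends the destination by default when its source ends, so pipe with end: false, process the folders one after another, and call end() exactly once after the loop. Here jsonStreamFor is a hypothetical stand-in for the extractArchive plus JsonStreamStringify step, and dest plays the role of res.

```typescript
import { Readable, Writable } from "node:stream";
import { once } from "node:events";

// Hypothetical stand-in for extractArchive + JsonStreamStringify:
// returns a readable stream carrying one serialized result.
function jsonStreamFor(folder: string): Readable {
  return Readable.from([JSON.stringify({ folder }) + "\n"]);
}

// Pipes one source per folder into `dest`, strictly in sequence,
// and ends `dest` only after the last source has finished.
async function pipeSequentially(
  folders: string[],
  dest: Writable
): Promise<void> {
  for (const folder of folders) {
    const source = jsonStreamFor(folder);
    // end: false keeps `dest` open when this source finishes.
    source.pipe(dest, { end: false });
    // Wait for this source to finish before starting the next one.
    await once(source, "end");
  }
  dest.end();
}
```

The sequential loop trades concurrency for ordering. If the folders must be extracted in parallel, the same idea applies as long as end() is called only after every source has emitted "end".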


Solution

  • Are you asking for a general approach recommendation, or for support with the particular solution you have?

    If it's the former, then I think your best bet for communicating between the server and the client is through websockets, perhaps with something like Socket.io. A long-lived connection will serve you well here, since it will take a long time to transmit all your data.

    Then you can send data from the server to the client whenever you like. At that point, you can read your data on the server as a Node.js stream and emit the items one at a time.

    The problem with using Oboe and writing to the response on every node is that it requires a long-running response, and there is a high likelihood that the connection will be interrupted before all the data has been sent.
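A minimal sketch of the "emit one at a time" idea, under stated assumptions: a Node EventEmitter stands in for the socket so the example runs without extra packages (a Socket.io socket exposes an emit(eventName, ...args) method of the same shape), and the event names comicCover and comicCoversDone, the CoverInfo type, and emitOneAtATime are made up for illustration.

```typescript
import { EventEmitter } from "node:events";

// Shape of one element, taken from the sample JSON in the question.
interface CoverInfo {
  fileSize: number;
  name: string;
  path: string;
}

// Minimal surface shared by an EventEmitter and a Socket.io socket.
interface Emitter {
  emit(event: string, ...args: unknown[]): unknown;
}

// Sends every item as its own message, then a final completion message
// so the receiver knows the transfer is finished.
function emitOneAtATime(socket: Emitter, covers: CoverInfo[]): void {
  for (const cover of covers) {
    socket.emit("comicCover", cover);
  }
  socket.emit("comicCoversDone", { total: covers.length });
}

// Usage with a local EventEmitter standing in for the real socket.
const fakeSocket = new EventEmitter();
const received: CoverInfo[] = [];
let reportedTotal = -1;
fakeSocket.on("comicCover", (cover: CoverInfo) => received.push(cover));
fakeSocket.on("comicCoversDone", (info: { total: number }) => {
  reportedTotal = info.total;
});
emitOneAtATime(fakeSocket, [
  { fileSize: 1155624, name: "Gyo v1-001.jpg", path: "./userdata/expanded/Gyo v01 (2003)" },
  { fileSize: 308145, name: "Gyo v1-002.jpg", path: "./userdata/expanded/Gyo v01 (2003) (Digital)" },
]);
```

Because each item travels as its own message, the receiver can render results as they arrive, and the completion event replaces the need to detect the end of an HTTP response.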