Tags: node.js, character-encoding, buffer, node.js-stream

How to detect encoding errors in a Node.js Buffer


I'm reading a file in Node.js, into a Buffer object, and I'm decoding the UTF-8 content of the Buffer using Buffer.toString('utf8'). If there are encoding errors, I want to report a failure.

The toString() method handles decoding errors by substituting the replacement character U+FFFD, which I can detect by searching the result. But U+FFFD is also a legal character in the input file, and I don't want to report an error if it was present and correctly encoded in the input.
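To make the ambiguity concrete, here is a minimal sketch. The byte values and variable names are illustrative assumptions, not taken from the actual input file. Both buffers decode to the same string, so searching the result cannot tell them apart.

```javascript
// 'a' followed by the correct three-byte UTF-8 encoding of U+FFFD (EF BF BD).
const wellFormed = Buffer.from([0x61, 0xef, 0xbf, 0xbd]);

// 'a' followed by 0xFF, a byte that can never appear in valid UTF-8.
const malformed = Buffer.from([0x61, 0xff]);

const decodedWellFormed = wellFormed.toString('utf8');
const decodedMalformed = malformed.toString('utf8');

// Both results contain U+FFFD at index 1, so they compare equal.
console.log(decodedWellFormed === decodedMalformed);
```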

Is there any way I can distinguish a Buffer that contains a legitimately encoded U+FFFD character from one that contains an encoding error?


Solution

  • The solution proposed by @eol in a comment on the question appears to meet the requirements.
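
The comment itself is not reproduced here, so the following is only a sketch of one approach that seems to meet the requirements, and not necessarily the one from that comment. It uses `TextDecoder` with the `fatal: true` option, which throws on malformed input instead of substituting U+FFFD. The helper names `decodeUtf8Strict` and `isValidUtf8` are hypothetical. Per the Node.js documentation, the `fatal` option is not supported when Node.js is built without ICU, so this assumes a standard build.

```javascript
// TextDecoder is exported from 'util' and is also a global in recent Node.js versions.
const { TextDecoder } = require('util');

// Hypothetical helper: decode a Buffer as UTF-8, throwing on any malformed sequence.
function decodeUtf8Strict(buffer) {
  // fatal: true makes decode() throw instead of emitting U+FFFD.
  const decoder = new TextDecoder('utf-8', { fatal: true });
  return decoder.decode(buffer);
}

// Hypothetical helper: report whether a Buffer holds well-formed UTF-8.
function isValidUtf8(buffer) {
  try {
    decodeUtf8Strict(buffer);
    return true;
  } catch (err) {
    // The strict decoder rejected the input.
    return false;
  }
}

// A correctly encoded U+FFFD (bytes EF BF BD) between 'a' and 'b'.
const legitimate = Buffer.from('a\uFFFDb', 'utf8');

// An invalid byte (0xFF) between 'a' and 'b'.
const corrupted = Buffer.from([0x61, 0xff, 0x62]);

console.log(isValidUtf8(legitimate));
console.log(isValidUtf8(corrupted));
```

With this approach the decoded string never needs to be searched for U+FFFD: a legitimately encoded replacement character decodes normally, while a real encoding error surfaces as an exception.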