Tags: node.js, encoding, utf-8, latin1

Converting a string from utf8 to latin1 in NodeJS


I'm using a Latin1-encoded database and can't change it to UTF-8, which means I run into issues with certain application data. I'm using Tesseract to OCR a document (Tesseract outputs UTF-8) and tried to use iconv-lite; however, it produces a buffer, and I then have to convert that buffer into a string. But, again, the buffer-to-string conversion does not seem to allow a "latin1" encoding.

I've read a bunch of questions and answers; however, all I find is advice about setting the client encoding and the like.

Any ideas?


Solution

  • Since Node.js v7.1.0, you can use the transcode function from the buffer module:
    https://nodejs.org/api/buffer.html#buffer_buffer_transcode_source_fromenc_toenc

    For example:

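    const buffer = require('buffer');
    // utf8String is an ordinary JavaScript string, e.g. the text returned by Tesseract.
    const latin1Buffer = buffer.transcode(Buffer.from(utf8String), "utf8", "latin1");
    // Decode the Latin-1 bytes back into a string; use latin1Buffer itself where raw bytes are needed.
    const latin1String = latin1Buffer.toString("latin1");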
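
To make the effect of the transcode call above visible, here is a minimal sketch that compares the UTF-8 and Latin-1 byte representations of a sample string. The helper name utf8ToLatin1Buffer and the sample text are made up for illustration; only transcode, Buffer.from, and toString come from Node.js itself. It assumes a CommonJS script running on a Node.js build with ICU support (the default).

```javascript
// Sketch: transcode a UTF-8 string to Latin-1 bytes with Node's buffer module.
const { transcode } = require('buffer');

// Hypothetical helper, not part of any library.
function utf8ToLatin1Buffer(text) {
  // Buffer.from(text, 'utf8') encodes the JavaScript string as UTF-8 bytes,
  // and transcode re-encodes those bytes as Latin-1.
  return transcode(Buffer.from(text, 'utf8'), 'utf8', 'latin1');
}

const sample = 'caf\u00e9'; // "café", standing in for OCR output
const utf8Bytes = Buffer.from(sample, 'utf8');
const latin1Bytes = utf8ToLatin1Buffer(sample);

console.log(utf8Bytes.length);            // 5: "é" takes two bytes in UTF-8
console.log(latin1Bytes.length);          // 4: "é" is a single byte in Latin-1
console.log(latin1Bytes.toString('hex')); // 636166e9

// Characters that Latin-1 cannot represent (such as the euro sign) are not
// preserved: the Node.js docs state that substitution characters are used.
// Print the result to inspect what the runtime produces.
console.log(utf8ToLatin1Buffer('\u20ac'));
```

Because of that substitution behaviour, it may be worth checking OCR output for characters outside Latin-1 before writing to the database. Whether to hand the driver latin1Bytes or a decoded string depends on the database client in use.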