I need to detect the encoding of a Node.js stream, and I am using the detect-character-encoding module for that. The problem is that the module only accepts a buffer, not a stream, so I have to do something like this:
FileStream.on('data', (chunk) => {
    console.log(chunk);
    const charsetMatch = detectCharacterEncoding(chunk);
    console.log(charsetMatch);
});
Detecting the encoding this way consumes a chunk of data that I need later in the code flow. Is there a way to peek at a chunk to learn its encoding without losing the chunk and its data?
You can build a promise to return both the contents and the charset of the stream:
const detectCharacterEncoding = require('detect-character-encoding');

// Buffers the whole stream in memory, then resolves with its content and detected charset.
const charsetStream = (stream) => new Promise((resolve, reject) => {
    const chunks = [];
    stream.on('data', (chunk) => {
        chunks.push(chunk);
    });
    stream.on('end', () => {
        const content = Buffer.concat(chunks);
        resolve({
            content,
            charset: detectCharacterEncoding(content)
        });
    });
    stream.on('error', reject);
});
charsetStream(FileStream)
    .then((info) => {
        console.log('content', info.content);
        console.log('charset', info.charset);
    })
    .catch(console.log);
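// You can also attach your own 'data' listener to FileStream outside the function,
// but do it right away: the stream's data can only be consumed once.
// This listener is separate from the one registered inside charsetStream,
// and both receive the same chunks.
FileStream.on('data', (chunk) => {
    console.log('FileStream', chunk.toString());
});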
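If buffering the whole stream is too expensive (large files, for example), another option is to peek at only the first chunk and push it back with `stream.unshift()`, so later consumers still see it. Below is a minimal sketch of that idea, not a definitive implementation. `peekCharset` is a hypothetical helper name, and the detector is passed in as a parameter; in the demo it is a stand-in function (`fakeDetect`) so the snippet runs without native modules. In real code you would presumably pass `detectCharacterEncoding` instead, and keep in mind that a guess based on one chunk may be less reliable than one based on the whole content.

```javascript
const { Readable } = require('stream');

// Hypothetical helper (not part of detect-character-encoding):
// reads the first available chunk, runs `detect` on it, then puts the
// chunk back at the front of the stream's buffer with unshift().
function peekCharset(stream, detect) {
  return new Promise((resolve, reject) => {
    function cleanup() {
      stream.removeListener('readable', onReadable);
      stream.removeListener('error', onError);
      stream.removeListener('end', onEnd);
    }
    function onError(err) {
      cleanup();
      reject(err);
    }
    function onEnd() {
      // The stream ended without producing any data, so there is nothing to detect.
      cleanup();
      resolve({ charset: null, stream });
    }
    function onReadable() {
      const chunk = stream.read();
      if (chunk === null) return; // nothing buffered yet, wait for the next 'readable'
      cleanup();
      stream.unshift(chunk); // give the chunk back so it is not lost
      resolve({ charset: detect(chunk), stream });
    }
    stream.on('readable', onReadable);
    stream.on('error', onError);
    stream.on('end', onEnd);
  });
}

// Demo with an in-memory stream and a stand-in detector (an assumption for
// illustration; swap in detectCharacterEncoding for real use).
const demo = Readable.from([Buffer.from('hello '), Buffer.from('world')]);
const fakeDetect = (buf) => ({ encoding: 'unknown', bytesInspected: buf.length });

peekCharset(demo, fakeDetect).then(({ charset, stream }) => {
  console.log('charset', charset);
  // The first chunk is still delivered here because it was unshifted.
  stream.on('data', (chunk) => console.log('chunk', chunk.toString()));
});
```

After the promise resolves, the stream can be consumed as usual (a `'data'` listener, `pipe()`, or `for await`), and the first chunk arrives along with the rest.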