I am trying to introduce the pdfjs-dist
library into my Node.js server. However, it's giving an import error:
Error [ERR_REQUIRE_ESM]: require() of ES Module C:\Users\zjric\auto-filing\node_modules\pdfjs-dist\build\pdf.mjs not supported.
Instead change the require of C:\Users\zjric\auto-filing\node_modules\pdfjs-dist\build\pdf.mjs to a dynamic import() which is available in all CommonJS modules.
at Object.<anonymous> (C:\Users\zjric\auto-filing\utils\pdf_to_text.js:1:18) {
code: 'ERR_REQUIRE_ESM'
}
I assume this has to do with my package.json
file and the differences between CommonJS (require) and ES modules (import).
const pdfjsLib = require('pdfjs-dist');
const getTextFromPDF = async (path) => {
  let doc = await pdfjsLib.getDocument(path).promise;
  let page1 = await doc.getPage(1);
  let content = await page1.getTextContent();
  return content.items.map(function (item) {
    return item.str;
  });
};
getTextFromPDF('./demo.pdf').then(data => console.log(data));
module.exports = { getTextFromPDF };
Changing my package.json
file to "type": "module"
isn't realistic, as all of my other infrastructure is already written in the require('module') style
and runs perfectly fine. I presume I need to modify the library itself, but I don't know how to do that.
I usually run into situations like this when working on an older project where I want to use newer modules that use ES6 imports/exports.
What I would do in this situation is check for a .default in the requiring file. So in your case it would be
const pdfjsLib = require('pdfjs-dist').default. If that doesn't work (and it can't help when require() itself throws ERR_REQUIRE_ESM, because the .default access is never reached), use the solution Node recommends, which is a dynamic import().
Here is an example:
const getTextFromPDF = async (path) => {
  // import() works inside CommonJS files and returns a promise for the ES module
  const pdfjsLib = await import('pdfjs-dist');
  let doc = await pdfjsLib.getDocument(path).promise;
  let page1 = await doc.getPage(1);
  let content = await page1.getTextContent();
  // Each text item exposes its string in .str
  return content.items.map(function (item) {
    return item.str;
  });
};
getTextFromPDF('./demo.pdf').then(data => console.log(data));
module.exports = {
  getTextFromPDF
};
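If several functions in the same file need the library, it may help to keep the dynamic import in one place and reuse the promise it returns. The sketch below is one possible way to do that, not the only one. lazyImport and loadPdfjs are hypothetical names introduced here for illustration, and the code assumes pdfjs-dist is installed. The pdf.js documentation also suggests the legacy build (pdfjs-dist/legacy/build/pdf.mjs) for Node.js, so that specifier may be worth trying if the default entry point complains about missing browser APIs.

```javascript
// Hypothetical helper: wrap a loader so the dynamic import() is started once
// and every later call reuses the same promise.
function lazyImport(loader) {
  let cached; // in-flight or settled promise from the first call
  return () => {
    if (!cached) {
      cached = loader();
    }
    return cached;
  };
}

// Assumption: 'pdfjs-dist' is installed and ships as an ES module.
// Nothing is imported until loadPdfjs() is first called.
const loadPdfjs = lazyImport(() => import('pdfjs-dist'));

const getTextFromPDF = async (path) => {
  const pdfjsLib = await loadPdfjs();
  const doc = await pdfjsLib.getDocument(path).promise;
  const page1 = await doc.getPage(1);
  const content = await page1.getTextContent();
  // Collect the raw string of every text item on the first page
  return content.items.map((item) => item.str);
};

// Still a plain CommonJS export, so existing require() callers keep working.
module.exports = { getTextFromPDF, lazyImport };
```

Other files can keep using const { getTextFromPDF } = require('./utils/pdf_to_text'); the only change for callers is that the library is loaded on the first call instead of at require time. Note that a failed import stays cached in this sketch, so a retry would need the cache to be cleared.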