node.js, typescript, pdf.js, pdfjs-dist

pdf.getPage is not a function when trying to use pdf.js


I want to extract all the text from a given PDF file using Node and pdf.js, so I installed pdfjs-dist and tried this code:

import pdfjs from 'pdfjs-dist/build/pdf.js';
import pdfjsWorker from 'pdfjs-dist/build/pdf.worker.entry.js';

pdfjs.GlobalWorkerOptions.workerSrc = pdfjsWorker;

const pdf = await pdfjs.getDocument('testdoc.pdf');
const page = await pdf.getPage(1);

However, that gives me:

const page = await pdf.getPage(1);
                       ^

TypeError: pdf.getPage is not a function

Why is that, and how can I fix it?


Solution

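  • pdfjs.getDocument() returns a PDFDocumentLoadingTask, not a Promise, so awaiting it just hands back the task object, which has no getPage method. Append .promise to the .getDocument() call to get a Promise that resolves to the loaded document: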

    const pdf = await pdfjs.getDocument('testdoc.pdf').promise;
    const page = await pdf.getPage(1);
    

    Source: https://mozilla.github.io/pdf.js/examples/
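
Once the document loads, the original goal (extracting all text) could be sketched roughly as follows. The function name extractAllText is made up for illustration, and joining the text items with spaces is a simplifying assumption that ignores the page layout. The code relies only on numPages, getPage() and getTextContent() of the loaded document.

```javascript
// Sketch: collect the text of every page from an already-loaded document.
// `pdf` is assumed to be the object that `getDocument(...).promise` resolves to.
// `extractAllText` is a hypothetical helper name, not part of pdf.js.
async function extractAllText(pdf) {
  const pages = [];
  // pdf.js numbers pages starting from 1.
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const content = await page.getTextContent();
    // Keep only items that carry a string and join them with spaces
    // (a simplification: real layout and line breaks are not reconstructed).
    const pageText = content.items
      .filter((item) => typeof item.str === 'string')
      .map((item) => item.str)
      .join(' ');
    pages.push(pageText);
  }
  // One line of output per page.
  return pages.join('\n');
}
```

With the setup from the question, it would be called as `const text = await extractAllText(await pdfjs.getDocument('testdoc.pdf').promise);`.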