I am trying to load a PDF file from local storage and extract its content in React.js, without any backend.
I searched Google for suitable modules but haven't found one yet. There are many Node modules for parsing PDFs, and I can extract PDF content on the backend, but I am not sure whether they can be used in a web browser.
I tried this, and pdfjs-dist was no longer functional. Instead, a better alternative for extracting text from a PDF directly within React was react-pdftotext.
1. Install the library:
npm install react-pdftotext
2. Import the library:
import pdfToText from 'react-pdftotext'
3. Create an input field:
<input type="file" accept="application/pdf" onChange={extractText}/>
4. Prepare a function:
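function extractText(event) {
  // Take the first file chosen in the file input
  const file = event.target.files[0]
  pdfToText(file)
    .then(text => console.log(text))
    // Log the underlying error too, so failures can be diagnosed
    .catch(error => console.error("Failed to extract text from pdf", error))
}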
Finally, bringing it all together:
import pdfToText from 'react-pdftotext'

// Reads the selected PDF and logs its text content
function extractText(event) {
  const file = event.target.files[0]
  pdfToText(file)
    .then(text => console.log(text))
    .catch(error => console.error("Failed to extract text from pdf", error))
}

function PDFParserReact() {
  return (
    <div className="App">
      <header className="App-header">
        <input type="file" accept="application/pdf" onChange={extractText}/>
      </header>
    </div>
  );
}

export default PDFParserReact;
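One thing the handler above does not cover: the accept attribute is only a hint to the file picker, and event.target.files is empty if the user cancels the dialog. A minimal sketch of a guard follows. The names pickPdfFile and extractTextSafely are hypothetical helpers, not part of react-pdftotext, and the extractor is passed in as a parameter on the assumption that it returns a promise of the text, as pdfToText does in the code above.

```javascript
// Hypothetical helper: returns the first selected file if it looks like a PDF,
// otherwise null. Accepts a FileList or any array-like of { name, type } objects.
function pickPdfFile(files) {
  if (!files || files.length === 0) return null; // nothing selected (dialog cancelled)
  const file = files[0];
  const isPdf =
    file.type === 'application/pdf' ||
    (typeof file.name === 'string' && file.name.toLowerCase().endsWith('.pdf'));
  return isPdf ? file : null;
}

// Hypothetical wrapper: `extract` is whichever extractor you use
// (for example pdfToText) and is assumed to return a Promise of the text.
function extractTextSafely(event, extract) {
  const file = pickPdfFile(event.target.files);
  if (!file) return Promise.reject(new Error('No PDF file selected'));
  return extract(file);
}
```

In the component, this could be wired up as onChange={e => extractTextSafely(e, pdfToText).then(console.log).catch(console.error)}, so that a cancelled dialog or a non-PDF file ends up in the catch branch instead of being handed to the extractor.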
References: https://devnavigator.com/home/text-extraction-from-pdf-in-react-73a6519d-d8ab-4e52-8819-ff39bbb54f2a