
Use PDF.JS to Fill out a PDF Form


I have what seems like it should be a simple problem: I would like to fill out PDF forms programmatically in JavaScript.

My first attempt was with pdf-lib, which has a nice API for filling out forms. However, the form I am trying to fill out has fields like this:

{
...
"employment.employment": [
  {
    id: '243R',
    value: 'Off',
    defaultValue: null,
    exportValues: 'YES',
    editable: true,
    name: 'employment',
    rect: [ 264.12, 529.496, 274.23, 539.604 ],
    hidden: false,
    actions: null,
    page: -1,
    strokeColor: null,
    fillColor: null,
    rotation: 0,
    type: 'checkbox'
  },
  {
    id: '244R',
    value: 'Off',
    defaultValue: null,
    exportValues: 'NO',
    editable: true,
    name: 'employment',
    rect: [ 307.971, 529.138, 318.081, 539.246 ],
    hidden: false,
    actions: null,
    page: -1,
    strokeColor: null,
    fillColor: null,
    rotation: 0,
    type: 'checkbox'
  }
]
}

which pdf-lib fails to parse properly. It will only allow me to set the value of 243R, treating 244R as if it doesn't exist (I assume because the names are not unique). That library also seems abandoned. C'est la vie.
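For reference, the two widgets in the dump above share the name `employment` and differ only in `id` and `exportValues`. A minimal sketch of telling them apart, assuming the object shape shown above; the helper name `findWidgetByExportValue` is hypothetical, and it only locates a widget, it does not modify the PDF:

```javascript
// Sample data copied from the field dump above (trimmed to the relevant keys).
const fields = {
  'employment.employment': [
    { id: '243R', value: 'Off', exportValues: 'YES', name: 'employment', type: 'checkbox' },
    { id: '244R', value: 'Off', exportValues: 'NO', name: 'employment', type: 'checkbox' },
  ],
}

// Hypothetical helper: pick one widget of a shared-name field by its export value.
function findWidgetByExportValue(fieldObjects, fullName, exportValue) {
  const widgets = fieldObjects[fullName] || []
  return widgets.find((w) => w.exportValues === exportValue)
}

const noBox = findWidgetByExportValue(fields, 'employment.employment', 'NO')
console.log(noBox.id) // 244R
```

Looking widgets up by export value rather than by array position avoids depending on the order in which they happen to be listed.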

Onward to pdf.js then. I can load the doc and set the value, but calling saveDocument or getData only returns the original, non-modified doc. How can I save the modified document?

const run = async () => {
  const loading = pdfjs.getDocument('form-cms1500.pdf')
  const pdf = await loading.promise
  const fields = await pdf.getFieldObjects()

  console.log(fields['employment.employment'] )
  fields['employment.employment'][0].value = 'On'
  console.log(fields['employment.employment'] )
  // saveDocument emits this warning:
  // Warning: saveDocument called while `annotationStorage` is empty, please use the getData-method instead.
  fs.writeFileSync('test.pdf', await pdf.saveDocument())

}

Solution

  • OK, I solved it! And by "I solved it" I mean I found a GitHub issue where someone had a very similar problem with pdf-lib, and adapted what I learned there to resolve mine. Essentially, you can grab the requisite field from the lower-level acroForm API.

    Ultimate solution:

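      const fs = require('fs')
      const { PDFDocument } = require('pdf-lib')
      const run = async () => {
        const bytes = fs.readFileSync('form-cms1500.pdf')
        const pdfDoc = await PDFDocument.load(bytes)
        const form = pdfDoc.getForm()
        // Drop down to the low-level acroForm API; each entry is a [field, ref] pair
        const allfields = form.acroForm.getAllFields()
        // Entry 32 is the checkbox to tick in this particular form
        allfields[32][0].setValue(allfields[32][0].getOnValue())
        const pdfBytes = await pdfDoc.save()
        fs.writeFileSync('test.pdf', pdfBytes)
      }
      run()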
    

    where field #32 (index 32 in the array returned by getAllFields()) is the checkbox I wanted to check. I figured out it was #32 just by printing the field names and their indexes.

    It turns out pdf.js is extremely unfriendly to updating fields, so the best bet is just not to use it for that.
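A rough sketch of the index lookup mentioned above (printing field names with their indexes). `listFieldIndexes` is a hypothetical helper, and the stub array with made-up names stands in for the `[field, ref]` pairs that `form.acroForm.getAllFields()` returns. It assumes each field exposes `getFullyQualifiedName()`, as pdf-lib's low-level field objects should:

```javascript
// Hypothetical helper: map [field, ref] pairs to { index, name } rows.
// Assumes each field has getFullyQualifiedName(); with pdf-lib you would
// pass in form.acroForm.getAllFields() instead of the stub below.
function listFieldIndexes(fieldPairs) {
  return fieldPairs.map(([field], index) => ({
    index,
    name: field.getFullyQualifiedName(),
  }))
}

// Stub pairs (made-up names) standing in for real fields so the sketch runs on its own.
const stubPairs = [
  [{ getFullyQualifiedName: () => 'example.first' }, null],
  [{ getFullyQualifiedName: () => 'employment.employment' }, null],
]

for (const { index, name } of listFieldIndexes(stubPairs)) {
  console.log(index, name)
}
```

Scanning that output for the field name gives the index to use in place of the hard-coded 32.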