python pdf adobe pdf-form pdfrw

How to fill PDF forms using Python


I have a PDF form created using Adobe LiveCycle Designer ES 10.4. I need to fill it using Python so that we can reduce manual labor. I searched the web and read some articles; most of them focused on the pdfrw library. I tried it and extracted some information from the PDF form, as shown below.

Code

from pdfrw import PdfReader
pdf = PdfReader('sample.pdf')
print(pdf.keys())
print(pdf.Info)
print(pdf.Root.keys())
print('PDF has {} pages'.format(len(pdf.pages)))

Output

['/Root', '/Info', '/ID', '/Size']
{'/CreationDate': "(D:20180822164509+05'30')", '/Creator': '(Adobe LiveCycle Designer ES 10.4)', '/ModDate': "(D:20180822165611+05'30')", '/Producer': '(Adobe XML Form Module Library)'}
['/AcroForm', '/MarkInfo', '/Metadata', '/Names', '/NeedsRendering', '/Pages', '/Perms', '/StructTreeRoot', '/Type']
PDF has 1 pages

I am not sure how to go further and use pdfrw to access the fillable fields of the PDF form and fill them using Python. Is this possible? Any suggestions would be helpful.


Solution

  • You can find the form fields here:

    pdf.Root.AcroForm.Fields
    

    or here

    pdf.Root.Pages.Kids[page_index].Annots
    

    This is a PdfArray object, which behaves basically like a list. The name of each field is found here:

    pdf.Root.AcroForm.Fields[field_index].T
    

    Other keys include the value, .V. There is also a bunch of display information, such as the font, under .AP.N.Resources.

    However, if you update the value of a field and write out the PDF file, a viewer might display the new value only when the field has focus, i.e. when it is clicked on.

    I haven't figured out how to fix that yet.