Tags: python, forms, pdf, pypdf, hebrew

NeedAppearances=pdfrw.PdfObject('true') forces manual pdf save in Acrobat Reader


We have a PDF form file, example.pdf, which has three columns (form fields):

name_1, company_1, and client_1

The data we fill in is in Hebrew as well as English. Our goal is a file that opens with right-to-left (RTL) text in both a browser and Acrobat Reader. The file exported by the code below meets that goal only after we manually save it in Acrobat Reader. We would like to avoid the manual save or, if there is no other option, perform the save programmatically.

import pdfrw


INVOICE_TEMPLATE_PATH = 'example.pdf'
INVOICE_OUTPUT_PATH = 'output.pdf'


ANNOT_KEY = '/Annots'
ANNOT_FIELD_KEY = '/T'
ANNOT_VAL_KEY = '/V'
ANNOT_RECT_KEY = '/Rect'
SUBTYPE_KEY = '/Subtype'
WIDGET_SUBTYPE_KEY = '/Widget'


def write_fillable_pdf(input_pdf_path, output_pdf_path, data_dict):
    """Fill the form fields on the first page of the template from data_dict."""
    template_pdf = pdfrw.PdfReader(input_pdf_path)
    # Ask the viewer to regenerate field appearances when the file is opened.
    template_pdf.Root.AcroForm.update(pdfrw.PdfDict(NeedAppearances=pdfrw.PdfObject('true')))
    annotations = template_pdf.pages[0][ANNOT_KEY]
    for annotation in annotations:
        if annotation[SUBTYPE_KEY] == WIDGET_SUBTYPE_KEY:
            if annotation[ANNOT_FIELD_KEY]:
                # Strip the surrounding parentheses of the PDF string: '(name_1)' -> 'name_1'.
                key = annotation[ANNOT_FIELD_KEY][1:-1]
                if key in data_dict:
                    # Ff=1 sets the ReadOnly field flag.
                    annotation.update(
                        pdfrw.PdfDict(AP=data_dict[key], V='{}'.format(data_dict[key]), Ff=1)
                    )
    pdfrw.PdfWriter().write(output_pdf_path, template_pdf)



data_dict = {
    'name_1': 'עידו',
    'company_1': 'IBM',
    'client_1': 'אסם'
}

if __name__ == '__main__':
    write_fillable_pdf(INVOICE_TEMPLATE_PATH, INVOICE_OUTPUT_PATH, data_dict)

We figured out that NeedAppearances is related to the need to save manually. When the exported file is opened in Acrobat Reader, Acrobat Reader performs some processing on it, which is why it asks on exit whether we would like to save the file. This processing is vital for us, but we need it to happen automatically.

What is this operation, and how can we perform it programmatically in our code, either before or after the export?


Solution

  • I faced the same issue with NeedAppearances set to true. The code below worked for my PDFs. Please let me know if it works for you.

    from pikepdf import Pdf
    
    with Pdf.open('source_pdf.pdf') as pdf:
        pdf.generate_appearance_streams()
        pdf.save('output.pdf')
    

    I think generate_appearance_streams() generates the appearance streams itself instead of leaving that to the PDF reader, so no manual save is required when the file is opened in Adobe Acrobat Reader.