pdf, pdfbox, flatten-pdf

Flattening Form fields removes content


I am trying to flatten form fields (PDAcroForm.flatten()) in a PDF that was filled from an .xfdf file in the previous step. The expected result is that the editable boxes are replaced with just the text.

However, after flattening the PDF in which the form is filled in (output02.pdf), all the added text is completely gone, so I get blank spaces instead of the form values (output03.pdf).

I put a complete example on GitHub, containing the PDF files (input and generated output); here is just the flattening part:

// in Main.java, function flatten()
// needs: java.io.File, java.util.ArrayList, java.util.List, java.util.stream.Collectors,
//        org.apache.pdfbox.pdmodel.PDDocument,
//        org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm and PDField

// output02.pdf comes from the step before (merged & filled pdf files)
try (PDDocument pdfDocument = PDDocument.load(new File("output02.pdf"))) {
    PDAcroForm acroForm = pdfDocument.getDocumentCatalog().getAcroForm();

    // collect every field in the form, including nested ones
    List<PDField> fields = new ArrayList<>();
    for (PDField field : acroForm.getFieldTree()) {
        fields.add(field);
    }
    System.out.println("Flattening fields: " + fields.stream()
            .map(PDField::getFullyQualifiedName)
            .collect(Collectors.joining(", ", "[", "]")));

    // second argument: refresh the field appearances before flattening
    acroForm.flatten(fields, true);
    pdfDocument.save(new File("output03.pdf"));
}

[Image: preview of PDF result] The text filled in is gone, too.

Edit:
I created those form elements with Adobe Acrobat Pro 10.1.1 on existing PDFs, via the form menu, and simply saved the PDFs as sample5.pdf and test.pdf.


Solution

  • This is a bug that has been fixed since 2.0.5, two years ago. Due to that bug, the field values from the xfdf file were assigned as names instead of as strings in the /V entry (the value) of the field dictionary. Because of that, there is nothing to show in the appearance stream of the field, and so nothing remains after flattening.
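
    To make the name-versus-string distinction concrete: in PDF syntax a name object is written with a leading slash (/John), while a string is written in parentheses ((John)) or as hex in angle brackets (<4A6F686E>). The following is only a minimal sketch, independent of PDFBox, that classifies the raw token found after /V when looking at the file in a text editor. The class name ValueTokenCheck and the method classify are hypothetical names made up for this illustration, and it assumes the field dictionary is stored uncompressed so that the token is readable at all.

```java
public class ValueTokenCheck {

    /** Kinds of PDF object that a raw /V token can start. */
    enum Kind { NAME, STRING, OTHER }

    /**
     * Classifies the raw token that follows /V in a field dictionary,
     * looking only at its first characters as defined by PDF syntax.
     * Hypothetical helper for illustration; not part of PDFBox.
     */
    static Kind classify(String rawToken) {
        String t = rawToken.trim();
        if (t.startsWith("/")) {
            return Kind.NAME;      // name object, e.g. /John
        }
        if (t.startsWith("(")) {
            return Kind.STRING;    // literal string, e.g. (John)
        }
        if (t.startsWith("<") && !t.startsWith("<<")) {
            return Kind.STRING;    // hex string, e.g. <4A6F686E>
        }
        return Kind.OTHER;         // dictionary, array, number, reference, ...
    }

    public static void main(String[] args) {
        // A file hit by the bug would show something like "/V /John".
        System.out.println(classify("/John"));
        // A correctly filled text field would show "/V (John)".
        System.out.println(classify("(John)"));
    }
}
```

    If the /V entry of a text field in output02.pdf turns out to be a name, that would match the behaviour described above: the value is there, but not in a form the appearance stream is built from.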

    Always use the latest version of PDFBox. I use the Maven Versions plugin in all my projects.