java pdf itext pdfbox

How to remove the read-only mode and signatures of a digitally signed PDF with PDFBox


From time to time we need to remove the write protection ("encryption") and the digital signatures from our PDF documents so that a document can be changed and re-signed, e.g. because the original document is missing, or because it was changed and its digital signatures became invalid (e.g. this document).

For this, we used the following iText 8 code (admittedly, flattening the AcroForm is not the best way, e.g. because interactive forms become disabled):

public static byte[] cleanUpPdfItext(byte[] originalPdfData) throws Exception {
    // Read the PDF document
    try (
            ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
            PdfReader pdfReader = new PdfReader(new ByteArrayInputStream(originalPdfData)).setUnethicalReading(true);
            PdfWriter pdfWriter = new PdfWriter(byteArrayOutputStream);
            PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter)
    ) {
        // Create the signature utils
        SignatureUtil signatureUtil = new SignatureUtil(pdfDocument);

        // Check if encrypted and/or contains signatures
        boolean isEncrypted = pdfReader.isEncrypted();
        boolean hasSignatures = !signatureUtil.getSignatureNames().isEmpty();

        // Handle all cases
        if (isEncrypted && hasSignatures) { // Encrypted and signatures
            // Remove the signatures
            PdfAcroForm form = PdfAcroForm.getAcroForm(pdfDocument, true);
            form.flattenFields();

            // Write the changes to the output stream, so we can read them
            pdfDocument.close();

            // Get the manipulated document
            return byteArrayOutputStream.toByteArray();
        } else if (isEncrypted) { // Encrypted but no signatures
            // Write the changes to the output stream, so we can read them
            pdfDocument.close();

            // Get the manipulated document
            return byteArrayOutputStream.toByteArray();
        } else { // Not encrypted/no signatures
            // Return the original document data
            return originalPdfData;
        }
    }
}

Question: What is the equivalent code with PDFBox? It should remove the write protection ("encryption") and all existing signatures (this part is still missing), so the document can be edited and re-signed.

I came up with this initial version:

public static byte[] cleanUpPdfbox(byte[] original) throws Exception {
    try (
            ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
            PDDocument pdDocument = Loader.loadPDF(original)
    ) {
        // Check if encrypted or read only
        AccessPermission accessPermission = pdDocument.getCurrentAccessPermission();
        boolean isEncrypted = pdDocument.isEncrypted() || accessPermission.isReadOnly();

        // Check if signatures exist
        boolean hasSignatures = false;
        PDAcroForm pdAcroForm = pdDocument.getDocumentCatalog().getAcroForm();
        if (pdAcroForm != null) {
            // Get a list of all signature fields
            List<PDSignature> pdSignatures = pdDocument.getSignatureDictionaries();
            hasSignatures = !pdSignatures.isEmpty();
        }

        // Remove all security if required
        if (isEncrypted) {
            pdDocument.setAllSecurityToBeRemoved(true);
        }

        // Remove all signatures
        if (hasSignatures) {
            // TODO: Code in question
        }

        // Write the document
        pdDocument.save(byteArrayOutputStream);
        return byteArrayOutputStream.toByteArray();
    }
}

Solution

  • If the file has revisions (incremental updates), you could cut it off after the second-to-last %%EOF. However, this file doesn't use revisions. This solution removes the signature field from the fields array (in the hope that it is at the top level), removes its widget from the annotation array of the page, and removes the Perms entry from the document catalog.

        try (PDDocument doc = Loader.loadPDF(new File("Encrypted.pdf")))
        {
            PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();

            // Rebuild the top-level field list without the signature fields
            // (getAcroForm() returns null if the document has no form)
            if (acroForm != null)
            {
                List<PDField> newFieldList = new ArrayList<>();
                for (PDField field : acroForm.getFields())
                {
                    if (!(field instanceof PDSignatureField))
                    {
                        newFieldList.add(field);
                    }
                }
                acroForm.setFields(newFieldList);
            }
            for (PDPage page : doc.getPages())
            {
                List<PDAnnotation> oldAnnotationList = page.getAnnotations();
                List<PDAnnotation> newAnnotationList = new ArrayList<>();
                for (PDAnnotation ann : oldAnnotationList)
                {
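                    // Skip signature widgets only; widgets of other field types may carry a /V entry too
                    if (ann instanceof PDAnnotationWidget && COSName.SIG.equals(ann.getCOSObject().getCOSName(COSName.FT)))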
                    {
                        continue;
                    }
                    newAnnotationList.add(ann);
                }
                page.setAnnotations(newAnnotationList);
            }
    
            doc.setAllSecurityToBeRemoved(true);
            doc.getDocumentCatalog().getCOSObject().removeItem(COSName.PERMS);
            doc.save(new File("SO79055588-saved.pdf"));
        }
    

    A signature field might sit below the top level, although I can't remember ever having seen this. To handle that case, check whether a field is a PDNonTerminalField, call getChildren(), run the same for-loop on the children as on the top level, and do this recursively.
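
    The recursive variant described above could look roughly like the following sketch. It assumes PDFBox 3.x; SignatureFieldRemover, withoutSignatureFields and removeSignatureFields are hypothetical names, not part of PDFBox. The page annotations still have to be cleaned up separately, as in the code above.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
import org.apache.pdfbox.pdmodel.interactive.form.PDNonTerminalField;
import org.apache.pdfbox.pdmodel.interactive.form.PDSignatureField;

// Hypothetical helper (not a PDFBox class): drops signature fields at any depth.
final class SignatureFieldRemover
{
    // Returns the given fields without signature fields; descends into non-terminal fields.
    static List<PDField> withoutSignatureFields(List<PDField> fields)
    {
        List<PDField> kept = new ArrayList<>();
        for (PDField field : fields)
        {
            if (field instanceof PDSignatureField)
            {
                continue; // drop the signature field itself
            }
            if (field instanceof PDNonTerminalField)
            {
                // Same loop as on the top level, applied to the children
                PDNonTerminalField parent = (PDNonTerminalField) field;
                parent.setChildren(withoutSignatureFields(parent.getChildren()));
            }
            kept.add(field);
        }
        return kept;
    }

    // Replaces the top-level field list; does nothing if the document has no AcroForm.
    static void removeSignatureFields(PDAcroForm acroForm)
    {
        if (acroForm != null)
        {
            acroForm.setFields(withoutSignatureFields(acroForm.getFields()));
        }
    }
}
```

    With this helper, the first for-loop and the setFields call in the code above would be replaced by SignatureFieldRemover.removeSignatureFields(acroForm).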

    (Update) Alternative solution, which is the one I wrote first (see mkl's comment). Here the signatures are deactivated, but they will still appear in the list:

    try (PDDocument doc = Loader.loadPDF(new File("Encrypted.pdf")))
    {
        PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
            
        for (PDField field : acroForm.getFieldTree())
        {
            if (field instanceof PDSignatureField)
            {
                ((PDSignatureField) field).setValue((PDSignature) null);
                ((PDSignatureField) field).getWidgets().get(0).setAppearance(null);
                ((PDSignatureField) field).getWidgets().get(0).setRectangle(new PDRectangle());
            }
        }
        doc.setAllSecurityToBeRemoved(true);
        doc.getDocumentCatalog().getCOSObject().removeItem(COSName.PERMS);
        doc.save(new File("SO79055588-saved.pdf"));
    }
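
    For files that do use revisions, the cut-off after the second-to-last %%EOF mentioned at the start of this answer can be sketched without PDFBox. This is only a sketch under assumptions: the signature was added as the last incremental update, and no %%EOF byte sequence occurs inside a stream (the search works on raw bytes and cannot tell the difference). RevisionCutter and dropLastRevision are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: removes the last incremental update of a PDF byte array.
final class RevisionCutter
{
    private static final byte[] EOF_MARKER = "%%EOF".getBytes(StandardCharsets.US_ASCII);

    // Index of the last occurrence of marker that starts at or before fromIndex, or -1.
    static int lastIndexOf(byte[] data, byte[] marker, int fromIndex)
    {
        for (int i = Math.min(fromIndex, data.length - marker.length); i >= 0; i--)
        {
            boolean match = true;
            for (int j = 0; j < marker.length; j++)
            {
                if (data[i + j] != marker[j])
                {
                    match = false;
                    break;
                }
            }
            if (match)
            {
                return i;
            }
        }
        return -1;
    }

    // Keeps everything up to and including the second-to-last %%EOF (plus its line ending).
    // Returns the input unchanged if it contains fewer than two %%EOF markers.
    static byte[] dropLastRevision(byte[] pdf)
    {
        int last = lastIndexOf(pdf, EOF_MARKER, pdf.length);
        if (last < 0)
        {
            return pdf;
        }
        int previous = lastIndexOf(pdf, EOF_MARKER, last - 1);
        if (previous < 0)
        {
            return pdf; // only one revision, nothing to cut
        }
        int end = previous + EOF_MARKER.length;
        if (end < pdf.length && pdf[end] == '\r')
        {
            end++;
        }
        if (end < pdf.length && pdf[end] == '\n')
        {
            end++;
        }
        return Arrays.copyOf(pdf, end);
    }
}
```

    The result is the previous revision as it was before the last update was appended; whether that revision is still signed or encrypted depends on the file, so it should be checked again after cutting.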