
PDFBox - Signing a PDF using an external service (document has been altered or corrupted)


I want to sign a document using an external API. You can send a hash to that service and get a signed hash back. You also get the complete certificate chain that the service used. (As you can see, I also add a visual signature. It works fine and can be ignored in this case; I only mention it for context because the visuals show up in the example PDF.)

After loads of trial and error, this is the best result I can produce, where the signature is only invalid because of "The document has been altered or corrupted since it was signed".

For further inspection, here is a link to a PDF signed by this code: https://pdfhost.io/v/5lM7xk4SX_testing

And here is the code responsible for signing:

  public void signPdf(String filePath, String outfilePath, Certificate[] chain, SignatureAppearance signatureAppearance) {
    try (
        OutputStream output = new FileOutputStream(outfilePath);
        PDDocument document = PDDocument.load(new File(filePath))
    ) {
      PDSignature signature = new PDSignature();
      signature.setFilter(PDSignature.FILTER_ADOBE_PPKLITE);
      signature.setSubFilter(PDSignature.SUBFILTER_ADBE_PKCS7_DETACHED);
      signature.setName(signatureAppearance.getSignerName());
      signature.setReason("Testing purposes");
      signature.setLocation("Test Location");
      signature.setSignDate(Calendar.getInstance());

      Rectangle2D humanRect = new Rectangle2D.Float(signatureAppearance.getSignatureXLocation(), signatureAppearance.getSignatureYLocation(), 200, 50);
      PDRectangle rect = createSignatureRectangle(document, humanRect);

      SignatureOptions options = new SignatureOptions();
      options.setPage(0);
      options.setVisualSignature(createVisualSignatureTemplate(document, 0, rect, signature, signatureAppearance));
      document.addSignature(signature, null, options);

      CMSSignedDataGenerator gen = new CMSSignedDataGenerator();
      Store certStore = new JcaCertStore(Arrays.stream(chain).toList());
      X509Certificate signerCert = (X509Certificate) chain[0];
      ExternalSigningSupport externalSigning = document.saveIncrementalForExternalSigning(output);
      byte[] pdfHash = new SHA256.Digest().digest(externalSigning.getContent().readAllBytes());

      ContentSigner contentSigner = new ContentSigner() {
        private final OutputStream stream = OutputStreamFactory.createStream(new SHA256.Digest());

        @Override
        public byte[] getSignature() {
          try {
            return webAPIService.getSignedHash(bytesToHex(pdfHash)).getBytes();
          } catch (Exception e) {
            throw new RuntimeException("Exception while signing", e);
          }
        }

        @Override
        public OutputStream getOutputStream() {
          return stream;
        }

        @Override
        public AlgorithmIdentifier getAlgorithmIdentifier() {
          return new AlgorithmIdentifier(PKCSObjectIdentifiers.sha256WithRSAEncryption);
        }
      };

      gen.addCertificates(certStore);
      gen.addSignerInfoGenerator(
          new JcaSignerInfoGeneratorBuilder(
              new JcaDigestCalculatorProviderBuilder().setProvider(new BouncyCastleProvider()).build())
              .build(contentSigner, signerCert));

      CMSTypedData msg = new CMSProcessableByteArray(pdfHash);
      CMSSignedData signedData = gen.generate(msg, false);

      byte[] cmsSignature = signedData.getEncoded();
      externalSigning.setSignature(cmsSignature);
    } catch (CertificateEncodingException e) {
      throw new RuntimeException(e);
    } catch (IOException e) {
      throw new RuntimeException(e);
    } catch (OperatorCreationException e) {
      throw new RuntimeException(e);
    } catch (CMSException e) {
      throw new RuntimeException(e);
    }
  }

  public String bytesToHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

I'm guessing that there is something wrong with how I handle the contentSigner, especially with how the hash of the PDF is passed into it, so that the hash of the PDF does not match the signed hash, or something similar. But I cannot find the proper way to do it.

I also tried to follow the example here (http://www.java2s.com/example/java-api/org/apache/pdfbox/pdmodel/interactive/digitalsignature/externalsigningsupport/getcontent-0-0.html), where only the ExternalSigningSupport is used. There I replaced the sign call with my getSignedHash. But then I wouldn't even get a valid signers entity. I guess the sign method returns more than just a signed hash.


Solution

  • Fixing the code

    In the comments to the question, example files signed by different versions of the code were analyzed, and several errors in the signing code were identified and fixed, resulting in a valid signature.

    The starting point was not the code in the question, though; by then it had changed to the first revision of the code in this gist.

    Processing the signature in the signing API response

    In the first example file one issue leapt to the eye: the signer certificate has an RSA/3072 public key, so a signature value should be 384 bytes long, but the signature value had a length of 512 bytes, i.e. 4096 bits. This obviously doesn't match.

    One possible cause was that the signing API actually signed with a private key that does not correspond to the alleged signer certificate. Another was that the value returned by webAPIService.getSignedHash is not yet the plain signature bytes but has to be processed somehow first.

    As it turned out, the latter was the cause here: the value returned by webAPIService.getSignedHash is actually the Base64-encoded signature, so it needed to be Base64-decoded before use. Base64 encoding enlarges the data by a 4:3 ratio (384 bytes become 512 characters), so after decoding, the size of the signature matched the size of the key.

    So in

    public byte[] getSignature() {
      try {
        java.util.Base64.Encoder encoder = java.util.Base64.getEncoder();
        String hash = encoder.encodeToString(stream.toByteArray());
        return webAPIService.getSignedHash(hash).getBytes();
      } catch (Exception e) {
        throw new RuntimeException("Exception while signing", e);
      }
    }
    

    the return line was replaced by

        return java.util.Base64.getDecoder().decode(webAPIService.getSignedHash(hash));
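
The size mismatch can be reproduced without the signing service. The following is a minimal sketch using only the JDK; the class and method names are hypothetical, and the all-zero byte array merely stands in for a real RSA/3072 signature value.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class, not part of the original signing code.
final class Base64SignatureSizeDemo {

    /** Length in bytes of an RSA signature value for a modulus of the given size. */
    static int rsaSignatureLength(int keyBits) {
        return (keyBits + 7) / 8;
    }

    /** Mimics an API that returns the signature value as Base64 text (assumption). */
    static String asBase64Text(byte[] rawSignature) {
        return Base64.getEncoder().encodeToString(rawSignature);
    }

    public static void main(String[] args) {
        // Stand-in for the raw signature of an RSA/3072 key: 384 arbitrary bytes.
        byte[] rawSignature = new byte[rsaSignatureLength(3072)];
        String apiResponse = asBase64Text(rawSignature);

        // Bug: using the bytes of the Base64 text as the signature value.
        byte[] withoutDecoding = apiResponse.getBytes(StandardCharsets.US_ASCII);
        // Fix: decode the Base64 text first.
        byte[] decoded = Base64.getDecoder().decode(apiResponse);

        System.out.println(withoutDecoding.length + " bytes without decoding"); // 512
        System.out.println(decoded.length + " bytes after decoding");           // 384
    }
}
```

Comparing the length of the signature value against the modulus size of the signer key is a cheap first check whenever a freshly integrated signing API produces invalid signatures.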
    

    Now the signature value could be successfully decrypted using the public key in the signer certificate, and the result could be successfully parsed; both are checks one can make for RSASSA-PKCS1-v1_5 signatures. This showed that the signing API response was now processed properly.
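
The check described here can be approximated with plain JDK classes. This is a sketch under assumptions, not the tooling actually used in the analysis: the class and helper names are made up, a throwaway RSA/2048 key pair replaces the signer certificate and the signing API, and it relies on the default JDK provider accepting a public key in Cipher.DECRYPT_MODE for "RSA/ECB/PKCS1Padding".

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Arrays;
import javax.crypto.Cipher;

// Hypothetical helper for inspecting RSASSA-PKCS1-v1_5 signature values.
final class RsaSignatureInspector {

    /** Applies the RSA public-key operation and strips the PKCS#1 v1.5 padding. */
    static byte[] recoverDigestInfo(byte[] signatureValue, PublicKey publicKey) {
        try {
            Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
            rsa.init(Cipher.DECRYPT_MODE, publicKey);
            return rsa.doFinal(signatureValue);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Signature value does not fit key or padding", e);
        }
    }

    /** Throwaway key pair standing in for the signer's key (assumption: RSA/2048). */
    static KeyPair newDemoKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Local stand-in for the remote signing service. */
    static byte[] signForDemo(KeyPair keyPair, byte[] data) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(keyPair.getPrivate());
            signer.update(data);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair keyPair = newDemoKeyPair();
        byte[] data = "stand-in for signed attributes".getBytes(StandardCharsets.UTF_8);
        byte[] digestInfo = recoverDigestInfo(signForDemo(keyPair, data), keyPair.getPublic());

        // The hash is the trailing 32 bytes of the recovered DigestInfo structure.
        byte[] embedded = Arrays.copyOfRange(digestInfo, digestInfo.length - 32, digestInfo.length);
        System.out.println("DigestInfo length: " + digestInfo.length);
        System.out.println("hash matches: " + Arrays.equals(embedded, sha256(data)));
    }
}
```

For a correct SHA-256 signature the recovered DigestInfo is 51 bytes long: a 19-byte ASN.1 header followed by the 32-byte hash. If the recovered structure is longer or its hash part is not 32 bytes, the wrong data was handed to the signer.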

    Sending the data to sign in the signing API request

    The next apparent issue was that the alleged hash value in the decrypted signature value was 151 bytes long. This is much too large: for SHA256 (as claimed by the code) the hash should be 32 bytes long.

    Comparing them to the rest of the signature, though, one could recognize those bytes: they were the signed attributes themselves, not their hash as it should have been! Apparently the signing API expects the hash to sign as the argument of webAPIService.getSignedHash, not the original data.

    Thus, the getSignature method was further changed to hash the signed attributes and send the hash to the API:

    public byte[] getSignature() {
      try {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
    
        byte[] hashBytes = digest.digest(stream.toByteArray()); // hash the signed attributes written to the stream
        String hash = Base64.getEncoder().encodeToString(hashBytes); // Base64 encoding
    
        return java.util.Base64.getDecoder().decode(webAPIService.getSignedHash(hash));
      } catch (Exception e) {
        throw new RuntimeException("Exception while signing", e);
      }
    }
    

    With this change in place, the signature bytes correctly signed the signed attributes.
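
The flow of the fixed getSignature method can be tried in isolation. The sketch below uses only the JDK and is not the code from the gist: HashThenSignBuffer is a hypothetical class that mirrors the two relevant ContentSigner methods, and a Function<String, String> stands in for webAPIService.getSignedHash, assuming that the API takes a Base64 hash and returns a Base64 signature.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.function.Function;

// Hypothetical stand-in for the ContentSigner used in the answer.
final class HashThenSignBuffer {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final Function<String, String> remoteSigner;

    /** remoteSigner: Base64 hash in, Base64 signature out (assumed API contract). */
    HashThenSignBuffer(Function<String, String> remoteSigner) {
        this.remoteSigner = remoteSigner;
    }

    /** The CMS generator would write the signed attributes into this stream. */
    ByteArrayOutputStream getOutputStream() {
        return buffer;
    }

    /** Hashes the buffered bytes, sends only the hash, and decodes the response. */
    byte[] getSignature() {
        byte[] hash = sha256(buffer.toByteArray());
        String request = Base64.getEncoder().encodeToString(hash);
        return Base64.getDecoder().decode(remoteSigner.apply(request));
    }

    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }

    public static void main(String[] args) {
        // Fake signer that reports what it received and returns a dummy signature.
        HashThenSignBuffer signer = new HashThenSignBuffer(request -> {
            int received = Base64.getDecoder().decode(request).length;
            System.out.println("API received " + received + " bytes to sign");
            return Base64.getEncoder().encodeToString(new byte[] {1, 2, 3});
        });
        byte[] signedAttributes = new byte[151]; // same size as in the analyzed file
        signer.getOutputStream().write(signedAttributes, 0, signedAttributes.length);
        signer.getSignature();
    }
}
```

With the earlier version, which Base64-encoded stream.toByteArray() directly, the fake signer would have received all 151 bytes instead of a 32-byte hash.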

    Providing the hash of the signed document ranges

    Even after fixing the signature of the signed attributes, the signature was still invalid: the value of the message-digest signed attribute was incorrect.

    As it turned out, the value of that attribute was not the hash of the signed document ranges but the hash of the hash! This happened because the code hashed the content explicitly and then the BouncyCastle CMSSignedDataGenerator hashed it again. Thus, the explicit hashing in the code needed to be removed, i.e.

    ExternalSigningSupport externalSigning = document.saveIncrementalForExternalSigning(output);
    byte[] pdfHash = new SHA256.Digest().digest(externalSigning.getContent().readAllBytes());
    ...
    CMSTypedData msg = new CMSProcessableByteArray(pdfHash);
    CMSSignedData signedData = gen.generate(msg, false);
    

    was replaced by

    ExternalSigningSupport externalSigning = document.saveIncrementalForExternalSigning(output);
    byte[] pdfHash = externalSigning.getContent().readAllBytes();
    ...
    CMSTypedData msg = new CMSProcessableByteArray(pdfHash);
    CMSSignedData signedData = gen.generate(msg, false);
    

    With this change in place, the signature finally is valid!

    (The variable name pdfHash is now misleading, since it holds the content to sign rather than its hash, but that surely will be cleaned up eventually...)
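
The effect of the double hashing can be illustrated with the JDK alone. In this sketch the class and method names are hypothetical, and messageDigestAttribute only models the single fact relevant here, namely that the generator hashes whatever content it is given; it is not BouncyCastle code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical demo of the "hash of the hash" problem.
final class DoubleHashDemo {

    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }

    /** Models the generator: the attribute is the hash of the content passed in. */
    static byte[] messageDigestAttribute(byte[] contentPassedToGenerator) {
        return sha256(contentPassedToGenerator);
    }

    public static void main(String[] args) {
        byte[] signedRanges = "stand-in for the signed byte ranges".getBytes(StandardCharsets.UTF_8);

        // A validator hashes the signed byte ranges exactly once.
        byte[] expected = sha256(signedRanges);

        // Before the fix: the code passed an already hashed value to the generator.
        byte[] before = messageDigestAttribute(sha256(signedRanges));
        // After the fix: the code passes the raw content to the generator.
        byte[] after = messageDigestAttribute(signedRanges);

        System.out.println("before fix matches: " + Arrays.equals(expected, before)); // false
        System.out.println("after fix matches:  " + Arrays.equals(expected, after));  // true
    }
}
```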