java pdfbox zugferd

ZUGFeRD Mustang Validator Gives Font Reference Error


We are working on implementing e-invoicing in Germany using ZUGFeRD and PDF/A. When we validate our very simple PDF with the Mustang validator (https://www.mustangproject.org/commandline/), we get this error message:

<?xml version="1.0" encoding="UTF-8"?>

<validation filename="Beleg_245210299_20241201_out.pdf" datetime="2025-02-14 13:16:52">
  <pdf>ValidationResult [flavour=3u, totalAssertions=695, assertions=[TestAssertion [ruleId=RuleId [specification=ISO 19005-3:2012, clause=6.2.2, testNumber=2], status=failed, message=A content stream that references other objects, such as images and fonts that are necessary to fully render or process the stream, shall have an explicitly associated Resources dictionary as described in ISO 32000-1:2008, 7.8.3, location=Location [level=CosDocument, context=root/document[0]/pages[0](9 0 obj PDPage)/contentStream[0](14 0 obj PDContentStream)/operators[0]/xObject[0]/contentStream[0](17 0 obj PDContentStream)], locationContext=null, errorMessage=A content stream refers to resource(s) F1 not defined in an explicitly associated Resources dictionary]], isCompliant=false]
    <info>
      <signature>unknown</signature>
      <duration unit="ms">652</duration>
    </info>
    <summary status="invalid"/>
  </pdf>  
  <xml>
    <info>
      <version>2</version>
      <profile>urn:cen.eu:en16931:2017</profile>
      <validator version="2.16.2"/>
      <rules>
        <fired>177</fired>
        <failed>0</failed>
      </rules>
      <duration unit="ms">1270</duration>
    </info>
    <summary status="valid"/>
  </xml>
  <summary status="invalid"/>
</validation>

However, if we delete the header that says "Blatt: 1 von 1" from our PDF, the validation error goes away. So the problem seems to be related to how the header references its fonts. At least that's what I'm thinking.
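The `context=` path in the report above already hints at where the offending stream lives. As a rough sketch (the class `LocationContext` and its methods are hypothetical helpers, not part of Mustang or PDFBox), the object numbers can be pulled out of that path and checked for an `xObject[...]` segment:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: picks apart the "context=" path of a failed assertion.
class LocationContext {

    // Matches suffixes such as "(9 0 obj PDPage)" attached to a path segment.
    private static final Pattern OBJ_REF =
            Pattern.compile("\\((\\d+) (\\d+) obj (\\w+)\\)");

    /** Object numbers mentioned along the path, outermost first. */
    public static List<Integer> objectNumbers(String context) {
        List<Integer> numbers = new ArrayList<>();
        Matcher m = OBJ_REF.matcher(context);
        while (m.find()) {
            numbers.add(Integer.parseInt(m.group(1)));
        }
        return numbers;
    }

    /** True if the last path segment sits below an xObject[...] segment. */
    public static boolean failsInsideXObject(String context) {
        String[] segments = context.split("/");
        for (int i = 0; i < segments.length - 1; i++) {
            if (segments[i].startsWith("xObject[")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Path copied from the report in the question.
        String context = "root/document[0]/pages[0](9 0 obj PDPage)"
                + "/contentStream[0](14 0 obj PDContentStream)"
                + "/operators[0]/xObject[0]"
                + "/contentStream[0](17 0 obj PDContentStream)";
        System.out.println(objectNumbers(context));      // [9, 14, 17]
        System.out.println(failsInsideXObject(context)); // true
    }
}
```

Read this way, object 9 is the page, object 14 is the page's content stream, and object 17 is the content stream of an XObject invoked from it. The complaint about F1 is therefore about the XObject's stream, not about the page's own content stream.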

Our source PDF can be downloaded from here: https://drive.google.com/file/d/1oeLoXZYnTYijc4WbiBsbacnVkiyx2ixO/view?usp=sharing

Our target PDF including the ZUGFeRD XML can be downloaded from here: https://drive.google.com/file/d/1J-tyiKhBrffZXRoD6wK908azo8JoN3k4/view?usp=sharing

Can you tell me why Mustang reports the error "A content stream that references other objects, such as images and fonts that are necessary to fully render or process the stream, shall have an explicitly associated Resources dictionary as described in ISO 32000-1:2008"?

The spec (https://pdfa.org/wp-content/uploads/2017/07/TechNote0010.pdf page 8) is not really helpful here.

We embedded this example XML into the PDF/A file using PDFBox to create a "valid" ZUGFeRD file.

https://www.mustangproject.org/files/ZUGFeRD-invoice.xml

However, when running the Mustang validator using this command

$ java -jar Mustang-CLI-2.16.2.jar --action validate --no-notice --source Beleg_245210299_20241201_out.pdf

we end up getting the error message above.
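Because the report contains separate verdicts for the PDF part and the XML part, it can help to read them programmatically when validating many files. The following is a minimal sketch using only the JDK's XML APIs. It assumes the report has been captured as a string, and the class `ReportSummary` and its method names are made up for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Mustang: reads the verdicts from a report.
class ReportSummary {

    /** Parses the report XML; parser errors are rethrown unchecked. */
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed report", e);
        }
    }

    /** Status of one section ("pdf" or "xml"); empty string if absent. */
    public static String sectionStatus(Document report, String section) {
        return evaluate(report, "/validation/" + section + "/summary/@status");
    }

    /** Overall status from the top-level summary element. */
    public static String overallStatus(Document report) {
        return evaluate(report, "/validation/summary/@status");
    }

    private static String evaluate(Document report, String xpath) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(xpath, report);
        } catch (Exception e) {
            throw new IllegalStateException("Bad XPath: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Trimmed-down report with the same element layout as the one above.
        String xml = "<validation>"
                + "<pdf><summary status=\"invalid\"/></pdf>"
                + "<xml><summary status=\"valid\"/></xml>"
                + "<summary status=\"invalid\"/>"
                + "</validation>";
        Document report = parse(xml);
        System.out.println("pdf: " + sectionStatus(report, "pdf"));
        System.out.println("xml: " + sectionStatus(report, "xml"));
        System.out.println("overall: " + overallStatus(report));
    }
}
```

For the report in this question, this would show that the XML part is valid and that only the PDF part makes the overall result invalid, which matches the observation that the embedded invoice itself is fine.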

To attach the XML, we use the following code, where data contains the UTF-8 bytes of the XML to attach:

    private void attachFile(String filename, String relationship, String description, String subType, byte[] data) throws IOException {
        this.fileAttached = true;

        PDComplexFileSpecification fs = new PDComplexFileSpecification();
        fs.setFile(filename);

        COSDictionary dict = fs.getCOSObject();
        dict.setName("AFRelationship", relationship);
        dict.setString("UF", filename);
        dict.setString("Desc", description);

        ByteArrayInputStream bais = new ByteArrayInputStream(data);
        PDEmbeddedFile ef = new PDEmbeddedFile(pdf, bais);
        ef.setSubtype(subType);
        ef.setSize(data.length);
        ef.setCreationDate(Calendar.getInstance());
        ef.setModDate(Calendar.getInstance());

        fs.setEmbeddedFile(ef);
        dict = fs.getCOSObject();

        COSDictionary efDict = (COSDictionary) dict.getDictionaryObject(COSName.EF);
        COSBase lowerLevelFile = efDict.getItem(COSName.F);
        efDict.setItem(COSName.UF, lowerLevelFile);

        PDDocumentNameDictionary names = new PDDocumentNameDictionary(pdf.getDocumentCatalog());
        PDEmbeddedFilesNameTreeNode efTree = names.getEmbeddedFiles();
        if (efTree == null) {
            efTree = new PDEmbeddedFilesNameTreeNode();
        }

        Map<String, PDComplexFileSpecification> namesMap = new HashMap<>();
        Map<String, PDComplexFileSpecification> oldNamesMap = efTree.getNames();
        if (oldNamesMap != null) {
            namesMap.putAll(oldNamesMap);
        }

        namesMap.put(filename, fs);
        efTree.setNames(namesMap);
        names.setEmbeddedFiles(efTree);
        pdf.getDocumentCatalog().setNames(names);

        COSBase afEntry = pdf.getDocumentCatalog().getCOSObject().getItem("AF");
        COSArray cosArray;
        if (afEntry == null) {
            cosArray = new COSArray();
            cosArray.add(fs);
            pdf.getDocumentCatalog().getCOSObject().setItem("AF", cosArray);
        } else if (afEntry instanceof COSArray) {
            cosArray = (COSArray) afEntry;
            cosArray.add(fs);
            pdf.getDocumentCatalog().getCOSObject().setItem("AF", cosArray);
        } else {
            if (!(afEntry instanceof COSObject) || !(((COSObject) afEntry).getObject() instanceof COSArray)) {
                throw new IOException("Unexpected object type for PDFDocument/Catalog/COSDictionary/Item(AF)");
            }
            cosArray = (COSArray) ((COSObject) afEntry).getObject();
            cosArray.add(fs);
        }
    }

We call the method above like this:

        this.attachFile(filename, "Alternative",
                "Invoice metadata conforming to ZUGFeRD standard (http://www.ferd-net.de/front_content.php?idcat=231&lang=4)",
                "text/xml", this.xmlProvider.getData());

Solution

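  • The failing content stream (object 17 in the report) belongs to a form XObject on the page, apparently the one that draws the header. It uses the font F1 but has no /Resources dictionary of its own, so F1 can only be resolved through the resources of the page. PDF/A-3 (ISO 19005-3, clause 6.2.2) requires such a content stream to have an explicitly associated Resources dictionary. The code below fixes your file, and files that have the same structure, by giving every form XObject that lacks resources the resources of its page.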

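    import java.io.File;
    import org.apache.pdfbox.Loader;
    import org.apache.pdfbox.cos.COSName;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;
    import org.apache.pdfbox.pdmodel.PDResources;
    import org.apache.pdfbox.pdmodel.graphics.PDXObject;
    import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;

    // PDFBox 3.x; with 2.x use PDDocument.load(file) instead of Loader.loadPDF(file).
    try (PDDocument doc = Loader.loadPDF(new File("Beleg_245210299_20241201_out.pdf")))
    {
        for (PDPage page : doc.getPages())
        {
            PDResources resources = page.getResources();
            if (resources == null)
            {
                continue; // page without resources: nothing to hand down
            }
            for (COSName name : resources.getXObjectNames())
            {
                PDXObject xObject = resources.getXObject(name);
                if (xObject instanceof PDFormXObject)
                {
                    PDFormXObject form = (PDFormXObject) xObject;
                    // Forms that already carry their own /Resources are left alone.
                    if (form.getResources() == null)
                    {
                        form.setResources(resources);
                    }
                }
            }
        }
        doc.save(new File("Beleg_245210299_20241201_out-modified.pdf"));
    }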