pdf accessibility itext7

PDF created by iText7 with Header is not accessible


I created a PDF with iText 7. The PDF has a header on each page that consists of two (sometimes more) rows. I added the header as described in chapter 3 of the jump-start tutorial.

The problem is that no tags are generated for the header, so the screen reader (JAWS) doesn't find it and blind users cannot access it.

I tried to add some tags manually to mimic a table, but they seem to be ignored completely.

Here is my code to create the PDF:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;

import com.itextpdf.io.font.PdfEncodings;
import com.itextpdf.kernel.events.Event;
import com.itextpdf.kernel.events.PdfDocumentEvent;
import com.itextpdf.kernel.font.PdfFont;
import com.itextpdf.kernel.font.PdfFontFactory;
import com.itextpdf.kernel.geom.PageSize;
import com.itextpdf.kernel.pdf.*;
import com.itextpdf.kernel.pdf.canvas.PdfCanvas;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.element.Paragraph;
import com.itextpdf.pdfa.PdfADocument;

public class ITextHeader {
    private PdfADocument pdf;
    private PdfFont bf;

    public static void main(String[] args) throws Exception {
        new ITextHeader().createPdf();
    }
    
    private void createPdf() throws Exception {
        PdfWriter writer = new PdfWriter(new FileOutputStream("header.pdf"));
        InputStream icm = new FileInputStream("sRGB_CS_profile.icm");
        pdf = new PdfADocument(writer, PdfAConformanceLevel.PDF_A_1A,
                new PdfOutputIntent("Custom", "", null, "sRGB IEC61966-2.1", icm));
        pdf.setTagged();
        bf = PdfFontFactory.createFont("arial.ttf", PdfEncodings.IDENTITY_H);
        try (Document pdfDocument = new Document(pdf, PageSize.A4, true)) {
            pdfDocument.setMargins(100, 15, 50, 15);
            pdf.addEventHandler(PdfDocumentEvent.START_PAGE, this::createHeader);
            
            pdfDocument.add(new Paragraph("Here is the content").setFont(bf).setFontSize(10));
        }
    }

    public void createHeader(Event event) {
        PdfDocumentEvent docEvent = (PdfDocumentEvent) event;
        PdfPage page = docEvent.getPage();
        PdfCanvas pdfCanvas = new PdfCanvas(
                page.newContentStreamBefore(), page.getResources(), pdf);
        pdfCanvas.beginText()
                .setFontAndSize(bf, 10)
                .beginMarkedContent(PdfName.Table)
                .moveText(15, 804)
                .beginMarkedContent(PdfName.TR)
                .beginMarkedContent(PdfName.TD)
                .showText("My Title")
                .endMarkedContent() // TD
                .moveText(466, 0)
                .beginMarkedContent(PdfName.TD)
                .showText("Date: 01.01.2022")
                .endMarkedContent() // TD
                .endMarkedContent() // TR
                .moveText(-466, -14)
                .beginMarkedContent(PdfName.TR)
                .beginMarkedContent(PdfName.TD)
                .showText("My Subtitle")
                .endMarkedContent() // TD
                .moveText(466, 0)
                .beginMarkedContent(PdfName.TD)
                .showText("Time: 12:30")
                .endMarkedContent() // TD
                .endMarkedContent() // TR
                .endMarkedContent() // TABLE
                .endText();
    }
}
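
A side note on the coordinates in createHeader: PdfCanvas.moveText writes the PDF Td operator, whose offsets are relative to the start of the current text line, not to the end of the text that was just shown. The sketch below uses plain Java with no iText calls (the class name MoveTextTrace is made up for illustration) to resolve the four relative moves into absolute positions, which may help when rebuilding the header with positioned elements.

```java
import java.util.ArrayList;
import java.util.List;

/** Traces the text-line origin after a sequence of relative moveText (Td) calls. */
final class MoveTextTrace {

    /** Each input row is {tx, ty}; each output row is the resulting absolute {x, y}. */
    static List<float[]> resolve(float[][] offsets) {
        List<float[]> positions = new ArrayList<>();
        float x = 0, y = 0; // the line origin is (0, 0) right after beginText()
        for (float[] offset : offsets) {
            x += offset[0]; // Td offsets accumulate on the line origin
            y += offset[1];
            positions.add(new float[] {x, y});
        }
        return positions;
    }

    public static void main(String[] args) {
        // The four moveText calls from createHeader, in order.
        float[][] offsets = {{15, 804}, {466, 0}, {-466, -14}, {466, 0}};
        String[] labels = {"My Title", "Date: 01.01.2022", "My Subtitle", "Time: 12:30"};
        List<float[]> positions = resolve(offsets);
        for (int i = 0; i < labels.length; i++) {
            float[] p = positions.get(i);
            System.out.println(labels[i] + " starts at (" + p[0] + ", " + p[1] + ")");
        }
    }
}
```

Read this way, the four header cells start at (15, 804), (481, 804), (15, 790) and (481, 790) in default user-space units.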

This is the structure of the PDF as shown by PDF Accessibility Checker:

[screenshot: tag structure of the PDF in PDF Accessibility Checker]

The Accessibility Checker also complains about untagged content:

[screenshot: Accessibility Checker report of untagged content]


Solution

  • We solved the issue with the following workaround: the header on the first page is rendered as a PDF table, and on the following pages we use the canvas to draw the text. This solution is somewhat awkward because we have to implement the header twice with different techniques, but now JAWS at least finds the header on the first page.
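
One way to soften the duplication might be to keep the header content and the per-page decision in a single place, so that both renderers read the same data. The following is only a sketch in plain Java: it makes no iText calls, and HeaderModel, Row, Technique and techniqueFor are hypothetical names, not part of iText or of the original code. The actual table and canvas drawing would still have to be written separately.

```java
import java.util.List;

/** Hypothetical shared model so the table and canvas renderers use the same header data. */
final class HeaderModel {

    /** One header row: text on the left and text on the right. */
    static final class Row {
        final String left;
        final String right;

        Row(String left, String right) {
            this.left = left;
            this.right = right;
        }
    }

    /** Which technique the workaround applies on a given page. */
    enum Technique { LAYOUT_TABLE, CANVAS_TEXT }

    /** Header rows taken from the question's example. */
    static final List<Row> ROWS = List.of(
            new Row("My Title", "Date: 01.01.2022"),
            new Row("My Subtitle", "Time: 12:30"));

    /** Page 1 gets the table; all later pages fall back to canvas text. */
    static Technique techniqueFor(int pageNumber) {
        if (pageNumber < 1) {
            throw new IllegalArgumentException("page numbers are 1-based");
        }
        return pageNumber == 1 ? Technique.LAYOUT_TABLE : Technique.CANVAS_TEXT;
    }

    public static void main(String[] args) {
        for (int page = 1; page <= 3; page++) {
            System.out.println("page " + page + ": " + techniqueFor(page));
            for (Row row : ROWS) {
                System.out.println("  " + row.left + " | " + row.right);
            }
        }
    }
}
```

With something like this, the page event handler and the first-page code would both iterate over ROWS, so a change to the header text only has to be made once.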