I recently downloaded a trial license of iText. I am trying to achieve the following goals:
I tried the following code (C#):
LicenseKey.LoadLicenseFile(@"D:\Development\itextkey-0.xml");
PdfDocument pdfDoc = new PdfDocument(new PdfReader(SRC), new PdfWriter(DEST, new WriterProperties().SetPdfVersion(PdfVersion.PDF_1_7)));
pdfDoc.SetTagged();
pdfDoc.GetCatalog().SetLang(new PdfString("HE-IL"));
pdfDoc.GetCatalog().SetViewerPreferences(
new PdfViewerPreferences().SetDisplayDocTitle(true));
PdfDocumentInfo info = pdfDoc.GetDocumentInfo();
info.SetTitle("iText7 PDF/UA example");
pdfDoc.Close();
However, when I check the output file in Acrobat Reader, it is still reported as a "Not Tagged" PDF.
Please advise how I should use iText to achieve my goals.
Can't be done.
Let me give you the easiest proof:
Suppose the input document contains an image of two cats fighting over a ball of yarn.
PDF/UA requires you to provide sensible alternative text for your images.
There is currently no system available that is able to provide a sensible caption for any random image you throw at it.
Not to mention that whatever system comes up with a caption for images would have to be linked to a perfect translation service, since most image recognition services work in English, and that might not be the language your documents are written in. This in turn implies you need a system that is capable of detecting the language you are writing in.
We've now added three insanely hard problems just to handle images: captioning arbitrary images, translating those captions, and detecting the language of the document.
Now imagine the other kind of fun stuff, like
Furthermore, PDF/UA requires fonts to be embedded. What if you are faced with a PDF that uses fonts that aren't embedded? Do you have access to font programs that can be used to substitute those fonts?
In your snippet, you use PdfReader, and you provide a path to a file SRC. You need to convert Word, PPT, and other files, but iText doesn't convert Word, PPT, etc. to PDF. PdfReader only accepts PDF files (as the name indicates).
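For contrast, here is a minimal sketch of the case that does work: building a tagged document from scratch, where a person supplies the semantics that cannot be guessed from an existing file. It assumes the iText 7 layout API; the file paths and the text are hypothetical, and this alone does not produce a PDF/UA-conformant file (font embedding, XMP metadata, and the document title are left out).

```csharp
using iText.IO.Image;
using iText.Kernel.Pdf;
using iText.Layout;
using iText.Layout.Element;

// Hypothetical paths, for illustration only.
string dest = "tagged-from-scratch.pdf";
string imagePath = "cats.png";

// No PdfReader here: the document is created, not read from an existing PDF.
PdfDocument pdfDoc = new PdfDocument(new PdfWriter(dest));

// Enable tagging before any content is added, so the layout
// objects added below are written with structure tags.
pdfDoc.SetTagged();
pdfDoc.GetCatalog().SetLang(new PdfString("en-US"));

Document document = new Document(pdfDoc);
document.Add(new Paragraph("Two cats and a ball of yarn"));

Image img = new Image(ImageDataFactory.Create(imagePath));
// The alternative text is written by a human author; iText cannot derive it from the image.
img.GetAccessibilityProperties()
   .SetAlternateDescription("Two cats fighting over a ball of yarn");
document.Add(img);

document.Close();
```

The difference from the snippet in the question is that every paragraph and image is added as a layout object, so there is something for the tag structure to describe. Calling SetTagged() on a PdfDocument opened from an existing untagged file does not analyze the page content that is already there.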