
No "tags" / "structure tags" in libHaru PDF?


I am using libHaru (by including the source in my C++ code) to generate PDF files. I am hoping to make these PDF files accessible by adding "tags" (aka "structure tags"). From what I can see in the documentation and source code, libHaru does not support this. Can someone confirm that libHaru indeed does not support tags? And if it's not supported directly, I wonder if there is a way to add tags by modifying the libHaru code? Has anybody done this?


Solution

  • I looked through the 22-page manual for libHaru and found no mention of tags, so I think it's safe to assume that it doesn't support tagging.

    Attempting to make any library tag PDFs (and do it well) would be a non-trivial task. You'd essentially be reinventing the wheel. Consider that even Adobe Acrobat Pro is only mediocre at tagging PDFs and requires a ton of human intervention to get it right.

    There is a product called CommonLook Dynamic that is made for creating accessible PDFs from live data on a webserver, but I can't vouch for it myself. I have used other products from this company, and they've been very good, but they're not at all cheap.

    Generally speaking, PDF tagging is a complex thing. For an automated algorithm to get it right, the source material has to be perfectly formed and dead simple. If your source material is at all complex or malformed, the tags won't come out right.

    As an example, PDF generation software can't do a good job of things like crafting good alt text for images, creating useful PDF metadata, or tagging complex tables. These are things that require human intervention.
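
To give a sense of what "adding tags" would involve if you did patch libHaru yourself, here is a minimal sketch of the two lowest-level pieces as I understand the PDF specification: a marked-content sequence in the page's content stream, and a structure element that points back at it. The function names `wrapMarkedContent` and `structElemDict` are hypothetical helpers written for this illustration; they are not libHaru functions, and the object numbers are placeholders. A real tagged PDF also needs `/MarkInfo << /Marked true >>` and a `/StructTreeRoot` in the document catalog, a `/StructParents` entry on each page, and a `/ParentTree`, none of which is shown here.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: wraps page-content operators in a marked-content
// sequence (BDC ... EMC) so that a structure element can refer to the
// wrapped content by its MCID (marked-content identifier).
std::string wrapMarkedContent(const std::string& tag, int mcid,
                              const std::string& operators) {
    assert(mcid >= 0);  // MCIDs are non-negative integers, unique per page
    std::ostringstream out;
    out << "/" << tag << " <</MCID " << mcid << ">> BDC\n"
        << operators << "\n"
        << "EMC\n";
    return out.str();
}

// Hypothetical helper: builds the dictionary text for one structure element.
// parentObj and pageObj are object numbers (generation 0 is assumed) of the
// parent structure node and of the page that holds the marked content.
std::string structElemDict(const std::string& tag, int parentObj,
                           int pageObj, int mcid) {
    std::ostringstream out;
    out << "<< /Type /StructElem /S /" << tag
        << " /P " << parentObj << " 0 R"
        << " /Pg " << pageObj << " 0 R"
        << " /K " << mcid << " >>";
    return out.str();
}
```

For a paragraph, you would call `wrapMarkedContent("P", 0, ...)` around the text-drawing operators and emit a matching `structElemDict("P", ...)` object with the same MCID. Doing this inside libHaru means hooking every place that writes to a content stream and keeping the structure tree in sync, which is why it is far from a small patch.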