ios parsing pdf cgpdfdocument

How to get the trailer dictionary from a CGPDFDocumentRef?


I know this question has been asked before, but I'll be more specific.

I have a CGPDFDocumentRef document and I want to find the trailer, ideally in the form of a CGPDFDictionaryRef so that I can look into its Encrypt dictionary and see if it allows text extraction.

Apple's libraries don't seem to provide a function that lets me get at this dictionary.

You can get the catalog with CGPDFDocumentGetCatalog(). Is there an equivalent for the trailer, something like CGPDFDocumentGetTrailer()?

If not, I guess I will have to parse the PDF document for the trailer myself.

The answers to the question "How to find the trailer dictionary?" talk about writing such a low-level parser.

How can I get at the raw PDF data that they are parsing?

Can I convert the last page of the PDF to a CGPDFStream, convert that stream into NSData and then perhaps into ASCII, and then start implementing the trailer parser that they are talking about?

Thanks!


Solution

  • A two-fold answer:

    1) No, Apple doesn't provide a function for getting at the trailer. If you still want to access it, you would have to open the PDF file as a binary file and build a parser that goes through it, following ISO 32000-1, the standard that defines the PDF file format.

    2) More than likely you don't need to, which is lucky, as it's not an easy feat. Apple provides these two functions:

    bool CGPDFDocumentIsEncrypted(CGPDFDocumentRef document);
    

    This tells you whether the document is encrypted.

    bool CGPDFDocumentAllowsCopying(CGPDFDocumentRef document);
    

    This tells you whether the document allows copying text from it, and thus text extraction.
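If you do end up going the route in point 1, the first step might look roughly like the sketch below. It is plain C, not part of any Apple API, and the helper names (`read_file`, `find_last`, `trailer_encrypt_status`) are made up for illustration. It assumes a classic PDF whose file ends with a `trailer` dictionary followed by `startxref`; files that use cross-reference streams (PDF 1.5 and later) keep the trailer entries in a stream dictionary instead and are reported as "not found" here. The `/Encrypt` check is a naive byte search, not a real dictionary parser.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read a whole file into a malloc'd buffer (caller frees). NULL on failure. */
unsigned char *read_file(const char *path, size_t *out_len)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return NULL; }
    rewind(f);
    unsigned char *buf = malloc((size_t)size + 1);
    if (buf && fread(buf, 1, (size_t)size, f) != (size_t)size) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    if (buf) *out_len = (size_t)size;
    return buf;
}

/* Offset of the last occurrence of needle in buf[0..len), or -1 if absent.
   PDF data can contain NUL bytes, so memcmp is used rather than strstr. */
long find_last(const unsigned char *buf, size_t len, const char *needle)
{
    size_t n = strlen(needle);
    assert(buf != NULL && n > 0);
    if (len < n) return -1;
    for (size_t i = len - n + 1; i-- > 0; ) {
        if (memcmp(buf + i, needle, n) == 0) return (long)i;
    }
    return -1;
}

/* Look at the last "trailer ... startxref" span of a PDF held in memory.
   Returns -1 if no such span exists (for example, a cross-reference stream
   file), 0 if the span has no /Encrypt key, 1 if it does. */
int trailer_encrypt_status(const unsigned char *buf, size_t len)
{
    long sx = find_last(buf, len, "startxref");
    if (sx < 0) return -1;
    long tr = find_last(buf, (size_t)sx, "trailer");
    if (tr < 0) return -1;
    /* Naive check: search for the key name between the two keywords. */
    return find_last(buf + tr, (size_t)(sx - tr), "/Encrypt") >= 0 ? 1 : 0;
}
```

You would call `read_file()` on the PDF's path and pass the buffer to `trailer_encrypt_status()`. Note that `/Encrypt` is usually an indirect reference, so actually reading its permission flags would additionally require resolving that object through the cross-reference table, which is where the real parsing work begins.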