
iText: unable to retrieve /Resources from a page


I'm using iText 5.0.1 to manipulate an existing PDF. When I inspect the PDF with RUPS, I can see that the first page contains a /Resources entry:

[Screenshot: RUPS view of the first page showing its /Resources entry]

However, when I manipulate the PDF using the following example, I get a NullPointerException (NPE) because pageDictionary.get(PdfName.RESOURCES) returns null.

Here is what my pageDictionary object contains when debugging:

[Screenshot: debugger view of the pageDictionary object]

Unfortunately, for confidentiality reasons, I can't post the PDF right now. Does anyone have an idea why I'm getting this NPE, or how I could investigate further? (I'm far from being an expert in iText and PDF structure, and I'm slowly running out of ideas.)

Thank you very much!


Solution

  • The sample code you use assumes that the Page objects are immediate kids of the dictionary pointed to by the Pages catalog key:

    PdfDictionary pages = (PdfDictionary) PdfReader.getPdfObject(reader.getCatalog().get(PdfName.PAGES));
    PdfArray kids = (PdfArray) PdfReader.getPdfObject(pages.get(PdfName.KIDS));
    PdfDictionary pageDictionary = (PdfDictionary) PdfReader.getPdfObject((PdfObject) kids.getArrayList().get(pageNum - 1));
    

    This assumption is often fine because many PDF producers generate flat page trees, but in general the page tree can be a tree of depth greater than 1. That is, its leaves, the Page nodes, may sit deeper in the structure as kids of kids of kids of the root Pages dictionary, and so on.

    That is the case in your PDF: the Page dictionary of page 1 (object 3) is a kid of the Pages dictionary object 6, which in turn is a kid of the root Pages dictionary object 70.

    Thus, that code treats the intermediate Pages dictionary object 6 as if it were a Page object, so the Resources lookup happens on the wrong dictionary.
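To make the difference concrete, here is a minimal, library-independent sketch of locating a page by walking the whole tree instead of indexing into the root's Kids array. The `Node` class and the `findPage` helper are hypothetical stand-ins and not iText API. Object numbers 70, 6 and 3 are taken from the description above, while objects 4, 5 and 7 are invented purely to fill out the tree.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a PDF page tree; NOT an iText class.
class PageTreeSketch {

    static class Node {
        final String type;        // "Pages" for intermediate nodes, "Page" for leaves
        final int objectNumber;   // PDF object number, for illustration only
        final List<Node> kids = new ArrayList<>();

        Node(String type, int objectNumber) {
            this.type = type;
            this.objectNumber = objectNumber;
        }

        Node add(Node kid) {
            kids.add(kid);
            return this;
        }
    }

    // Depth-first walk that collects leaf Page nodes in document order.
    static void collectPages(Node node, List<Node> out) {
        if ("Page".equals(node.type)) {
            out.add(node);
            return;
        }
        for (Node kid : node.kids) {
            collectPages(kid, out);
        }
    }

    // Returns the leaf for a 1-based page number, however deep it sits.
    static Node findPage(Node root, int pageNum) {
        List<Node> pages = new ArrayList<>();
        collectPages(root, pages);
        if (pageNum < 1 || pageNum > pages.size()) {
            throw new IllegalArgumentException("No such page: " + pageNum);
        }
        return pages.get(pageNum - 1);
    }

    public static void main(String[] args) {
        // Objects 70, 6 and 3 mirror the PDF discussed; 4, 5 and 7 are hypothetical.
        Node root = new Node("Pages", 70)
            .add(new Node("Pages", 6)
                .add(new Node("Page", 3))
                .add(new Node("Page", 4)))
            .add(new Node("Pages", 7)
                .add(new Node("Page", 5)));

        // What the old sample effectively does: take the first kid of the root.
        System.out.println(root.kids.get(0).objectNumber);   // prints 6 (a Pages node)
        // What a full traversal finds for page 1.
        System.out.println(findPage(root, 1).objectNumber);  // prints 3 (the real Page)
    }
}
```

In this model the naive lookup lands on object 6, an intermediate Pages node, whereas the traversal reaches object 3, the actual Page.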

    This is not the only issue with that sample code, though. For example, it also assumes that the Resources dictionary is attached to the Page object itself. This need not be true; it may instead be attached to any ancestor Pages object, including the page tree root:

    Resources dictionary (Required; inheritable) A dictionary containing any resources required by the page (see 7.8.3, "Resource Dictionaries"). If the page requires no resources, the value of this entry shall be an empty dictionary. Omitting the entry entirely indicates that the resources shall be inherited from an ancestor node in the page tree.

    (Table 30 – Entries in a page object - in ISO 32000-1, the current PDF specification)

    So, in general, the sample you use is useless because it does not honor the PDF specification.
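The inheritance rule quoted from the specification can be sketched as a walk up the Parent chain until an ancestor supplies the entry. This is again a plain-Java model under stated assumptions: `Node`, `entries`, `parent` and `resolveInheritable` are hypothetical names and not iText API. With iText, the equivalent idea would presumably be to start from the page dictionary and follow its `PdfName.PARENT` entry while `PdfName.RESOURCES` is absent.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of inheritable page attributes; NOT an iText class.
class InheritedResourcesSketch {

    static class Node {
        final Map<String, Object> entries = new HashMap<>(); // models the dictionary entries
        Node parent;                                         // models /Parent; null at the root
    }

    // Looks for the key on the page itself, then on each ancestor in turn.
    static Object resolveInheritable(Node page, String key) {
        for (Node current = page; current != null; current = current.parent) {
            Object value = current.entries.get(key);
            if (value != null) {
                return value;
            }
        }
        return null; // not defined anywhere on the path to the root
    }

    public static void main(String[] args) {
        Node root = new Node();
        root.entries.put("Resources", "resources defined on the root");

        Node intermediate = new Node();
        intermediate.parent = root;

        Node page = new Node();
        page.parent = intermediate;

        // Looking only at the page itself finds nothing.
        System.out.println(page.entries.get("Resources"));          // prints null
        // Walking up the Parent chain finds the inherited entry.
        System.out.println(resolveInheritable(page, "Resources"));  // prints resources defined on the root
    }
}
```

A lookup that stops at the page dictionary returns null in this model even though the page has resources; only the upward walk finds them.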


    That being said, your sample dates from the time when the newest version of iText was 1.02b, while you are using iText 5.0.1. Why did you not look for a more current sample? It is a wonder that, after 4 major versions, it can even be tweaked to compile easily!


    In current iText versions you can get the dictionary of a given page using the PdfReader method getPageN(final int pageNum) or getPageNRelease(final int pageNum).

    You should not expect the current PdfReader method getPageResources(final int pageNum) to return the resources of the given page, though, as it (just like your sample code) only looks at the Page dictionary itself for the Resources dictionary.


    Is there a specific reason for you to use iText 5.0.1? That version is pretty old, and many bug fixes and features have been added since then.