PDF: Object Stream with FlateDecode


My PDF contains an object that begins with

<</Filter/FlateDecode/First 721/Length 3424/N 79/Type/ObjStm>>stream

The raw data on the next line starts with the bytes

eKoq..., or more precisely [101, 75, 111, 113, 22, 229, 156, 253, 116, ...

My Flate decoder fails on this input. How should it be processed then?

http://s000.tinyupload.com/?file_id=25511328881895019912


Solution

  • This PDF is encrypted. The PDF file trailer is:

    endobj
    startxref
    116
    %%EOF
    

    The cross-reference stream at byte offset 116 (reformatted for readability) is:

    <</DecodeParms<</Columns 5/Predictor 12>>
       /Encrypt 389 0 R
       % ... etc
       /Type/XRef /W[1 3 1]
     >> stream
    

    The encryption dictionary, object 389 0 (reformatted), is:

    389 0 obj <<
      /CF <<
        /StdCF <<
          /AuthEvent /DocOpen
          /CFM /AESV2
          /Length 16
        >>
      >>
      /EncryptMetadata false
      /Filter /Standard
      /O (...)  % binary owner key
      /P -1084
      /R 4
      /StmF /StdCF
      /StrF /StdCF
      /U (...)  % binary user key
      /V 4
      /Length 128
    >>
    endobj
    

    ISO 32000-1 (the PDF specification) states:

    7.6.1 General

    A PDF document can be encrypted (PDF 1.1) to protect its contents from unauthorized access. Encryption applies to all strings and streams in the document's PDF file, with the following exceptions:
    • The values for the ID entry in the trailer
    • Any strings in an Encrypt dictionary
    • Any strings that are inside streams such as content streams and compressed object streams, which themselves are encrypted

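    The object in question is an object stream (/Type /ObjStm) in an encrypted PDF, so its data is encrypted and the exceptions above do not apply to it. To process this stream, you need to implement the security handler (AESV2 in this case) and decrypt the stream data before applying any other filters such as FlateDecode.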
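
The order of operations can be sketched roughly as follows. This is a minimal sketch, not a full security handler: `object_key` follows Algorithm 1 of ISO 32000-1 (7.6.2), while `decode_stream` and its `aes_cbc_decrypt` parameter are hypothetical names introduced here. The Python standard library has no AES, so the sketch assumes you pass in a callable that performs raw AES-128-CBC decryption (no unpadding) using whatever crypto library you have available.

```python
import hashlib
import zlib
from typing import Callable


def object_key(file_key: bytes, obj_num: int, gen_num: int, aes: bool = True) -> bytes:
    """Per-object key, following ISO 32000-1, 7.6.2, Algorithm 1."""
    data = (
        file_key
        + obj_num.to_bytes(4, "little")[:3]  # low-order 3 bytes of the object number
        + gen_num.to_bytes(4, "little")[:2]  # low-order 2 bytes of the generation number
    )
    if aes:
        data += b"sAlT"  # extra salt, added only when the crypt filter uses AES
    # The key is the first min(n + 5, 16) bytes of the MD5 digest.
    return hashlib.md5(data).digest()[: min(len(file_key) + 5, 16)]


def decode_stream(
    raw: bytes,
    file_key: bytes,
    obj_num: int,
    gen_num: int,
    aes_cbc_decrypt: Callable[[bytes, bytes, bytes], bytes],
) -> bytes:
    """Decrypt an AESV2 stream first, then apply FlateDecode.

    aes_cbc_decrypt(key, iv, ciphertext) is a hypothetical callable supplied
    by the caller; it must return the raw CBC-decrypted bytes, padding included.
    """
    key = object_key(file_key, obj_num, gen_num)
    iv, ciphertext = raw[:16], raw[16:]  # the first 16 bytes of the stream are the IV
    padded = aes_cbc_decrypt(key, iv, ciphertext)
    pad_len = padded[-1]  # PKCS#5-style padding: last byte gives the pad length
    if not 1 <= pad_len <= 16:
        raise ValueError("bad padding: wrong key or corrupt stream")
    return zlib.decompress(padded[:-pad_len])  # FlateDecode uses the zlib format
```

With AESV2, the bytes quoted in the question (101, 75, ...) would be the start of the initialization vector rather than a zlib header, which is consistent with the Flate decoder rejecting them. For an /ObjStm, the decoded result then begins with N pairs of integers (object number and offset), followed by the objects themselves starting at byte offset /First.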

    Note: this PDF is encrypted with an empty user password, so it opens in most viewers without prompting for a password.
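
Because the user password is empty, the file encryption key can be computed directly from the encryption dictionary and the trailer /ID. The following is a sketch of Algorithm 2 from ISO 32000-1 (7.6.3.3) for the standard security handler with /R 2 to 4. The function name and parameter names are my own. The /O string and the first /ID element are elided above, so you have to read them from the file yourself.

```python
import hashlib
import struct

# 32-byte password padding string defined by the standard security handler.
PAD = bytes.fromhex(
    "28BF4E5E4E758A4164004E56FFFA0108"
    "2E2E00B6D0683E802F0CA9FE6453697A"
)


def file_encryption_key(
    password: bytes,
    o_entry: bytes,          # value of /O from the encryption dictionary
    p_entry: int,            # value of /P, e.g. -1084
    first_id: bytes,         # first element of the /ID array
    key_len: int = 16,       # top-level /Length 128 bits = 16 bytes
    revision: int = 4,       # /R
    encrypt_metadata: bool = False,  # /EncryptMetadata
) -> bytes:
    """File encryption key per ISO 32000-1, 7.6.3.3, Algorithm 2."""
    padded = (password + PAD)[:32]          # pad or truncate to exactly 32 bytes
    md5 = hashlib.md5(padded)
    md5.update(o_entry)
    md5.update(struct.pack("<i", p_entry))  # /P as a 32-bit little-endian integer
    md5.update(first_id)
    if revision >= 4 and not encrypt_metadata:
        md5.update(b"\xff\xff\xff\xff")     # marks unencrypted metadata
    digest = md5.digest()
    if revision >= 3:
        for _ in range(50):                 # 50 extra MD5 rounds for /R 3 and above
            digest = hashlib.md5(digest[:key_len]).digest()
    return digest[:key_len]
```

For this file the call would look like `file_encryption_key(b"", o_entry, -1084, first_id)`, where `o_entry` and `first_id` are placeholders for the values in your PDF. A complete implementation would also check the result against /U (Algorithm 5 for /R 3 and above) before trusting the key.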