python pdf borb

debugging a PDF that will not open in Adobe


Disclaimer: I'm the author of borb.

I'm currently working on the next (major) release, and for some reason I cannot get Adobe Reader to open the PDF it produces.

It throws 'There was a problem reading this document (14)'.

You can get more information by Ctrl+clicking the 'OK' button. I did so, and it told me 'Expected a dict object'.

I'm working on a Linux machine, so using Adobe Acrobat Pro (Preflight) is a no-go for me.

This is my code:

from borb.pdf.document import Document
from borb.pdf import Paragraph
from borb.pdf import Page
from borb.pdf import PDF

d: Document = Document()

p: Page = Page()
d.append_page(p)

# useful constant(s)
x: int = p.get_size()[0] // 10
y: int = p.get_size()[1] // 10
w: int = p.get_size()[0] - 2 * (p.get_size()[0] // 10)
h: int = p.get_size()[1] - 2 * (p.get_size()[1] // 10)

Paragraph("Lorem ipsum").paint(available_space=(x, y, w, h), page=p)

PDF.write(what=d, where_to="new-borb.pdf")

This is the document produced:

%PDF-1.7
%\E2\E3\CF\D3
1 0 obj
<</Pages 2 0 R /Type /Catalog>>
endobj

2 0 obj
<</Count 1 /Kids [3 0 R] /Type /Pages>>
endobj

3 0 obj
<</Contents 7 0 R /MediaBox [0 0 595 842] /Resources 4 0 R /Type /Page>>
endobj

4 0 obj
<</Font 5 0 R>>
endobj

5 0 obj
<</F1 6 0 R>>
endobj

6 0 obj
<</BaseFont /Helvetica /Encoding /WinAnsiEncoding /Name /F1 /Subtype /Type1 /Type /Font>>
endobj

7 0 obj
<< /Filter /FlateDecode /Length 84>>
stream
x\DA+\E4r
\E12\D03P\80\E1\A2t.}7CC\85\904.C#\A0\A0\902\B5T071Q\C9\E5\D2\F0\C9/J\CD\D5T\C9\E2r
\E1
\E4*$\CE\00K\B8
\A4k\B6\80k\CE,(.E\B2\00\BCA'\D8
endstream
endobj

8 0 obj
<</CreationDate (D:20250605094605Z00) /ModDate (D:20250605094605Z00) /Producer (borb)>>
endobj

xref
0 9
0000000000 65535 f
0000000015 00000 n
0000000063 00000 n
0000000119 00000 n
0000000208 00000 n
0000000240 00000 n
0000000270 00000 n
0000000376 00000 n
0000000531 00000 n
trailer
<</ID [<1BE2A89A42B41E620D886CD2F315D35D> <1BE2A89A42B41E620D886CD2F315D35D>] /Info 8 0 R /Root 1 0 R /Size 9>> 
startxref
635
%%EOF

This is the content stream of object 7 (inflated):

q
BT
0.0 0.0 0.0 rg
/F1 1 Tf
12 0 0 12 59 744 Tm
(Lorem) Tj
ET
Q
q
BT
0.0 0.0 0.0 rg
/F1 1 Tf
12 0 0 12 94 744 Tm
( ) Tj
ET
Q
q
BT
0.0 0.0 0.0 rg
/F1 1 Tf
12 0 0 12 98 744 Tm
(ipsum) Tj
ET
Q
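For anyone who wants to reproduce the inflated stream above, here is a minimal sketch using only the Python standard library. The helper name `inflate_streams` is hypothetical (it is not part of borb), and the sketch assumes the file is unencrypted and that each stream uses at most a single /FlateDecode filter. It locates streams with a regular expression rather than the /Length entry, which is good enough for a small hand-inspected file but not for a general parser.

```python
import re
import zlib
from typing import List

# Everything between "stream<EOL>" and "endstream" is treated as stream data.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(pdf_bytes: bytes) -> List[bytes]:
    """Return the payload of every stream, inflated where possible."""
    result = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj() stops at the end of the zlib data, so the
            # end-of-line marker before "endstream" is harmlessly ignored.
            result.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not zlib data (for example an uncompressed stream): keep as-is.
            result.append(raw)
    return result
```

Usage would look like `inflate_streams(open("new-borb.pdf", "rb").read())[0].decode("latin-1")`.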

This is a link to the file (hosted on Google Drive):

https://drive.google.com/file/d/1omSHhbaXwGixHVV1lVdgFqQTS800D1En/view?usp=sharing


Solution

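  • Your Page dictionary is missing the Parent entry, which the PDF specification requires on every page object (it must point back to the page tree node that lists the page in its Kids array):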

    3 0 obj
    <</Contents 7 0 R /MediaBox [0 0 595 842] /Resources 4 0 R /Type /Page>>
    endobj
    

    After adding it:

    3 0 obj
    <</Contents 7 0 R /MediaBox [0 0 595 842] /Resources 4 0 R /Type /Page /Parent 2 0 R>>
    endobj
    

    (and adapting the byte offsets in the xref table and the startxref value to match) the PDF loads and displays correctly.
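Since Preflight is not available on Linux, a rough check like the one below can catch this class of mistake before the file ever reaches Adobe Reader. It is only a sketch under stated assumptions: the function names `pages_missing_parent` and `object_offsets` are made up for illustration (they are not borb API), and the code scans the raw bytes with regular expressions instead of parsing the file, so it only works for simple, unencrypted PDFs without object streams.

```python
import re
from typing import Dict, List

# "N G obj ... endobj" blocks; group 1 is the object number.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
# Matches /Type /Page but not /Type /Pages.
PAGE_TYPE_RE = re.compile(rb"/Type\s*/Page(?![A-Za-z])")
# Object headers at the start of a line, used to recompute xref offsets.
OBJ_START_RE = re.compile(rb"^(\d+) (\d+) obj\b", re.MULTILINE)


def pages_missing_parent(pdf_bytes: bytes) -> List[int]:
    """Return the object numbers of Page objects without a /Parent key."""
    missing = []
    for match in OBJ_RE.finditer(pdf_bytes):
        body = match.group(3)
        if PAGE_TYPE_RE.search(body) and b"/Parent" not in body:
            missing.append(int(match.group(1)))
    return missing


def object_offsets(pdf_bytes: bytes) -> Dict[int, int]:
    """Map each object number to the byte offset of its 'N G obj' header."""
    return {int(m.group(1)): m.start() for m in OBJ_START_RE.finditer(pdf_bytes)}
```

Run against the file from the question, `pages_missing_parent` should report object 3. If the fix is applied exactly as shown above, the inserted ` /Parent 2 0 R` is 14 bytes long, so the xref entries for objects 4 through 8 and the startxref value each grow by 14; `object_offsets` can be used to confirm the new values.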