python pdf reportlab

Creating reproducible PDFs using reportlab


When generating a PDF from Python with reportlab, the resulting file differs slightly on every run. Some of the differences are visible in plain text: for example, reportlab inserts creation and modification timestamps and (two instances of) an MD5 document identifier:

diff --git a/reportlab.pdf b/reportlab.pdf
--- a/reportlab.pdf
+++ b/reportlab.pdf
-/CreationDate (D:20250525111111+00'00') [..] /ModDate (D:20250525111111+00'00')
+/CreationDate (D:20250525111112+00'00') [..] /ModDate (D:20250525111112+00'00')
 /ID
-[<cafebabe...><cafebabe...>]
+[<decafbad...><decafbad...>]
% ReportLab generated PDF document -- digest [..]

How can such differences be avoided?


Solution

  • There is a global switch that removes the variance in timestamps, comments and generated text/IDs:

    from reportlab import rl_config
    rl_config.invariant = True
    

    (It is mentioned in the RML docs, but not in the Python manual.)