python pdf pymupdf

Overlaying two PDFs with an alpha mask at 50%


I'm trying to recover notes I took on an iPad over a PDF, which I saved as a new PDF before the application crashed. This new PDF was corrupted, but I was able to repair it so that it contains all my notes (highlights and handwritten margin notes) but not the original PDF content.

I am trying to use the fitz library (a.k.a PyMuPDF) to recover the full notes by overlaying the original PDF with my notes (using an alpha mask of 50% so that I can see through my highlights).

Unfortunately, I have not managed to overlay the two pages with transparency. The notes page always masks the original PDF, so I only see the highlights and handwritten notes on a BLANK page.

Example of 1 page:

original PDF (page 276)

notes (page 276)

I have tried the following code and a few variants without success (note in the following code I'm only trying to create one page -- page 276 -- of the whole document to speed up the test):

import fitz  # PyMuPDF 
journal_document = fitz.open(journal_path) # type: ignore 
notes_document = fitz.open(notes_path) # type: ignore 
combined_document = fitz.open() # type: ignore

for page_num in range(len(journal_document)):
    
    if page_num<276:
        continue
    
    # load pages to overlay
    journal_page = journal_document.load_page(page_num)
    notes_page = notes_document.load_page(page_num) 
    
    # extract bottom image 
    journal_pix = journal_page.get_pixmap()
    journal_image = fitz.Pixmap(journal_pix, 0)
    
    # create a new page in output doc
    combined_page = combined_document.new_page(width=journal_page.rect.width,
    height=journal_page.rect.height)
    combined_page.show_pdf_page(journal_page.rect, journal_document, page_num)
    
    # extract notes to be overlayed 
    notes_pix = notes_page.get_displaylist().get_pixmap()
    notes_image = fitz.Pixmap(notes_pix)
    notes_image.set_alpha(bytearray(int(128)) * 595 * 842)
    
    # Insert the notes image on the new page, over the journal page
    combined_page.insert_image(notes_page.rect, stream=notes_image.tobytes(), 
    alpha = int(128))
    
    print(f"page {page_num} saved...")
    
    break

combined_document.save(output_path)
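
A side note on the `set_alpha` line in the code above, offered as something worth checking rather than a confirmed diagnosis: in plain Python, `bytearray(int(128))` allocates 128 zero bytes; it does not create one byte holding the value 128. As far as I understand, `set_alpha` expects one alpha byte per pixel, so the buffer built above would be all zeros and much longer than the pixel count. The sketch below uses only the standard library and assumes the 595 x 842 pixel size hard-coded in the question; the names `zeros`, `half` and `alpha_values` are illustrative.

```python
# bytearray(n) with an integer argument allocates n zero bytes;
# it does not create a single byte holding the value n.
zeros = bytearray(128)

# Wrapping the value in a list gives exactly one byte with that value.
half = bytearray([128])

# Pixel dimensions assumed from the 595 x 842 used in the question.
width, height = 595, 842

# One alpha byte per pixel, each set to 128 (roughly 50% opacity).
alpha_values = half * (width * height)

print(len(zeros), set(zeros))
print(len(alpha_values), set(alpha_values))
```

In a real script it would probably be safer to take the dimensions from the pixmap itself (its `width` and `height` attributes) than to hard-code them, since the rendered size depends on the page and zoom.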

Solution

  • Thanks @furas - Pillow did the trick. Still, I'm fascinated by the arcane complexity of PDFs and would love to find a solution using PyMuPDF as well: the code above seemed so close to stacking two images with an alpha mask and transparency...

    For the record, here is the Pillow snippet that worked, where image1 and image2 are images loaded from pixmaps extracted as above and page_num is the page number being processed:

    from PIL import Image
    image1 = Image.open(buf_img1)
    image2 = Image.open(buf_img2)
    mask = Image.new('L', image1.size, 128)  # 128 corresponds to 50% transparency
    result = Image.composite(image1, image2, mask)
    result.save(f'images/p{page_num}.jpg')
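
To make the constant mask of 128 more concrete, here is a rough sketch of the per-channel arithmetic that a masked composite performs: each output value is a weighted average of the two inputs, with the mask as the weight. The functions `composite_value` and `composite_pixel` are hypothetical helpers written for illustration only. They are not Pillow functions, and Pillow's internal rounding may differ slightly.

```python
def composite_value(a: int, b: int, mask: int) -> int:
    """Blend two 8-bit channel values: mask=255 keeps a, mask=0 keeps b."""
    return round((a * mask + b * (255 - mask)) / 255)


def composite_pixel(p1, p2, mask):
    """Apply the same blend to every channel of two RGB pixels."""
    return tuple(composite_value(a, b, mask) for a, b in zip(p1, p2))


# Illustrative pixels: a yellow highlight over black text, mask value 128.
yellow_highlight = (255, 255, 0)
black_text = (0, 0, 0)
print(composite_pixel(yellow_highlight, black_text, 128))
```

With a mask of 128 each image contributes about half of the result, which is why the text stays visible through the highlights, although both layers come out somewhat washed out compared with a true multiply-style highlight.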