python pdf rotation pymupdf

How to rotate a page by an arbitrary angle in pymupdf?


I couldn't find how to rotate a text page of a PDF by an arbitrary angle in the pymupdf documentation.

There is page.set_rotation(angle), but it only accepts 0, 90, 180 and 270 degrees, and for 90 and 270 degrees it also changes the page orientation.

So, I was hoping for something like:

import pymupdf

doc = pymupdf.open("Rest.pdf") # open document
page = doc[0] # get the 1st page of the document
page.set_rotation(0) # rotate the page

matrix = pymupdf.Matrix(1,0,0,1,0,0)
matrix.prerotate(2.5)   # 2.5 degrees

page.add_transformation(matrix)   # BUT THIS DOES NOT EXIST!

doc.save("Test_rotated.pdf")

Can this be done at all? Or do I maybe have to convert the text to an image and then rotate that image? Thank you for any hints.


Solution

  • Boy, did I have to go through reams of documentation just to find such a simple example of what MuPDF can do in one line!

    The more hours it took, the smaller the result got, so adapt as you wish.

    import pymupdf  # NOT fitz
    
    # set up receiver page
    page = pymupdf.open().new_page(width=595, height=841)
    
    # Show page pno (0-based page number) of the source (input.pdf), rotated by 2.5 degrees
    page.show_pdf_page(pymupdf.Rect(0, 0, 595, 841), pymupdf.open("input.pdf"), pno=0, rotate=2.5, keep_proportion=True)
    
    # need I say more? page.parent is the document that owns the receiver page
    page.parent.save("output.pdf")
    

    NOTE: other options such as morph can provide similar results, but some apply primarily to page text and others to drawing components. This is the best of my attempts without diving deeper into the separate functions.

    P.S. Of course, in addition to retaining the vectors, it can work with scanned images, the test image, etc.
    Here is the image version of the supplied upright source image, now output as a rotated image PDF. So images, text and vectors are all retained in the rotated output. However (as with other libraries' methods), it may not transform and transfer all metadata, because the page has been regenerated. You may still have to add other transferred items yourself.

    As I said, there are potential limitations to be aware of in every library that uses this method.

    A bonus is that this may remove JavaScript exploits; I have tested that on some files, but not all!

    Using a fine rotation will "ignore" any rotation set in the source, so you may need to use a landscape receiver page for the import. Then either rotate all pages or write detection for the odd sources.

    You may thus also find that this method removes annotations, bookmarks and links, as other libraries do: the page locations are no longer valid, so they are usually "cleaned out" unless you wrangle them per page!

    To use for cleaning / de-rotating:

    import pymupdf  # NOT fitz
    
    doc = pymupdf.open()            # Set up receiver doc
    src = pymupdf.open("input.pdf") # Imported pdf
    
    for i in range(len(src)):       # Loop pages
        page = doc.new_page()       # A4 upright by default, alter if required (using 0.0 will un-rotate a source)
    
        # Use show_pdf_page as a type of sanitizer to import only the page core from src;
        # i is the 0-based page number, shown rotated 0.0 degrees CCW
        page.show_pdf_page(page.rect, src, i, rotate=0.0)
    
    # Garbage collection (drop unused objects, merge duplicates), then deflate compression of streams
    doc.save("output.pdf", garbage=3, deflate=True)