I couldn't find how to rotate a text page of a PDF by an arbitrary angle in the pymupdf documentation.
There is page.set_rotation(angle), however, which only allows for 0, 90, 180 and 270 degrees; for 90 and 270 degrees the rotation is also combined with a page orientation change.
So, I was hoping for something like:
import pymupdf
doc = pymupdf.open("Rest.pdf") # open document
page = doc[0] # get the 1st page of the document
page.set_rotation(0) # rotate the page
matrix = pymupdf.Matrix(1,0,0,1,0,0)
matrix.prerotate(2.5) # 2.5 degrees
page.add_transformation(matrix) # BUT THIS DOES NOT EXIST!
doc.save("Test_rotated.pdf")
Can this be done at all? Or do I maybe have to convert the text to an image and then rotate that image? Thank you for any hints.
Boy did I have to go through reams of documentation just to find such a simple example of what MuPDF can do in one line!
The more hours it took, the smaller the result, so adapt as you wish.
import pymupdf # NOT fitz
# set up receiver page
page = pymupdf.open().new_page(width=595, height=841)
# Show from source (input.pdf) pno (pagenumero) orient[rot]ated 2.5 degrees
page.show_pdf_page(pymupdf.Rect(0, 0, 595, 841), pymupdf.open("input.pdf"), pno=0, rotate=2.5, keep_proportion=True)
# need I say more?
page.parent.save("output.pdf")
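If you are wondering how much room a tilted page needs, here is a rough, pure-Python sketch of the geometry. It does not call PyMuPDF at all, and `rotated_bbox` and `fit_scale` are hypothetical helper names of my own, not library functions. It only works out the upright box that would enclose a page rotated about its centre, and the uniform scale that would squeeze the rotated page back into the original size.

```python
import math

def rotated_bbox(width, height, degrees):
    """Return (w, h) of the upright box enclosing a width x height
    rectangle rotated by `degrees` about its centre."""
    t = math.radians(degrees)
    c, s = abs(math.cos(t)), abs(math.sin(t))
    return width * c + height * s, width * s + height * c

def fit_scale(width, height, degrees):
    """Uniform scale that makes the rotated rectangle fit back
    inside the original width x height box."""
    bw, bh = rotated_bbox(width, height, degrees)
    return min(width / bw, height / bh)

# A4-ish page from the example above, tilted by 2.5 degrees
w, h = rotated_bbox(595, 841, 2.5)
print(round(w, 1), round(h, 1))            # roughly 631.1 by 866.2
print(round(fit_scale(595, 841, 2.5), 3))  # a little under 1
```

If the slight shrink matters to you, one option might be to create the receiver page with the larger width and height from `rotated_bbox` instead of 595 x 841.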
NOTE: other commands such as morph can provide similar functions. But some are primarily for page text and others for drawing components. This is the best of my attempts without diving deeper into separated functions.
P.S. Of course, in addition to retaining the vectors, it can work with scanned images, the test image, etc.
Here is the image version of the supplied upright source image, now output as a rotated image PDF. So images, text and vectors will be retained in the rotated output. However (as with other libraries' methods), it may not transform and transfer all metadata, as the page has been regenerated. You may still have to add other transferred items.
As I said, all libraries using this method have potential limitations, and there are some to be aware of.
A bonus is that this may remove JavaScript exploits; I have tested that on some, but not all!
Using a fine rotate will "ignore" any source rotations, so you may need to use a landscape page for import, then rotate all pages or write detection for odd sources.
You may thus also find this method removes annotations, bookmarks and links, as other libraries do: the page locations are no longer valid, so they are usually "cleaned out" unless you wrangle them per page!
To use for cleaning / de-rotating:
import pymupdf # NOT fitz
doc = pymupdf.open() # Set up receiver doc
src = pymupdf.open("input.pdf") # Imported pdf
for i in range(len(src)): # Loop pages
    page = doc.new_page() # A4 upright by default, alter if required (using 0.0 will un-rotate a source)
    # Use show as a type of sanitizer to import only page cores from src, i=(Page#base0) orient[rot]ated 0.0 degrees CCW
    page.show_pdf_page(page.rect, src, i, rotate=0.0)
# Garbage clearing then core compression
doc.save("output.pdf", garbage=3, deflate=True)