I have PDF files that contain line diagrams. I want to extract each line diagram from the PDF and save it as its own SVG file. My issue is that each page also has a scale drawn in the margin, and I want to remove that.
Currently I am using:
import fitz # PyMuPDF
doc = fitz.open(pdf_path)
page = doc[0] # Get the first page (index 0)
svg_content = page.get_svg_image(matrix=fitz.Matrix(1, 1))
This converts the page to SVG and works well, but I don't need the full PDF canvas. I want to crop the output to the diagram itself and save it to a path I choose.
Here is the complete code I have so far:
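import fitz  # PyMuPDF

def extract_cropped_svg(pdf_path, output_svg_path, page_number=0):
    doc = fitz.open(pdf_path)
    page = doc[page_number]
    # Bounding rectangles of all vector drawings on the page
    drawings = page.get_drawings()
    bounds = [d["rect"] for d in drawings if d["rect"].is_valid]
    if not bounds:
        print("No vector drawings found")
        return
    # Combine bounding boxes
    crop_rect = fitz.Rect(bounds[0])
    for r in bounds[1:]:
        crop_rect |= r
    # Add 2 pt of padding, then clamp to the page, because set_cropbox()
    # rejects a rectangle that is not inside the MediaBox (this assumes an
    # unrotated page whose CropBox equals its MediaBox)
    crop_rect = (crop_rect + (-2, -2, 2, 2)) & page.rect
    page.set_cropbox(crop_rect)
    svg = page.get_svg_image(matrix=fitz.Matrix(1, 1))
    with open(output_svg_path, "w", encoding="utf-8") as f:
        f.write(svg)
    print(f"SVG saved to {output_svg_path}")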
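Since the margin scale is itself made of vector drawings, the union of all drawing rectangles still includes it. One idea I am considering is to skip any rectangle that reaches into a margin band before combining. The sketch below shows only that bounding-box arithmetic in plain Python, so it can be checked without a PDF. The function name `combined_crop_box`, the `margin` and `pad` parameters, and the assumption that the scale sits entirely inside a margin band of known width are mine, not something PyMuPDF provides, and I have not verified the approach on my real files.

```python
from typing import Iterable, Optional, Tuple

# (x0, y0, x1, y1) with the origin at the top-left, as in PyMuPDF page coordinates
Box = Tuple[float, float, float, float]


def combined_crop_box(
    boxes: Iterable[Box],
    page: Box,
    margin: float = 0.0,
    pad: float = 2.0,
) -> Optional[Box]:
    """Union the boxes that lie fully inside the page shrunk by `margin`,
    grow the result by `pad`, and clamp it to the page.

    Hypothetical helper: `margin` is an assumed width of the band that
    holds the scale; it has to be measured or guessed per document.
    """
    px0, py0, px1, py1 = page
    # Inner area: the page shrunk by the assumed margin band on every side.
    ix0, iy0, ix1, iy1 = px0 + margin, py0 + margin, px1 - margin, py1 - margin

    # Keep only boxes that do not touch the margin band.
    kept = [
        (x0, y0, x1, y1)
        for x0, y0, x1, y1 in boxes
        if x0 >= ix0 and y0 >= iy0 and x1 <= ix1 and y1 <= iy1
    ]
    if not kept:
        return None

    # Union of the remaining boxes.
    ux0 = min(b[0] for b in kept)
    uy0 = min(b[1] for b in kept)
    ux1 = max(b[2] for b in kept)
    uy1 = max(b[3] for b in kept)

    # Pad, then clamp so the crop box never leaves the page.
    return (
        max(px0, ux0 - pad),
        max(py0, uy0 - pad),
        min(px1, ux1 + pad),
        min(py1, uy1 + pad),
    )


if __name__ == "__main__":
    page_box = (0.0, 0.0, 600.0, 800.0)
    drawing_boxes = [
        (100.0, 100.0, 200.0, 200.0),  # part of the diagram
        (150.0, 250.0, 400.0, 500.0),  # part of the diagram
        (5.0, 5.0, 595.0, 20.0),       # a ruler-like strip in the top margin
    ]
    print(combined_crop_box(drawing_boxes, page_box, margin=30.0))
```

In the function above, the boxes would presumably come from `tuple(d["rect"])` for each entry of `page.get_drawings()` and the page box from `tuple(page.rect)`, with the result wrapped in `fitz.Rect(...)` before calling `page.set_cropbox()`. Note that this only changes the visible area; a scale that overlaps the diagram's own bounding box would still appear in the SVG.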