java pdfbox batik

PDFBox - Saving a page as SVG format


I'm trying to save each page of a given PDF document as an SVG file. Information on the topic seems scarce compared with SVG-to-PDF conversion, which is well covered.

Is there a simple way to do this using PDFBox, Batik or any combination of standard library features?


Solution

  • Here's some code that saves every page of the PDF as a separate SVG file. I probably got it from here or from Batik itself.

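    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.nio.charset.StandardCharsets;
    import org.apache.batik.dom.GenericDOMImplementation;
    import org.apache.batik.svggen.SVGGeneratorContext;
    import org.apache.batik.svggen.SVGGraphics2D;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.rendering.PDFRenderer;
    import org.w3c.dom.DOMImplementation;
    import org.w3c.dom.Document;

    // dir (a File) and PDFFILE (a String) are defined elsewhere
    try (PDDocument pdfboxDocument = PDDocument.load(new File(dir, PDFFILE)))
    {
        PDFRenderer r = new PDFRenderer(pdfboxDocument);
        for (int i = 0; i < pdfboxDocument.getNumberOfPages(); ++i)
        {
            // each page gets its own empty SVG DOM document
            String svgNS = "http://www.w3.org/2000/svg";
            DOMImplementation impl = GenericDOMImplementation.getDOMImplementation();
            Document myFactory = impl.createDocument(svgNS, "svg", null);
            SVGGeneratorContext ctx = SVGGeneratorContext.createDefault(myFactory);
            ctx.setEmbeddedFontsOn(true);
            // second argument: true = draw text as shapes
            SVGGraphics2D g2d = new SVGGraphics2D(ctx, true);
            r.renderPageToGraphics(i, g2d);
            String filename = "test-" + (i + 1) + ".svg";
            try (Writer out = new OutputStreamWriter(new FileOutputStream(new File(dir, filename)), StandardCharsets.UTF_8))
            {
                // second argument: true = write styling as CSS
                g2d.stream(out, true);
            }
        }
    }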
    

    pom.xml excerpt (batik.version is a Maven property you need to define yourself, set to the Batik release you use):

    <dependencies>
        <dependency>
            <groupId>org.apache.xmlgraphics</groupId>
            <artifactId>batik-svggen</artifactId>
            <version>${batik.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.xmlgraphics</groupId>
            <artifactId>batik-codec</artifactId>
            <version>${batik.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.xmlgraphics</groupId>
            <artifactId>batik-dom</artifactId>
            <version>${batik.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.pdfbox</groupId>
            <artifactId>pdfbox-app</artifactId>
            <version>2.0.22</version>
        </dependency>
        <dependency>
            <groupId>com.github.jai-imageio</groupId>
            <artifactId>jai-imageio-core</artifactId>
            <version>1.4.0</version>
        </dependency>
        <dependency>
            <groupId>com.github.jai-imageio</groupId>
            <artifactId>jai-imageio-jpeg2000</artifactId>
            <version>1.4.0</version>
        </dependency>
    </dependencies>
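
The loop above names its output test-1.svg, test-2.svg and so on, which sorts badly once a document has ten or more pages (test-10.svg lands before test-2.svg). A minimal sketch of a zero-padded alternative follows. The class SvgPageNames and its method forPage are hypothetical names made up for this illustration; they are not part of PDFBox or Batik, and the sketch uses only the standard library.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of PDFBox or Batik.
final class SvgPageNames {
    private SvgPageNames() {}

    // Builds a file name such as "test-03.svg" from a zero-based page index.
    // The page number is padded to the digit count of pageCount so that
    // the files sort in page order.
    static String forPage(String prefix, int pageIndex, int pageCount) {
        int width = String.valueOf(pageCount).length();
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%s-%0" + width + "d.svg", prefix, pageIndex + 1);
    }
}
```

In the loop, the filename line could then read `String filename = SvgPageNames.forPage("test", i, pdfboxDocument.getNumberOfPages());`, assuming the helper is on the classpath.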