I'm trying to save each page of a given PDF document as an SVG file. Information on this seems scarce compared with the reverse direction, SVG to PDF, which is well covered.
Is there a simple way to do this using PDFBox, Batik, or any combination of standard library features?
Here's some code that saves every page of the PDF as a separate SVG file. I probably got it from here or from Batik itself.
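    // Imports required by the snippet below
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import org.apache.batik.dom.GenericDOMImplementation;
    import org.apache.batik.svggen.SVGGeneratorContext;
    import org.apache.batik.svggen.SVGGraphics2D;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.rendering.PDFRenderer;
    import org.w3c.dom.DOMImplementation;
    import org.w3c.dom.Document;

    try (PDDocument pdfboxDocument = PDDocument.load(new File(dir, PDFFILE)))
    {
        PDFRenderer r = new PDFRenderer(pdfboxDocument);
        for (int i = 0; i < pdfboxDocument.getNumberOfPages(); ++i)
        {
            // Each page gets its own empty SVG DOM document
            String svgNS = "http://www.w3.org/2000/svg";
            DOMImplementation impl = GenericDOMImplementation.getDOMImplementation();
            Document myFactory = impl.createDocument(svgNS, "svg", null);
            SVGGeneratorContext ctx = SVGGeneratorContext.createDefault(myFactory);
            ctx.setEmbeddedFontsOn(true);
            // Second argument: draw text as shapes rather than as text elements
            SVGGraphics2D g2d = new SVGGraphics2D(ctx, true);
            r.renderPageToGraphics(i, g2d);
            String filename = "test-" + (i + 1) + ".svg";
            try (Writer out = new OutputStreamWriter(new FileOutputStream(new File(dir, filename)), "UTF-8"))
            {
                // Second argument: use CSS style attributes in the output
                g2d.stream(out, true);
            }
        }
    }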
pom.xml excerpt:
    <dependencies>
        <dependency>
            <groupId>org.apache.xmlgraphics</groupId>
            <artifactId>batik-svggen</artifactId>
            <version>${batik.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.xmlgraphics</groupId>
            <artifactId>batik-codec</artifactId>
            <version>${batik.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.xmlgraphics</groupId>
            <artifactId>batik-dom</artifactId>
            <version>${batik.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.pdfbox</groupId>
            <artifactId>pdfbox-app</artifactId>
            <version>2.0.22</version>
        </dependency>
        <dependency>
            <groupId>com.github.jai-imageio</groupId>
            <artifactId>jai-imageio-core</artifactId>
            <version>1.4.0</version>
        </dependency>
        <dependency>
            <groupId>com.github.jai-imageio</groupId>
            <artifactId>jai-imageio-jpeg2000</artifactId>
            <version>1.4.0</version>
        </dependency>
    </dependencies>
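To sanity-check the files the loop writes, something like the following might help. It is only a sketch that uses the JDK's built-in XML parser, not anything from PDFBox or Batik: it reports whether a file is well-formed XML with an svg root element in the SVG namespace. The class name SvgCheck and the method isSvg are made up for illustration, and the parser feature that skips loading the external DTD is assumed to be supported by the XML parser on the classpath (it is a Xerces feature, which the JDK's default parser understands).

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of PDFBox or Batik.
final class SvgCheck {
    private static final String SVG_NS = "http://www.w3.org/2000/svg";

    // Returns true if the file parses as XML and its root is <svg> in the SVG namespace.
    static boolean isSvg(File file) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Assumption: a Xerces-based parser. Avoids fetching the DTD named in a DOCTYPE.
            factory.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(file);
            Element root = doc.getDocumentElement();
            return "svg".equals(root.getLocalName())
                && SVG_NS.equals(root.getNamespaceURI());
        } catch (Exception e) {
            // Unreadable file, malformed XML, or unsupported parser feature
            return false;
        }
    }

    public static void main(String[] args) {
        // Usage: java SvgCheck test-1.svg test-2.svg ...
        for (String name : args) {
            System.out.println(name + ": " + (isSvg(new File(name)) ? "looks like SVG" : "not SVG"));
        }
    }
}
```

This only confirms that the output is structurally an SVG document. It says nothing about whether the page was rendered faithfully, so opening a few files in a browser is still worthwhile.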