java pdfbox

PDFBox 3: rendering a page to an image from a large PDF file throws IllegalArgumentException: capacity < 0


I'm using PDFBox 3.0.3 to render a page to an image from a huge PDF file (500 MB to 1 GB).

This is the code I use to render the page:

PDDocument document = Loader.loadPDF(pdfFile, IOUtils.createTempFileOnlyStreamCache());
PDFRenderer pdfRenderer = new PDFRenderer(document);
pdfRenderer.setSubsamplingAllowed(true);
BufferedImage image = pdfRenderer.renderImage(0, scale, ImageType.RGB);
ImageIO.write(image, "png", imageFile);

While debugging, I noticed that the problem comes from the decode method of Filter. When the length is roughly between 524,288,000 and 1,048,576,000, the value passed to RandomAccessReadWriteBuffer becomes negative, presumably because length << 2 overflows int:

randomAccessWriteBuffer = new RandomAccessReadWriteBuffer(
    Math.min(length << 2, RandomAccessReadBuffer.DEFAULT_CHUNK_SIZE_4KB));

The resulting exception:

o.a.p.contentstream.PDFStreamEngine - java.lang.IllegalArgumentException: capacity < 0: (-75475220 < 0)
java.io.IOException: java.lang.IllegalArgumentException: capacity < 0: (-75475220 < 0)
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.getRGBImage(SampledImageReader.java:223)
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.getImage(PDImageXObject.java:477)
    at org.apache.pdfbox.rendering.PageDrawer.drawImage(PageDrawer.java:1103)
    at org.apache.pdfbox.contentstream.operator.graphics.DrawObject.process(DrawObject.java:74)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:893)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:531)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:506)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:153)
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:286)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:330)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:247)
Caused by: java.lang.IllegalArgumentException: capacity < 0: (-75475220 < 0)
    at java.base/java.nio.Buffer.createCapacityException(Buffer.java:290)
    at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:390)
    at org.apache.pdfbox.io.RandomAccessReadBuffer.<init>(RandomAccessReadBuffer.java:70)
    at org.apache.pdfbox.io.RandomAccessReadWriteBuffer.<init>(RandomAccessReadWriteBuffer.java:40)
    at org.apache.pdfbox.filter.Filter.decode(Filter.java:250)
    at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:73)
    at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:172)
    at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:193)
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.createInputStream(PDImageXObject.java:895)
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.from8bit(SampledImageReader.java:469)
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.getRGBImage(SampledImageReader.java:217)
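
The negative capacity is consistent with 32-bit overflow: shifting an int left by 2 multiplies it by 4, and the result wraps around once it no longer fits in an int. Math.min then returns that negative value instead of the 4 KB cap. The minimal sketch below reproduces the arithmetic on plain ints, without PDFBox. The class name, the constant CHUNK_SIZE_4KB (assumed to equal RandomAccessReadBuffer.DEFAULT_CHUNK_SIZE_4KB, i.e. 4096) and the sample length are assumptions for illustration; the sample length was picked so that the result matches the logged value and is not taken from the actual file.

```java
class ShiftOverflowDemo {
    // Assumption: mirrors RandomAccessReadBuffer.DEFAULT_CHUNK_SIZE_4KB (4096 bytes).
    static final int CHUNK_SIZE_4KB = 4096;

    // Same arithmetic as the expression quoted from Filter.decode, on plain ints.
    static int initialCapacity(int length) {
        return Math.min(length << 2, CHUNK_SIZE_4KB);
    }

    public static void main(String[] args) {
        int small = 1_000;          // 1_000 << 2 = 4_000, below the 4 KB cap
        int large = 1_054_873_019;  // hypothetical length of about 1 GB

        System.out.println(initialCapacity(small)); // 4000
        System.out.println(initialCapacity(large)); // -75475220 (wrapped around)
    }
}
```

Any int length from 536,870,912 (2^29) up to 1,073,741,823 (2^30 - 1) produces a negative result here, which roughly matches the range described above.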

Is this a bug in the library? Is there any way to work around it? Thanks.


Solution

  • This has been fixed in PDFBOX-5908 and will be in PDFBox 3.0.4. A snapshot build is available here; please test it just to be sure. Thank you for reporting this.
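
For illustration only, this is one way such a capacity could be computed without wrapping around. It is not the PDFBOX-5908 patch, whose details are not shown here; the class name, the constant CHUNK_SIZE_4KB (assumed to be 4096) and the fallback for non-positive lengths are all assumptions of this sketch.

```java
class SafeCapacityDemo {
    // Assumption: same value as RandomAccessReadBuffer.DEFAULT_CHUNK_SIZE_4KB.
    static final int CHUNK_SIZE_4KB = 4096;

    // Widen to long before multiplying so the product cannot wrap,
    // then clamp to the chunk size. The result always fits in an int.
    static int initialCapacity(int length) {
        long wanted = (long) length * 4L;
        if (wanted <= 0) {
            // Hypothetical fallback for an unknown or invalid length.
            return CHUNK_SIZE_4KB;
        }
        return (int) Math.min(wanted, (long) CHUNK_SIZE_4KB);
    }

    public static void main(String[] args) {
        System.out.println(initialCapacity(1_000));         // 4000
        System.out.println(initialCapacity(1_054_873_019)); // 4096
    }
}
```

Because the comparison happens on long values, a length of about 1 GB yields the 4 KB cap rather than a negative number.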