java pdf apache-commons-io

Apache Commons IO download only first PDF page


I'm using Java with Apache Commons IO to download a PDF, but I only want the first page. Is there a way to do that?

Here's the piece of code that gets the whole doc:

public void getPDF(String route) throws IOException {
    URL url = new URL(route);
    File file = new File("file.pdf");
    FileUtils.copyURLToFile(url, file);
}

Solution

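  • Continuing from your code, you can use Apache PDFBox to create a new PDDocument that holds only the first page of the downloaded PDF. Note that the whole file is still downloaded; only the saved copy is trimmed to one page.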

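    URL url = new URL(route);
    File file = new File("file.pdf");
    FileUtils.copyURLToFile(url, file);    // still downloads the whole PDF

    int pageNumberToRead = 0;              // pages are zero-indexed

    PDDocument pdDoc = PDDocument.load(file);
    PDDocument document = new PDDocument();
    try {
        // getAllPages() is the PDFBox 1.x API
        document.addPage((PDPage) pdDoc.getDocumentCatalog().getAllPages().get(pageNumberToRead));
        document.save("basepath/first_page.pdf");
    } catch (Exception e) {
        // don't swallow the failure silently
        throw new IOException("Could not extract the first page", e);
    } finally {
        document.close();
        pdDoc.close();                     // close the source only after saving
    }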
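
If you are on PDFBox 2.x, getAllPages() is no longer available, so the same idea might look roughly like the sketch below. It assumes PDFBox 2.x is on the classpath; the class name FirstPageExtractor and the method extractFirstPage are made up for illustration and are not part of any library.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;

public class FirstPageExtractor {

    /**
     * Copies page 0 of the source PDF into a new single-page PDF at target.
     * Hypothetical helper; assumes the PDFBox 2.x API.
     */
    public static void extractFirstPage(File source, File target) throws IOException {
        // try-with-resources closes both documents, even if saving fails;
        // the source stays open until the new document has been saved
        try (PDDocument full = PDDocument.load(source);
             PDDocument firstOnly = new PDDocument()) {
            // importPage copies the page into the new document
            firstOnly.importPage(full.getPage(0));
            firstOnly.save(target);
        }
    }
}
```

After the FileUtils.copyURLToFile(url, file) call you would then do something like FirstPageExtractor.extractFirstPage(file, new File("first_page.pdf")). If the PDF has no pages, getPage(0) throws an IndexOutOfBoundsException, so you may want to check getNumberOfPages() first.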