[SOLVED] How to use Puppeteer to download PDF files from a website?

I've been trying to use Puppeteer to download PDF files from a specific website, but how do I get it to download all of the files? For example:

One file on the website is at example.com/Contents/xxx-1.pdf
A second file on the website is at example.com/Contents/xxx-2.pdf

How can I use Puppeteer to download the files automatically by incrementing the number in the URL?

Solution

I've written a function that takes two arguments: a function that maps an index to the URL of the PDF to download, and a count that limits the number of downloads. It then tries to download the PDF at each index in turn.

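const puppeteer = require('puppeteer');

// Build each URL from its index. page.goto() needs a full URL including the protocol.
downloadFiles((i) => `https://example.com/Contents/xxx-${i}.pdf`, 20);

// url: function mapping an index to a URL; count: highest index to try.
async function downloadFiles(url, count) {
    const browser = await puppeteer.launch({
        // page.pdf() has historically been supported only in headless mode.
        headless: true,
        args: ['--no-sandbox', '--disable-setuid-sandbox']
    });
    try {
        const page = await browser.newPage();
        // The files in the question are numbered from 1, so start there.
        for (let i = 1; i <= count; i++) {
            const pageUrl = url(i);
            try {
                await page.goto(pageUrl);
                // Note: page.pdf() prints the rendered page to a new PDF;
                // it does not save the original file's bytes.
                await page.pdf({
                    path: `pdf-${i}.pdf`,
                    format: 'A4',
                    printBackground: true
                });
            } catch (e) {
                console.log(`Error loading ${pageUrl}: ${e.message}`);
            }
        }
    } finally {
        // Close the browser even if something unexpected throws.
        await browser.close();
    }
}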
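
One thing to keep in mind: page.pdf() prints whatever the browser renders, so the saved file is a re-rendered copy rather than the original PDF. If you want the original bytes, a different technique is to skip the browser and fetch each URL directly. Below is a minimal sketch of that, not a Puppeteer solution. It assumes Node 18+ (for the built-in fetch) and that the files can be fetched without a login or cookies. The saveFiles name, the outDir parameter and the fetchImpl parameter are all made up for this example.

```javascript
const fs = require('fs/promises');
const path = require('path');

// Hypothetical helper: saves the raw bytes served at each URL.
// urlFor:    function mapping an index (starting at 1) to a URL
// count:     highest index to try
// outDir:    directory to write into (assumed to exist already)
// fetchImpl: fetch-compatible function; defaults to the global fetch in Node 18+
async function saveFiles(urlFor, count, outDir = '.', fetchImpl = fetch) {
    const saved = [];
    for (let i = 1; i <= count; i++) {
        const fileUrl = urlFor(i);
        try {
            const res = await fetchImpl(fileUrl);
            if (!res.ok) {
                // e.g. a 404 for a number that does not exist
                console.log(`Skipping ${fileUrl}: HTTP ${res.status}`);
                continue;
            }
            const bytes = Buffer.from(await res.arrayBuffer());
            const target = path.join(outDir, `pdf-${i}.pdf`);
            await fs.writeFile(target, bytes);
            saved.push(target);
        } catch (e) {
            console.log(`Error downloading ${fileUrl}: ${e.message}`);
        }
    }
    // Paths of the files that were actually written
    return saved;
}
```

You would call it the same way as the function above, for example saveFiles((i) => `https://example.com/Contents/xxx-${i}.pdf`, 20). Numbers that come back with an error status are skipped instead of stopping the whole run. If the site needs a logged-in session, this sketch will not be enough on its own.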