puppeteernetlify-function

"Could not find Chromium" when using puppeteer in a Netlify lambda function


Note: this app runs okay in the dev environment; it's only when deploying do I see this error. This is a recent error, not seen before.

The original full error text was:

ERROR! Error: Could not find Chromium (rev. 1108766). This can occur if either 1. you did not perform an installation before running the script (e.g. npm install) or 2. your cache path is incorrectly configured (which is: /home/sbx_user1051/.cache/puppeteer). For (2), check out our guide on configuring puppeteer at Configuration | Puppeteer.

A dump of __dirname and cwd in the function call showed:

 __dirname: '/var/task/netlify/functions/pgnfen',
  cwd: '/var/task'

So I added the environment variable PUPPETEER_CACHE_DIR, added a .cache/puppeteer folder to the lambda function folder, and the error is still there, but with the "correct" folder:

Could not find Chromium (rev. 1108766). This can occur if either 1. you did not perform an installation before running the script (e.g. npm install) or 2. your cache path is incorrectly configured (which is: /var/task/netlify/functions/pgnfen/.cache/puppeteer). For (2), check out our guide on configuring puppeteer at https://pptr.dev/guides/configuration.

BTWI also turned off caching:

  const browser = await puppeteer.launch()
  const page = await browser.newPage()

  await page.setCacheEnabled(false)

So now I am thoroughly stuck.


Solution

  • A little bit late here, but I was trying to do something similar, and got it working using @Sparticuz/chromium and puppeteer-core.

    The docs show how to migrate from chrome-aws-lambda, which does not run on Netlify anymore.


    So here's a working example, which takes a ?url=https://example.com param, loads the content in headless chromium, takes a screenshot, and returns it as base64 encoded data. I deployed it to Netlify, and seems to work great.

    const puppeteer = require('puppeteer-core');
    const chromium = require('@sparticuz/chromium');
    
    exports.handler = async (event, context, callback) => {
      let browser = null;
      let targetUrl = event.queryStringParameters.url;
    
      if (!targetUrl) {
        callback(null, {
          statusCode: 400,
          body: JSON.stringify({ error: 'URL is missing from queryStringParameters' }),
        });
        return;
      }
    
      if (!targetUrl.startsWith('http://') && !targetUrl.startsWith('https://')) {
        targetUrl = 'http://' + targetUrl;
      }
    
      try {
        new URL(targetUrl);
      } catch (error) {
        callback(null, {
          statusCode: 400,
          body: JSON.stringify({ error: 'URL provided is invalid' }),
        });
        return;
      }
    
      try {
        browser = await puppeteer.launch({
          args: chromium.args,
          defaultViewport: { width: 800, height: 600 },
          executablePath: await chromium.executablePath(),
          headless: chromium.headless,
          ignoreHTTPSErrors: true,
        });
    
        let page = await browser.newPage();
    
        // Emulate dark theme
        await page.emulateMediaFeatures([{ name: 'prefers-color-scheme', value: 'dark' }]);
    
        // Set navigation timeout
        page.setDefaultNavigationTimeout(8000);
    
        await page.goto(targetUrl, { waitUntil: 'domcontentloaded' });
    
        // Ensure the page has some minimal interactivity before taking the screenshot.
        await page.evaluate(() => {
          const selector = 'body';
          return new Promise((resolve, reject) => {
            const element = document.querySelector(selector);
            if (!element) {
              reject(new Error(`Error: No element found with selector: ${selector}`));
            }
            resolve();
          });
        });
    
        const screenshotBuffer = await page.screenshot();
    
        const base64Screenshot = screenshotBuffer.toString('base64');
    
        const response = {
          statusCode: 200,
          body: JSON.stringify({ image: base64Screenshot }),
        };
    
        callback(null, response);
      } catch (error) {
        callback(null, {
          statusCode: 500,
          body: JSON.stringify({ error: `An error occurred: ${error.message}` }),
        });
      } finally {
        if (browser !== null) {
          await browser.close();
        }
      }
    };
    

    Only thing to note, is when I was running this locally (with netlify dev), it was unable to find the path to my Chromium executable, so as a temporary work-around, I switched the executablePath: '/usr/bin/chromium' (for Arch Linux) during dev - I'm sure there's better way though.

    I also couldn't get it to respond with the normal resolve methods, which is why I defaulted back to the older callback approach using the third param in the handler. Again, I'm sure there's a better way to do this.