I'm trying to create a synthetic monitor that will serve as a kind of heartbeat. The tricky part is that I want it to check authenticated pages, so I have a script that logs in using an alerting account and verifies that the page is OK.
Following the Synthetic Monitors documentation, we see:
A synthetic monitor periodically executes a single-purpose 2nd gen Cloud Function that is deployed on Cloud Run. When you create the synthetic monitor, you define the Cloud Function, which must be written in Node.js ...
Then the docs continue and give some examples.
Taking the Puppeteer example, we see this:
const {
  instantiateAutoInstrumentation,
  runSyntheticHandler
} = require('@google-cloud/synthetics-sdk-api');
// Run instantiateAutoInstrumentation before any other code runs, to get automatic logs and traces
instantiateAutoInstrumentation();
const functions = require('@google-cloud/functions-framework');
const axios = require('axios');
const assert = require('node:assert');
const puppeteer = require('puppeteer');
functions.http('CustomPuppeteerSynthetic', runSyntheticHandler(async ({logger, executionId}) => {
  // Launch a headless Chrome browser and open a new page
  const browser = await puppeteer.launch({ headless: 'new', timeout: 0});
  const page = await browser.newPage();
  // Navigate to the target URL
  const result = await page.goto('https://www.example.com', {waitUntil: 'load'});
  // Confirm successful navigation
  await assert.equal(result.status(), 200);
  // Print the page title to the console
  const title = await page.title();
  logger.info(`My Page title: ${title} ` + executionId);
  // Close the browser
  await browser.close();
}));
So, in theory, this should be enough to run Puppeteer inside a 2nd gen Cloud Function. But when testing, this error happens:
PRODUCTION APP: Health check failed - Could not find Chromium (rev. 1108766). This can occur if either
1. you did not perform an installation before running the script (e.g. npm install) or
2. your cache path is incorrectly configured (which is: /root/.cache/puppeteer).
For (2), check out our guide on configuring puppeteer at https://pptr.dev/guides/configuration.
Checking Puppeteer docs, I see this:
The Node.js runtime of Google Cloud Functions comes with all system packages needed to run Headless Chrome.
To use puppeteer, specify the module as a dependency in your package.json and then override the puppeteer cache directory by including a file named .puppeteerrc.cjs at the root of your application with the contents:
const {join} = require('path');
/**
* @type {import("puppeteer").Configuration}
*/
module.exports = {
cacheDirectory: join(__dirname, 'node_modules', '.puppeteer_cache'),
};
[!NOTE] Google Cloud Functions caches your node_modules between builds. Specifying the puppeteer cache as subdirectory of node_modules mitigates an issue in which the puppeteer install process does not run when the cache is hit.
But I receive the same error after adding the .puppeteerrc.cjs file.
But if we stop and think a bit, we remember that 2nd gen Cloud Functions are deployed on Cloud Run. So, following Puppeteer docs for Cloud Run, we see:
The default Node.js runtime of Google Cloud Run does not come with the system packages needed to run Headless Chrome. You will need to set up your own Dockerfile and include the missing dependencies.
So, according to the Puppeteer docs for Linux, we should create a Dockerfile and install the listed dependencies.
I did that, and here's my Dockerfile as an example:
# Use the official Node.js 20 image as a parent image
FROM node:20-slim
# Set working directory
WORKDIR /app
# Install necessary tools and libraries for Puppeteer and Chromium
RUN apt-get update && apt-get install -y \
ca-certificates \
fonts-liberation \
gnupg \
libasound2 \
libatk-bridge2.0-0 \
libatk1.0-0 \
libc6 \
libcairo2 \
libcups2 \
libdbus-1-3 \
libexpat1 \
libfontconfig1 \
libgbm1 \
libgcc1 \
libglib2.0-0 \
libgtk-3-0 \
libnspr4 \
libnss3 \
libpango-1.0-0 \
libpangocairo-1.0-0 \
libstdc++6 \
libx11-6 \
libx11-xcb1 \
libxcb1 \
libxcomposite1 \
libxcursor1 \
libxdamage1 \
libxext6 \
libxfixes3 \
libxi6 \
libxrandr2 \
libxrender1 \
libxss1 \
libxtst6 \
lsb-release \
procps \
wget \
xdg-utils \
--no-install-recommends \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Install Chromium
RUN apt-get update \
&& apt-get install -y chromium \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Copy package.json and package-lock.json (if available)
COPY package*.json ./
# Install npm dependencies
RUN npm install
# Copy the rest of the application's source code
COPY . .
# Run the app
CMD ["node", "index.js"]
But the error persists. I also tried to use other versions of Puppeteer without success. I also tried to check where Chromium is being installed:
...
const { exec } = require('child_process');
...
exec('chromium --version', (err, stdout, stderr) => {
  if (err) {
    // If an error occurs, log it (this could indicate Chromium is not installed or path is incorrect)
    logger.error(`Error checking Chromium version: ${err.message}`);
    return;
  }
  // Log the Chromium version to console
  logger.info(`Chromium version: ${stdout}`);
});
so I could force Puppeteer to use it this way:
const browser = await puppeteer.launch({
  executablePath: '/usr/bin/chromium',
  args: ['--no-sandbox', '--disable-setuid-sandbox'],
  headless: 'new',
  timeout: 0
});
But I get the error:
"Error checking Chromium version: Command failed: chromium --version\n/bin/sh: 1: chromium: not found"
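A variant of this check that does not depend on the shell's PATH would be to probe a few likely locations directly. This is only a sketch: findExecutable is a hypothetical helper, and the candidate paths are my guesses for a Debian-based image, not paths confirmed by any documentation.

```javascript
const fs = require('node:fs');

// Candidate locations are guesses for a Debian-based image, not confirmed paths.
const CANDIDATES = [
  '/usr/bin/chromium',
  '/usr/bin/chromium-browser',
  '/usr/bin/google-chrome',
];

// Hypothetical helper: return the first candidate that exists and is
// executable, or null if none of them is.
function findExecutable(candidates) {
  for (const candidate of candidates) {
    try {
      fs.accessSync(candidate, fs.constants.X_OK);
      return candidate;
    } catch {
      // Not present or not executable; try the next candidate.
    }
  }
  return null;
}

console.log(`Chromium executable: ${findExecutable(CANDIDATES) ?? 'not found'}`);
```

If this logs "not found" as well, the binary is missing from the running container, which would suggest that the Dockerfile is not the image the function actually runs in.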
I've tried many other things that I don't remember, but nothing gets Chromium installed. Any idea how to make it work?
I discovered the following about Synthetic Monitor and related functions.
With that in mind, here are the specifics of this use case. To use Puppeteer in Cloud Functions, we need to add the .puppeteerrc.cjs file as described in the Puppeteer docs:
const { join } = require('path');
/**
 * @type {import("puppeteer").Configuration}
 */
module.exports = {
  cacheDirectory: join(__dirname, '.cache', 'puppeteer'),
};
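To confirm at runtime that a browser really was downloaded into that directory, something like the following can be logged from inside the function. This is only a sketch: inspectCache is a hypothetical helper name, and the base directory is passed in explicitly (in .puppeteerrc.cjs it is __dirname). The folder layout inside the cache depends on the Puppeteer version, so the helper just lists the top-level entries.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: report whether the Puppeteer cache directory exists
// under baseDir and which top-level entries it contains.
function inspectCache(baseDir) {
  // Same path expression as in .puppeteerrc.cjs, with baseDir in place of __dirname.
  const cacheDir = path.join(baseDir, '.cache', 'puppeteer');
  if (!fs.existsSync(cacheDir)) {
    return { cacheDir, exists: false, entries: [] };
  }
  return { cacheDir, exists: true, entries: fs.readdirSync(cacheDir).sort() };
}

// Assumption: the function is started from the directory that holds .puppeteerrc.cjs.
console.log(JSON.stringify(inspectCache(process.cwd())));
```

An empty or missing directory would mean the install step did not run during the build.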
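We also need to add the magic gcp-build script to package.json. It is called during the function build, not at runtime. As far as I know it is invoked by the Cloud Functions build process (the Node.js buildpack) rather than by functions-framework itself, and it is what makes Puppeteer download the browser.
"scripts": {
  "gcp-build": "node node_modules/puppeteer/install.mjs"
}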
So we can deploy a function that runs Puppeteer on Node 20 with this package.json:
{
  "main": "index.js",
  "scripts": {
    "gcp-build": "node node_modules/puppeteer/install.mjs"
  },
  "dependencies": {
    "@google-cloud/functions-framework": "^3.1.2",
    "@google-cloud/synthetics-sdk-api": "^0.4.1",
    "puppeteer": "^21.3.6"
  }
}
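Since a failed build is slow to debug, it can help to sanity-check package.json locally before deploying. The following is a minimal sketch, not part of any Google or Puppeteer tooling: checkPackageJson is a hypothetical helper that only checks the two things this setup relies on. Note that JSON.parse is strict, so a stray trailing comma is reported as invalid JSON.

```javascript
// Hypothetical helper: return a list of problems found in the text of a
// package.json file; an empty list means the checks passed.
function checkPackageJson(text) {
  let pkg;
  try {
    pkg = JSON.parse(text);
  } catch (err) {
    return [`invalid JSON: ${err.message}`];
  }
  const problems = [];
  if (!pkg.scripts || !pkg.scripts['gcp-build']) {
    problems.push('missing "gcp-build" script');
  }
  if (!pkg.dependencies || !pkg.dependencies.puppeteer) {
    problems.push('missing "puppeteer" dependency');
  }
  return problems;
}

// Example: a manifest that declares puppeteer but has no gcp-build script.
const sample = JSON.stringify({ dependencies: { puppeteer: '^21.3.6' } });
console.log(checkPackageJson(sample)); // one entry: missing "gcp-build" script
```

In practice the text would come from reading the real package.json with fs.readFileSync.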
Lastly and MOST IMPORTANT, at this moment (27/02/2024) the testing UI simply DOES NOT WORK due to the missing dependencies as described in the question. Every time I tried to run the function it failed complaining about a dependency that is missing.
But if you just go ahead and deploy the function, it works as expected.