Tags: web-scraping, puppeteer, google-chrome-headless, headless-browser

Puppeteer Uncatchable Target Closed Error


I am having problems using Puppeteer. Roughly once every 10 to 50 requests, I get the following error:

TargetCloseError: Protocol error (Network.getCookies): Target closed
at CallbackRegistry.clear (/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/Connection.js:153:36)
at CDPSessionImpl._onClosed (/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/Connection.js:468:70)
at Connection.onMessage (/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/Connection.js:265:25)
at WebSocket.<anonymous> (/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/NodeWebSocketTransport.js:62:32)
at callListener (/app/node_modules/ws/lib/event-target.js:290:14)
at WebSocket.onMessage (/app/node_modules/ws/lib/event-target.js:209:9)
at WebSocket.emit (node:events:513:28)
at Receiver.receiverOnMessage (/app/node_modules/ws/lib/websocket.js:1184:20)
at Receiver.emit (node:events:513:28)
at Receiver.dataMessage (/app/node_modules/ws/lib/receiver.js:541:14)
at Receiver.getData (/app/node_modules/ws/lib/receiver.js:459:17)
at Receiver.startLoop (/app/node_modules/ws/lib/receiver.js:158:22)
at Receiver._write (/app/node_modules/ws/lib/receiver.js:84:10)
at writeOrBuffer (node:internal/streams/writable:392:12)
at _write (node:internal/streams/writable:333:10)
at Writable.write (node:internal/streams/writable:337:10) 

The biggest problem is that the error cannot be caught, so it crashes my whole app.

I use await before every Puppeteer call. Here is all of my code that involves Puppeteer:

let browser = null;
try {
    browser = await puppeteer.launch({
        headless: "new",
        args: [
            `--user-agent=${userAgent}`,
            '--no-sandbox'
        ]
    });
    browser.on('disconnected', () => browserDisconnected = true)
    const page = await browser.newPage();

    await useProxy(page, proxyUrl);
    await page.goto(url);
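    // Works only with the promise-based setTimeout from 'node:timers/promises';
    // the global setTimeout expects a callback as its first argument.
    await setTimeout(pageLoadTimeout);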
    await page.waitForSelector("#Title");

    const renderedContent = await page.content();
    const $ = cheerio.load(renderedContent);

    const title = $("#Title").html();

    output = { title: title };
} catch (error) {
    console.log(error);
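    // The original object used the key "error" twice, so the second value
    // silently overwrote the first; the message now has its own key.
    output = { error: true, message: error.message };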
} finally {
    try {
        if(browser !== null) {
            await browser.close();
        }
    } catch(error) {
        console.log("ERROR CLOSING BROWSER:");
        console.log(error);
    }
    return output;
}   

I've stripped out everything that doesn't involve Puppeteer.

I have no idea how to handle this error, and I've read pretty much every Stack Overflow and GitHub issue on the topic. It would already help if I could catch the error so that it doesn't crash my whole server and force a restart.
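One plausible reason an error can escape a try/catch even when every call is awaited is that the rejection comes from a promise started inside an event listener, which no caller ever awaits. The sketch below illustrates that mechanism with a plain EventEmitter; `fakePage` and its listener are hypothetical stand-ins, not Puppeteer code, and whether this is the actual cause here is an assumption. It also shows the process-level `unhandledRejection` hook, which can serve as a last-resort safety net so the process is not terminated.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical stand-in for a Puppeteer page that emits 'request' events.
const fakePage = new EventEmitter();

// Hypothetical stand-in for a library-internal listener. Because it is async,
// its throw becomes a rejected promise that no caller ever awaits.
fakePage.on('request', async () => {
  throw new Error('Protocol error (Network.getCookies): Target closed');
});

const seen = { byTryCatch: false, byProcessHandler: false };

// Last-resort hook: with a listener registered, Node reports the rejection
// here instead of terminating the process (the default since Node 15).
process.on('unhandledRejection', (reason) => {
  seen.byProcessHandler = true;
  const message = reason instanceof Error ? reason.message : String(reason);
  console.error('Unhandled rejection:', message);
});

try {
  // emit() returns synchronously; the listener's rejection surfaces later.
  fakePage.emit('request');
} catch (error) {
  seen.byTryCatch = true; // not reached for an async listener
}

// Inspect the flags once the event loop has had a chance to report.
setImmediate(() => console.log(seen));
```

A handler like this keeps the server alive but does not fix the underlying failure, so it is best treated as a place to log and clean up rather than as the solution.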

Where am I running the server? In a Docker container built with --platform=linux/amd64 node:18, deployed on Google App Engine.

Any help would be much appreciated!


Solution

  • Thanks to a hint in the right direction by @Yaroslavm, I've now found the cause and solution for this particular problem.

    The issue was with await useProxy(page, proxyUrl); from the puppeteer-page-proxy package. The error occurred internally within the package, which is why my own try/catch never saw it.

    Luckily there is another way to use a proxy with Puppeteer.

    I've found the following solution (https://pixeljets.com/blog/how-to-set-proxy-in-puppeteer/):

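    const proxyChain = require('proxy-chain');

    // Upstream proxy URL, including credentials.
    const oldProxyUrl = `http://${proxy.username}:${proxy.password}@${proxy.address}:${proxy.port}`;
    // proxy-chain starts a local proxy that forwards to the upstream one.
    const newProxyUrl = await proxyChain.anonymizeProxy(oldProxyUrl);

    browser = await puppeteer.launch({
        headless: true, // boolean, not the string "true"
        args: [
            `--user-agent=${userAgent}`,
            '--no-sandbox',
            `--proxy-server=${newProxyUrl}`
        ]
    });

    const page = await browser.newPage();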
    

    Done!

    The link above describes more options, which I have yet to test. I hope this helps!
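Two details the snippet above leaves open are credentials containing characters such as '@' or ':', and shutting down the local proxy that anonymizeProxy starts. The following is a minimal sketch of both, not a tested production setup. The helper names `buildProxyUrl` and `withAnonymizedProxy` are hypothetical, `proxyChain` is passed in as a parameter (for example the result of require('proxy-chain')), and the sketch assumes proxy-chain decodes percent-encoded credentials in the upstream URL, which is worth verifying against the version in use.

```javascript
// Build the upstream proxy URL with percent-encoded credentials, so that
// characters such as '@' or ':' in a password cannot break the URL.
function buildProxyUrl({ username, password, address, port }) {
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `http://${user}:${pass}@${address}:${port}`;
}

// Run `task` against a local anonymized proxy and always release it afterwards.
// `proxyChain` is injected, e.g. require('proxy-chain').
async function withAnonymizedProxy(proxyChain, proxy, task) {
  const localProxyUrl = await proxyChain.anonymizeProxy(buildProxyUrl(proxy));
  try {
    return await task(localProxyUrl);
  } finally {
    // The second argument asks proxy-chain to close open connections as well.
    await proxyChain.closeAnonymizedProxy(localProxyUrl, true);
  }
}
```

With this shape, the puppeteer.launch call and the scraping logic would go inside `task`, which receives the local proxy URL for the --proxy-server argument. In a server that handles many requests, this avoids leaving one local proxy server open per request.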