I'm using Puppeteer and JS to write a web scraper. The site I'm scraping is pretty intense, so I need to use a local Chrome instance plus a residential proxy service to get it working. Here's my basic setup:
import { spawn } from "node:child_process";
import puppeteer from "puppeteer";

// chromePath, PORT, userDataDir, and proxyUrl are defined earlier in the script.
// Launch a local Chrome with remote debugging enabled, pointed at the proxy.
const chromeProcess = spawn(chromePath, [
  `--remote-debugging-port=${PORT}`,
  `--user-data-dir=${userDataDir}`,
  `--proxy-server=${proxyUrl}`,
  "--no-first-run",
  "--no-default-browser-check",
  "--disable-extensions",
  "--start-maximized",
  "--disable-features=IsolateOrigins,site-per-process",
], { stdio: "ignore" });

// Attach Puppeteer to the already-running instance rather than launching its own.
const browser = await puppeteer.connect({ browserURL: `http://127.0.0.1:${PORT}` });
const page = await browser.newPage();
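One gotcha with this setup, separate from the proxy errors: puppeteer.connect can race the Chrome launch, since the DevTools port may not be listening yet when connect fires. A small retry loop against Chrome's /json/version endpoint is a minimal way to paper over that (this sketch assumes Node 18+ for the global fetch), called right before puppeteer.connect:

// Poll the DevTools endpoint until Chrome answers, so connect() doesn't race the launch.
async function waitForDevTools(port, retries = 50) {
  for (let i = 0; i < retries; i++) {
    try {
      const res = await fetch(`http://127.0.0.1:${port}/json/version`);
      if (res.ok) return;
    } catch {
      // Chrome isn't listening yet -- keep polling.
    }
    await new Promise((resolve) => setTimeout(resolve, 100));
  }
  throw new Error(`DevTools port ${port} never came up`);
}

await waitForDevTools(PORT);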
Trying to get the proxy service working, however, I've been hitting a multitude of errors (like net::ERR_NO_SUPPORTED_PROXIES) where the page won't load, or shows a "page not found" error in the browser. I've tried tunneling through mitmproxy with no luck, so I'm just not sure what's possible at this point.
Does anyone have any insight into using proxies with a local Chrome instance? Is this even possible?
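For what it's worth, one thing I've learned about the error itself: Chrome apparently won't accept user:pass credentials embedded in the --proxy-server URL, which seems to be exactly what net::ERR_NO_SUPPORTED_PROXIES is complaining about. The commonly suggested workaround is Puppeteer's page.authenticate(), which answers the proxy's auth challenge per page. A minimal sketch, assuming standard username/password proxy credentials (PROXY_USER/PROXY_PASS are hypothetical env var names, not anything from my setup above):

// Keep credentials out of --proxy-server (host:port only) and let Puppeteer
// answer the proxy's 407 auth challenge instead.
await page.authenticate({
  username: process.env.PROXY_USER, // hypothetical env vars holding proxy credentials
  password: process.env.PROXY_PASS,
});
await page.goto("https://example.com");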
Another possible solution I've found, using OxyLabs, is to just whitelist your IP rather than using a user/pass to authenticate at all. You can also set up a system-wide proxy if you're on a Windows machine to avoid having to authenticate through Chrome directly.
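If you go the whitelist route, the detail that matters is that the value handed to --proxy-server becomes a bare scheme://host:port with no credentials in it, which Chrome accepts. A rough sketch (the endpoint below is a placeholder, not something from their docs):

// With this machine's IP whitelisted at the provider, the proxy needs no auth,
// so a credential-free --proxy-server value works on its own.
const proxyUrl = "http://pr.oxylabs.io:7777"; // placeholder endpoint -- use your provider's
// ...then spawn Chrome exactly as in the setup above, with `--proxy-server=${proxyUrl}`.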