
pyppeteer-install behind proxy


I'm behind a corporate proxy.

I can get pip working by running set https_proxy=http://myproxy:port, so I can install pyppeteer.

But whatever I try, I can't get pyppeteer to download Chromium. I run pyppeteer-install, and it just says it is downloading Chromium, but nothing ever gets put in the %appdata% pyppeteer location. Is there any way to fix this, other than downloading Chromium manually and putting it in the correct spot?


Solution

  • This is based on pyppeteer's chromium_downloader module, but uses urllib3.ProxyManager instead of urllib3.PoolManager.

    from io import BytesIO

    import urllib3
    from tqdm import tqdm
    from pyppeteer import chromium_downloader


    def download_zip(url: str) -> BytesIO:
        """Download data from url through the proxy."""
        print('Starting Chromium download. Download may take a few minutes.')

        with urllib3.ProxyManager(proxy_url='http://proxy-ip:port') as http:
            # preload_content=False defers reading the body so it can be streamed below.
            r = http.request('GET', url, preload_content=False)
            if r.status >= 400:
                raise OSError(f'Chromium downloadable not found at {url}: Received {r.data.decode()}.\n')

            _data = BytesIO()
            try:
                total_length = int(r.headers['content-length'])
            except (KeyError, ValueError, AttributeError):
                # Size unknown: the progress bar then shows no total.
                total_length = 0

            # Stream the body in 10 KiB (10 * 1024 byte) chunks.
            process_bar = tqdm(total=total_length, unit_scale=True, unit='b')
            for chunk in r.stream(10240):
                _data.write(chunk)
                process_bar.update(len(chunk))
            process_bar.close()

        print('Chromium download done.')
        return _data


    def download_chromium() -> None:
        """Download and extract chromium."""
        chromium_downloader.extract_zip(
            download_zip(chromium_downloader.get_url()),
            chromium_downloader.DOWNLOADS_FOLDER / chromium_downloader.REVISION,
        )
        print(f'Chrome executable path: {chromium_downloader.chromium_executable()}')


    Then simply call download_chromium() in your program.

    Note: Don't forget to replace http://proxy-ip:port with your corporate proxy.
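
If you would rather not hardcode the proxy, one option is to reuse the https_proxy variable already set for pip. The helper below is only a sketch: proxy_url_from_env is a hypothetical name, not part of pyppeteer or urllib3, and it assumes the variable holds a plain host:port or a full URL.

```python
import os
from typing import Mapping, Optional


def proxy_url_from_env(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first proxy URL found in the usual environment variables.

    Hypothetical helper for illustration; pass a mapping to override os.environ.
    """
    env = os.environ if env is None else env
    # Check both spellings: Windows ignores case, POSIX systems do not.
    for name in ('https_proxy', 'HTTPS_PROXY', 'http_proxy', 'HTTP_PROXY'):
        value = env.get(name, '').strip()
        if value:
            # Assumption: a value without a scheme is an HTTP proxy.
            return value if '://' in value else f'http://{value}'
    return None
```

The result could then be passed as proxy_url to urllib3.ProxyManager in download_zip in place of the literal string. If it returns None, no proxy variable is set, and you would need to fall back to a fixed address.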