Tags: python, selenium, web-scraping, proxy, request-timed-out

Python / Selenium, if 'This site can’t be reached' then do something else


I have a script that tests proxies. A proxy that doesn't work shows the page "This site can’t be reached". When that happens, I want the code to continue and try again with a different proxy, but Selenium stays stuck on the "This site can’t be reached" page.

I tried a try/except clause, but Selenium still stays on the error page. I also tried except TimeoutError, but it is still stuck.

Code:

from selenium import webdriver
from selenium.webdriver.common.proxy import Proxy, ProxyType
import time

proxy_ip_port = 'ip:port'

proxy = Proxy()
proxy.proxy_type = ProxyType.MANUAL
proxy.http_proxy = proxy_ip_port
proxy.ssl_proxy = proxy_ip_port

capabilities = webdriver.DesiredCapabilities.CHROME
proxy.add_to_capabilities(capabilities)

# replace 'C:/chromedriver.exe' with the absolute path to your chromedriver executable
driver = webdriver.Chrome('C:/chromedriver.exe', desired_capabilities=capabilities)

try:
    driver.get('https://www.tiktok.com/signup/phone-or-email/email')
except TimeoutError:
    print("Proxy: ", proxy_ip_port, "invalid")

Output:

C:\Users\fkahd\PycharmProjects\tkinter\venv\Scripts\python.exe C:/Users/fkahd/PycharmProjects/tkinter/proxy2.py
C:\Users\fkahd\PycharmProjects\tkinter\proxy2.py:17: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
  driver = webdriver.Chrome('C:/chromedriver.exe', desired_capabilities=capabilities)

Process finished with exit code 0

[Screenshot: Chrome's "This site can’t be reached" error page]

Thanks for your help!


Solution

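  • You can set a page load timeout on the driver and catch Selenium's TimeoutException. Note that Python's built-in TimeoutError is a different class, so "except TimeoutError" never matches an exception raised by Selenium.

    I was able to catch the failure this way (using a proxy address that does not work, for the test):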

    from selenium import webdriver
    from selenium.webdriver.common.proxy import Proxy, ProxyType
    from selenium.common.exceptions import TimeoutException

    proxy_ip_port = '10.27.0.0:80'

    proxy = Proxy()
    proxy.proxy_type = ProxyType.MANUAL
    proxy.http_proxy = proxy_ip_port
    proxy.ssl_proxy = proxy_ip_port

    capabilities = webdriver.DesiredCapabilities.CHROME
    proxy.add_to_capabilities(capabilities)

    # pass the capabilities to the driver, otherwise the proxy is never applied
    driver = webdriver.Chrome(desired_capabilities=capabilities)
    # raise TimeoutException if the page has not loaded within 5 seconds
    driver.set_page_load_timeout(5)
    try:
        driver.get('https://www.tiktok.com/signup/phone-or-email/email')
    except TimeoutException:
        print("Proxy: ", proxy_ip_port, "invalid")
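
To cover the "try again with a different proxy" part of the question, the check can be wrapped in a function and called in a loop. The following is a minimal sketch, not a definitive implementation. The helper names first_working_proxy and selenium_check are hypothetical, and it assumes a Selenium version that accepts webdriver.Chrome(options=...), using Chrome's --proxy-server switch instead of DesiredCapabilities. It catches WebDriverException, the base class of TimeoutException, on the assumption that some proxy failures may surface as other WebDriver errors rather than as a timeout.

```python
from typing import Callable, Iterable, Optional


def first_working_proxy(proxies: Iterable[str],
                        check: Callable[[str], bool]) -> Optional[str]:
    """Return the first proxy for which check(proxy) is true, or None."""
    for proxy in proxies:
        if check(proxy):
            return proxy
        print("Proxy:", proxy, "invalid, trying the next one")
    return None


def selenium_check(proxy_ip_port: str,
                   url: str = 'https://www.tiktok.com/signup/phone-or-email/email',
                   timeout: int = 5) -> bool:
    """Hypothetical check: True if url loads through the proxy within timeout seconds."""
    # Imported here so first_working_proxy can be used without Selenium installed.
    from selenium import webdriver
    from selenium.common.exceptions import WebDriverException

    options = webdriver.ChromeOptions()
    # Assumption: the proxy is configured through Chrome's command-line switch.
    options.add_argument('--proxy-server=' + proxy_ip_port)
    driver = webdriver.Chrome(options=options)
    driver.set_page_load_timeout(timeout)
    try:
        driver.get(url)
        return True
    except WebDriverException:
        # Covers TimeoutException and other WebDriver errors.
        return False
    finally:
        # Always close the browser so failed proxies do not leave windows open.
        driver.quit()
```

It could then be called as first_working_proxy(['10.27.0.0:80', 'ip:port'], selenium_check), which returns the first proxy that loaded the page or None if none did. Because Chrome may render its error page without Selenium raising anything, a stricter check might also inspect the loaded page before returning True.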