I have made a proxy checker in Python with Selenium, so that every time it opens the Selenium browser it uses a different proxy. But not all the proxies work, and if a proxy is slow I'm stuck loading the page forever. Checking for a key string in the page source doesn't work either, because the page never finishes loading. Is there a way in Python to say: if the page has not fully loaded within 10 seconds, go to the next proxy? Thanks in advance!
My code so far:
    # PROXY SETUP FOR THIS PROGRAM
    def fly_setup(fly_url):
        fly_options = webdriver.ChromeOptions()
        fly_options.add_experimental_option("prefs", {
            "profile.default_content_setting_values.notifications": 1
        })
        with open("proxies.txt") as fly_proxies:
            lines = fly_proxies.readlines()
        counter = 0
        for proxy in lines:
            fly_options.add_argument('--proxy-server=%s' % proxy.rstrip())
            ad_chrome = webdriver.Chrome(options=fly_options)
            ad_chrome.get(fly_url)
            ad_source = ad_chrome.page_source
            key = 'Vind ik leuk'
            time.sleep(10)
            if ad_chrome.set_page_load_timeout(10):
                print("Page load took to long.. Going to next proxy ")
            else:
                if key not in ad_source:
                    print("Proxy not working! Going to next one ...")
                    ad_chrome.quit()
                    time.sleep(3)
                else:
                    time.sleep(10)
                    ad_chrome.find_element_by_xpath('//*[@id="skip_bu2tton"]').click()
                    counter += 1
                    print("Total views : " + str(counter))
                    print("")
                    ad_chrome.quit()
                    time.sleep(3)
You can set a timeout limit using set_page_load_timeout, like this:
    driver.set_page_load_timeout(10)
If the page does not load within 10 seconds, get() raises a TimeoutException (see the Selenium documentation). Catch it and then switch to your next proxy. Note that set_page_load_timeout returns None, so it cannot be used as an if condition; it has to be called before get().
In your code, assuming lines contains all the proxies, you can do something like this:
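    from selenium.common.exceptions import TimeoutException

    for proxy in lines:
        fly_options.add_argument('--proxy-server=%s' % proxy.rstrip())
        ad_chrome = webdriver.Chrome(options=fly_options)
        # Set the limit before get(), so that get() raises on a slow load.
        ad_chrome.set_page_load_timeout(10)
        try:
            ad_chrome.get(fly_url)
        except TimeoutException:
            # Proxy too slow: close this browser and try the next proxy.
            ad_chrome.quit()
            continue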
This solution doesn't always work, especially when the page loads data using AJAX calls. In that case, rely on Selenium's explicit waits: wait for an element that is only present or clickable once the whole page has finished loading. The idea is the same: catch TimeoutException and continue your loop.