How can I bypass the Google CAPTCHA using Selenium and Python?
When I try to scrape something, Google give me a CAPTCHA. Can I bypass the Google CAPTCHA with Selenium Python?
As an example, it's Google reCAPTCHA. You can see this CAPTCHA via this link: https://www.google.com/recaptcha/api2/demo
To start with using Selenium's Python clients, you should avoid solving/bypass Google CAPTCHA.
Selenium automates browsers. Now, what you want to achieve with that power is entirely up to individuals, but primarily it is for automating web applications through browser clients for testing purposes and of coarse it is certainly not limited to that.
On the other hand, CAPTCHA (the acronym being ...Completely Automated Public Turing test to tell Computers and Humans Apart...) is a type of challenge–response test used in computing to determine if the user is human.
So, Selenium and CAPTCHA serves two completely different purposes and ideally shouldn't be used to achieve any interrelated tasks.
Having said that, reCAPTCHA can easily detect the network traffic and identify your program as a Selenium driven bot.
However, there are some generic approaches to avoid getting detected while web scraping:
time.sleep(secs)
. Here you can find a detailed discussion on How to sleep Selenium WebDriver in Python for millisecondsHowever, in a couple of use cases we were able to interact with the reCAPTCHA using Selenium and you can find more details in the following discussions:
You can find a couple of related discussion in: