Is there a way to make your Selenium script undetectable in Python using geckodriver?
I'm using Selenium for scraping. Are there any protections we need to use so websites can't detect Selenium?
Whether Selenium-driven Firefox / GeckoDriver gets detected doesn't depend on any specific GeckoDriver or Firefox version. Websites themselves can inspect the network traffic and identify the browser client, i.e. the web browser, as WebDriver-controlled.
As per the documentation of the WebDriver Interface in the latest editor's draft of WebDriver - W3C Living Document, the webdriver-active flag, which is initially set to false, is set to true when the user agent is under remote control, i.e. when controlled through Selenium.
Note that the NavigatorAutomationInformation interface should not be exposed on WorkerNavigator.
So, webdriver returns true if the webdriver-active flag is set, false otherwise, whereas navigator.webdriver defines a standard way for co-operating user agents to inform the document that it is controlled by WebDriver, for example so that alternate code paths can be triggered during automation.
So, the bottom line is: Selenium identifies itself.
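To see this from Python, you can read navigator.webdriver in the page that Selenium is driving. The following is only a minimal sketch: the helper name is_webdriver_flag_set and the demo function are hypothetical, and it assumes Firefox and geckodriver are installed and discoverable locally.

```python
def is_webdriver_flag_set(driver) -> bool:
    """Return True if the current page reports navigator.webdriver as true."""
    # execute_script runs JavaScript in the current page and returns the result.
    return bool(driver.execute_script("return navigator.webdriver"))


def demo() -> None:
    """Hypothetical usage; needs Firefox and geckodriver available locally."""
    # Imported here so the helper above has no hard dependency on Selenium.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get("about:blank")
        print("navigator.webdriver:", is_webdriver_flag_set(driver))
    finally:
        # Always close the browser, even if the check raises.
        driver.quit()
```

Calling demo() opens a Firefox window, prints what the page sees, and closes the browser again. This is the same property a website's own JavaScript can read.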
However, some generic approaches to avoid getting detected while web scraping are as follows:
Add waits between actions using time.sleep(secs). Here you can find a detailed discussion on How to sleep webdriver in python for milliseconds
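As a rough illustration of the time.sleep(secs) approach, the sketch below pauses for a randomized interval between actions. The randomization, the function names (random_delay, pause_between_actions) and the default bounds are assumptions for illustration, not something a particular site is known to require.

```python
import random
import time


def random_delay(min_secs: float, max_secs: float) -> float:
    """Pick a delay in seconds, uniformly between min_secs and max_secs."""
    if min_secs < 0 or max_secs < min_secs:
        raise ValueError("expected 0 <= min_secs <= max_secs")
    return random.uniform(min_secs, max_secs)


def pause_between_actions(min_secs: float = 1.0, max_secs: float = 3.0) -> float:
    """Sleep for a randomized interval and return how long was requested."""
    delay = random_delay(min_secs, max_secs)
    # time.sleep accepts fractional seconds, so sub-second waits work too.
    time.sleep(delay)
    return delay
```

You would call pause_between_actions() between steps such as driver.get(...) and the next click. Note that this only changes the timing of requests; it does not change the value of navigator.webdriver.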