
How do I avoid Imperva bot detection?

I am running a Python script that scrapes a website. The site uses Imperva to detect automated scripts crawling its web pages, and Imperva blocks my IP as soon as I run the script. I read a suggestion to add time.sleep(random.randint(a, b)) to the script to mimic human behaviour, but it didn't work, or perhaps it just doesn't work as a standalone method. If it's the Chrome driver itself that is being detected, then I guess it would be impossible to avoid. Does anyone have any practical suggestions for things I could include in my script to bypass this? Thanks in advance.


  • Introduction

    Many different components need to be added to a web scraper to make it undetectable. I recommend using the code below to test your current level of detection:
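
    A minimal sketch of such a test, assuming Chrome and a matching chromedriver are installed. The helper names (read_fingerprint, red_flags, check_with_chrome) are hypothetical, and the properties checked are only examples of signals that detection scripts are commonly reported to inspect, not a list of what Imperva actually checks:

```python
# JavaScript evaluated in the page; these are standard browser properties.
FINGERPRINT_JS = """
return {
    webdriver: navigator.webdriver,
    userAgent: navigator.userAgent,
    languages: navigator.languages,
    pluginCount: navigator.plugins.length
};
"""


def read_fingerprint(driver):
    """Return a dict of browser properties read through the given driver."""
    return driver.execute_script(FINGERPRINT_JS)


def red_flags(fp):
    """List the properties in `fp` that look like an automated browser."""
    flags = []
    if fp.get("webdriver"):
        flags.append("navigator.webdriver is true")
    if "HeadlessChrome" in (fp.get("userAgent") or ""):
        flags.append("user agent contains HeadlessChrome")
    if not fp.get("languages"):
        flags.append("navigator.languages is empty")
    if fp.get("pluginCount", 0) == 0:
        flags.append("no browser plugins reported")
    return flags


def check_with_chrome():
    """Launch Chrome through Selenium and print any red flags found."""
    # Imported here so the helpers above work without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get("about:blank")
        for flag in red_flags(read_fingerprint(driver)):
            print("detected:", flag)
    finally:
        driver.quit()
```

    Calling check_with_chrome() before and after applying any stealth configuration shows which of these signals changed.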


    More than likely, you will fail most of those tests right off the bat. Fortunately, it's easy to configure a scraper that passes all of them and is much harder to detect.


    selenium-stealth is a Python package that helps Selenium avoid detection. Simply...

    pip install selenium-stealth

    and use the configuration below:

        from selenium_stealth import stealth

        # `driver` is an existing Selenium Chrome WebDriver instance
        stealth(driver,
            user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36',
            languages=["en-US", "en"],
            vendor="Google Inc.",
            webgl_vendor="Intel Inc.",
            renderer="Intel Iris OpenGL Engine",
        )
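
    To show where this configuration sits in a full session, here is a sketch that assumes Selenium 4 and selenium-stealth are installed and that chromedriver can be found. The names STEALTH_SETTINGS and build_stealth_driver are hypothetical, and the two Chrome options follow the selenium-stealth README rather than anything specific to Imperva:

```python
# The settings from the configuration, collected so they can be reused.
STEALTH_SETTINGS = {
    "user_agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36"
    ),
    "languages": ["en-US", "en"],
    "vendor": "Google Inc.",
    "webgl_vendor": "Intel Inc.",
    "renderer": "Intel Iris OpenGL Engine",
}


def build_stealth_driver(settings=STEALTH_SETTINGS):
    """Start Chrome and apply selenium-stealth with the given settings."""
    # Imported here so the settings can be inspected without Selenium installed.
    from selenium import webdriver
    from selenium_stealth import stealth

    options = webdriver.ChromeOptions()
    # Drop the automation switch and extension that Chrome adds by default.
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option("useAutomationExtension", False)

    driver = webdriver.Chrome(options=options)
    # stealth() must run before the first page load to take effect on it.
    stealth(driver, **settings)
    return driver
```

    Call build_stealth_driver() once per session, then use the returned driver as usual and call driver.quit() when finished.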

    Your web scraper should now pass all of the tests. Next, try this solution on the Imperva site.

    More information

    If you are still getting blocked, I recommend looking into the random-user-agent library to cycle the user agent passed as the "user_agent" argument of the selenium-stealth configuration. Otherwise, you could pay for a proxy provider to completely disguise your IP. Keep in mind, though, that proxy networks currently do not have a Selenium configuration.
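
    The user-agent cycling idea can be sketched with the standard library alone. This uses random.choice over a small hard-coded pool as a stand-in for the random-user-agent library, which would supply a larger, maintained list. The pool contents and the function name are hypothetical and only for illustration:

```python
import random

# Hypothetical pool for illustration; replace with a maintained source.
USER_AGENT_POOL = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36",
]


def stealth_settings_with_random_agent(rng=random):
    """Return selenium-stealth keyword arguments with a randomly picked user agent."""
    return {
        "user_agent": rng.choice(USER_AGENT_POOL),
        "languages": ["en-US", "en"],
        "vendor": "Google Inc.",
        "webgl_vendor": "Intel Inc.",
        "renderer": "Intel Iris OpenGL Engine",
    }
```

    With a driver already created, stealth(driver, **stealth_settings_with_random_agent()) applies a freshly picked user agent. Picking once per new driver session, rather than mid-session, keeps the user agent consistent across the requests of a single visit.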

    Information on Proxy Network Selenium Configuration: Python Selenium Proxy Network

    Information on Selenium Detectability in the Cloud: Python Selenium AWS Lambda Change WebGL Vendor/Renderer For Undetectable Headless Scraper