[SOLVED] How to crawl websites without getting blocked?

I crawl websites frequently, at a rate of hundreds of requests per hour.

How can I make the crawler's behavior more like a human's?
How can I stay off the radar of bot-detection systems?

I am currently crawling the site with Selenium and Chrome.

Kindly suggest.

Solution

Well, you will have to pause the script between loop iterations.

import time
time.sleep(1)  # pause for one second
# or, more generally, time.sleep(N) to pause for N seconds

So, it could hypothetically work like this.

import json
import time
from string import ascii_lowercase

import pandas as pd
import requests

alldata = []
for c in ascii_lowercase:
    # Query the station lookup once per letter of the alphabet
    response = requests.get('https://reservia.viarail.ca/GetStations.aspx?q=' + c)
    # Parse the JSON body into a DataFrame with the columns of interest
    df = pd.DataFrame(json.loads(response.text), columns=['sc', 'sn', 'pv'])  # etc.
    alldata.append(df)
    time.sleep(3)  # wait before sending the next request
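
A fixed three-second pause produces perfectly regular request timing, which is easy to spot. One common variation is to randomize the delay. The following is only a minimal sketch: the helper names jittered_delay and polite_sleep and the default values are made up for illustration, and nothing here guarantees that a given site will not block you.

```python
import random
import time


def jittered_delay(base=3.0, spread=1.5):
    """Return a delay in seconds drawn uniformly from [base - spread, base + spread].

    Hypothetical helper; base and spread are illustrative defaults.
    """
    return random.uniform(base - spread, base + spread)


def polite_sleep(base=3.0, spread=1.5, sleep=time.sleep):
    """Sleep for a randomized interval and return the delay that was used.

    The sleep argument exists only so the function can be tried out
    without actually waiting.
    """
    delay = max(0.0, jittered_delay(base, spread))  # never sleep a negative time
    sleep(delay)
    return delay
```

In the loop above, a call to polite_sleep() could take the place of time.sleep(3), so each pause lasts somewhere between 1.5 and 4.5 seconds.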

Or, look for an API to grab data from the URL you are targeting. You didn't post an actual URL, so it's impossible to say for sure if an API is exposed or not.