python, loops, web-scraping, postman

Looping over a URL to scrape data from variations of the URL


My goal is to feed the full range of latitudes and longitudes for Canada into the code below automatically and scrape the locations returned for each pair. I know Canada spans latitudes of 42°N to 83°N and longitudes of 53°W to 141°W. I understand how to scrape this type of data, but I have never had to loop over values inside a URL. I am afraid I will write a loop that does nothing except get me banned from the website, so any help would be great!

import requests

url = "https://www.circlek.com/stores_new.php?lat=43.6529&lng=-79.3849&services=&region=global"

payload={}
headers = {
  'Connection': 'keep-alive',
  'sec-ch-ua': '" Not;A Brand";v="99", "Google Chrome";v="91", "Chromium";v="91"',
  'Accept': '*/*',
  'X-Requested-With': 'XMLHttpRequest',
  'sec-ch-ua-mobile': '?0',
  'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.106 Safari/537.36',
  'Sec-Fetch-Site': 'same-origin',
  'Sec-Fetch-Mode': 'cors',
  'Sec-Fetch-Dest': 'empty',
  'Referer': 'https://www.circlek.com/store-locator?Canada&lat=43.6529&lng=-79.3849',
  'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
  'dnt': '1'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)

Solution

  • As you commented, you can structure your code like this. I am assuming your latitude and longitude values are stored in a list like the one below; if not, please share the range of lat_lng and the step between values.

    import requests

    # zip pairs values one-to-one, so this yields 40 (latitude, longitude) pairs
    lat_lng = [(lat, lng) for lat, lng in zip(range(43, 83), range(-141, -53))]

    # the headers never change, so build them once outside the loop
    headers = {
        'Connection': 'keep-alive',
        'sec-ch-ua': '" Not;A Brand";v="99", "Google Chrome";v="91", "Chromium";v="91"',
        'Accept': '*/*',
        'X-Requested-With': 'XMLHttpRequest',
        'sec-ch-ua-mobile': '?0',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.106 Safari/537.36',
        'Sec-Fetch-Site': 'same-origin',
        'Sec-Fetch-Mode': 'cors',
        'Sec-Fetch-Dest': 'empty',
        'Referer': 'https://www.circlek.com/store-locator?Canada&lat=43.6529&lng=-79.3849',
        'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
        'dnt': '1'
    }

    for latitude, longitude in lat_lng:
        # substitute the current pair into the query string
        url = f"https://www.circlek.com/stores_new.php?lat={latitude}&lng={longitude}&services=&region=global"
        response = requests.get(url, headers=headers)
        print(response.json())
    

    You can also wrap this in a function.
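A minimal sketch of what such a function could look like. The names store_url and fetch_stores are made up for illustration, and the session argument is assumed to be a requests.Session (or any object with a compatible get method), so the network call stays outside the helper:

```python
from urllib.parse import urlencode

BASE_URL = "https://www.circlek.com/stores_new.php"


def store_url(latitude, longitude):
    """Build the store-locator URL for one coordinate pair."""
    # same query parameters as the URL in the question
    query = urlencode({
        "lat": latitude,
        "lng": longitude,
        "services": "",
        "region": "global",
    })
    return f"{BASE_URL}?{query}"


def fetch_stores(session, latitude, longitude, headers=None):
    """Request one coordinate pair and return the decoded JSON.

    `session` is assumed to behave like requests.Session.
    """
    response = session.get(store_url(latitude, longitude), headers=headers, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return response.json()
```

With requests installed, a call would look something like fetch_stores(requests.Session(), 43.6529, -79.3849, headers=headers), reusing the headers dictionary from the loop above.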

    Regarding your comment about negative values: the range should be written like this, and it works:

    lat_lng = [(lat,long) for lat,long in zip(range(43,83),range(-141,-53))]
    
    #[(43, -141), (44, -140), (45, -139), (46, -138), (47, -137), (48, -136),.....]
    

    Notice in the output above that zip pairs values one-to-one: one latitude with one longitude. If you want one-to-many (every latitude combined with every longitude), see the itertools module, for example itertools.product.
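As a rough illustration of the one-to-many case, itertools.product pairs every latitude with every longitude. The bounds below are taken from the question (42°N to 83°N, 53°W to 141°W) in whole degrees; the step of one degree is only an assumption:

```python
from itertools import product

latitudes = range(42, 84)       # 42 .. 83 inclusive
longitudes = range(-141, -52)   # -141 .. -53 inclusive (west is negative)

# every latitude combined with every longitude
lat_lng = list(product(latitudes, longitudes))

print(len(lat_lng))   # 42 * 89 = 3738 pairs
print(lat_lng[:3])    # [(42, -141), (42, -140), (42, -139)]
```

Compare this with zip, which stops at the shorter range and only walks along a diagonal of the grid.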

    For finer resolution, I suggest looking at np.arange, which also works with floats, for example:

    np.arange(43,83,0.001)
    #array([43.   , 43.001, 43.002, ..., 82.997, 82.998, 82.999])
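One caution on step size, since you are worried about being banned: a step of 0.001 over the whole bounding box means roughly 41,000 latitudes times 88,000 longitudes, about 3.6 billion requests. A coarser grid with a pause between requests is more realistic. The sketch below uses a small plain-Python helper in place of np.arange so it runs without NumPy; axis, crawl, the 0.5 degree step and the one second delay are all assumptions for illustration, and the right step depends on how wide an area each response covers, which I do not know for this site.

```python
import time
from itertools import product


def axis(start, stop, step):
    """Evenly spaced values from start to stop, both ends included."""
    steps = round((stop - start) / step)
    # rounding avoids float noise such as 42.300000000000004
    return [round(start + i * step, 6) for i in range(steps + 1)]


def crawl(points, fetch, delay_seconds=1.0):
    """Call fetch(latitude, longitude) for each point, pausing after every call."""
    results = []
    for latitude, longitude in points:
        results.append(fetch(latitude, longitude))
        time.sleep(delay_seconds)  # throttle so the site is not hammered
    return results


latitudes = axis(42, 83, 0.5)       # 83 values
longitudes = axis(-141, -53, 0.5)   # 177 values
points = list(product(latitudes, longitudes))
print(len(points))  # 83 * 177 = 14691 requests at this step
```

Here fetch is whatever function performs one request for a coordinate pair, so you can test the loop with a dummy function before pointing it at the real site.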