
Why does the following curl command work but the equivalent Python code does not?


The following curl command works:

curl --location 'https://www.lowes.com/rnr/r/get-by-product/50413062/pdp/prod?sortMethod=SubmissionTime&sortDirection=desc&offset=0' \
--header 'authority: www.lowes.com' \
--header 'accept: application/json, text/plain, */*' \
--header 'accept-language: en-US,en;q=0.9,hi;q=0.8,de;q=0.7,ur;q=0.6,pa;q=0.5,es;q=0.4' \
--header 'dnt: 1' \
--header 'referer: https://www.lowes.com/pd/KitchenAid-25-8-cu-ft-5-Door-French-Door-Refrigerator-with-Ice-Maker-Stainless-Steel/50413062' \
--header 'sec-ch-ua: "Google Chrome";v="109", "Chromium";v="109", "Not=A?Brand";v="99"' \
--header 'sec-ch-ua-mobile: ?0' \
--header 'sec-ch-ua-platform: "macOS"' \
--header 'sec-fetch-dest: empty' \
--header 'sec-fetch-mode: cors' \
--header 'sec-fetch-site: same-origin' \
--header 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36' \
--header 'Cookie: EPID=OGFhNmEzMWItMTI5Mi00OTk5LWE0MDMtNDE0M2FjZWVhMGYw; _abck=F42771C150B4C119EC8B175032CA9585~-1~YAAQl24/Fxke7uWGAQAA2GM8+wnNceF82+OD7MYSY54sfGjY6sI9gk/3TA4PXZJET2G+qbMoKQblP1E6vLRIN9mNrI7hUixTfWSLLY3M6mtGEAJVhM4QZInOEUYep02CB7YspAAj33IAsz1CjJ6CTVGMkBHaD3xcYTERPO/ciOfY+BBBDxr+3P5xaqF3UYfPCKt7hKF9Mma2Ov/Q3adCanBW3jFQ3pSfaUZJkoJwqLdcydDMT0X/OUfPEBD5o6DlFFeLGNISFMr9GWt3LEI1WgtkQ8JuCEfaPx6wB0OGXt3CT3rSv3FJWOpqQdDkiEld74IHsPhXkxrfD1ujjX6Mhvioab5tcf49CdmtqD9ew+XOjNZGg+fvIx4/qCDzlcTJAQIjOvdiMhowip38mm9gkgCbhh808pb5y3J+mQ==~-1~-1~-1; bm_sz=1A8431CC8D5EB5603F27267A256FB757~YAAQl24/Fxoe7uWGAQAA2GM8+xNq6E5oqP5QkZ1IFVKNETajEDxDQPsGYZPLcy71Ye8/DDeGwZlryGa+D3p18VEw6uIkOsFnLSUbVX50hwSaGaj+KMNvfjWBiRvNSsrk+OFSTKGWpB6e0cbbctbqDFEw1YJDCK5GdAw3ZKSNBxaUnRLQFoFlFpdzBe26MQo6CkIeMx9mq3oNnaajwjEwSyOzbGclasWViYeL0nzNgiKXvfD+MEFOvbtwU/+9ymXhAJUiLUPBQdyCF6aVWspjx9SNaEGR+iuu7jBUjnDcldwGnQ==~3355461~4539959; dbidv2=8aa6a31b-1292-4999-a403-4143aceea0f0; akaalb_prod_dual=1679338684~op=PROD_GCP_EAST_CTRL_DFLT:PROD_DEFAULT_CTRL|~rv=31~m=PROD_DEFAULT_CTRL:0|~os=352fb8a62db4e37e16b221fb4cefd635~id=ece72cac85177577cc3b22d07a88c063; region=central'

But the following Python code, which does the same thing, does not work:

import requests

url = "https://www.lowes.com/rnr/r/get-by-product/50413062/pdp/prod?sortMethod=SubmissionTime&sortDirection=desc&offset=0"

payload={}
headers = {
  'authority': 'www.lowes.com',
  'accept': 'application/json, text/plain, */*',
  'accept-language': 'en-US,en;q=0.9,hi;q=0.8,de;q=0.7,ur;q=0.6,pa;q=0.5,es;q=0.4',
  'dnt': '1',
  'referer': 'https://www.lowes.com/pd/KitchenAid-25-8-cu-ft-5-Door-French-Door-Refrigerator-with-Ice-Maker-Stainless-Steel/50413062',
  'sec-ch-ua': '"Google Chrome";v="109", "Chromium";v="109", "Not=A?Brand";v="99"',
  'sec-ch-ua-mobile': '?0',
  'sec-ch-ua-platform': '"macOS"',
  'sec-fetch-dest': 'empty',
  'sec-fetch-mode': 'cors',
  'sec-fetch-site': 'same-origin',
  'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36',
  'Cookie': 'EPID=OGFhNmEzMWItMTI5Mi00OTk5LWE0MDMtNDE0M2FjZWVhMGYw; _abck=F42771C150B4C119EC8B175032CA9585~-1~YAAQl24/Fxke7uWGAQAA2GM8+wnNceF82+OD7MYSY54sfGjY6sI9gk/3TA4PXZJET2G+qbMoKQblP1E6vLRIN9mNrI7hUixTfWSLLY3M6mtGEAJVhM4QZInOEUYep02CB7YspAAj33IAsz1CjJ6CTVGMkBHaD3xcYTERPO/ciOfY+BBBDxr+3P5xaqF3UYfPCKt7hKF9Mma2Ov/Q3adCanBW3jFQ3pSfaUZJkoJwqLdcydDMT0X/OUfPEBD5o6DlFFeLGNISFMr9GWt3LEI1WgtkQ8JuCEfaPx6wB0OGXt3CT3rSv3FJWOpqQdDkiEld74IHsPhXkxrfD1ujjX6Mhvioab5tcf49CdmtqD9ew+XOjNZGg+fvIx4/qCDzlcTJAQIjOvdiMhowip38mm9gkgCbhh808pb5y3J+mQ==~-1~-1~-1; bm_sz=1A8431CC8D5EB5603F27267A256FB757~YAAQl24/Fxoe7uWGAQAA2GM8+xNq6E5oqP5QkZ1IFVKNETajEDxDQPsGYZPLcy71Ye8/DDeGwZlryGa+D3p18VEw6uIkOsFnLSUbVX50hwSaGaj+KMNvfjWBiRvNSsrk+OFSTKGWpB6e0cbbctbqDFEw1YJDCK5GdAw3ZKSNBxaUnRLQFoFlFpdzBe26MQo6CkIeMx9mq3oNnaajwjEwSyOzbGclasWViYeL0nzNgiKXvfD+MEFOvbtwU/+9ymXhAJUiLUPBQdyCF6aVWspjx9SNaEGR+iuu7jBUjnDcldwGnQ==~3355461~4539959; dbidv2=8aa6a31b-1292-4999-a403-4143aceea0f0; akaalb_prod_dual=1679338684~op=PROD_GCP_EAST_CTRL_DFLT:PROD_DEFAULT_CTRL|~rv=31~m=PROD_DEFAULT_CTRL:0|~os=352fb8a62db4e37e16b221fb4cefd635~id=ece72cac85177577cc3b22d07a88c063; region=central'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)

I get valid JSON in response when executing the curl command, but I get blocked when using the Python code. The Python code returns the following response:

'<HTML><HEAD>\n<TITLE>Access Denied</TITLE>\n</HEAD><BODY>\n<H1>Access Denied</H1>\n \nYou don\'t have permission to access "http&#58;&#47;&#47;www&#46;lowes&#46;com&#47;rnr&#47;r&#47;get&#45;by&#45;product&#47;5013699701&#47;pdp&#47;prod&#63;" on this server.<P>\nReference&#32;&#35;18&#46;b56e3f17&#46;1679223477&#46;1d4c5ef7\n</BODY>\n</HTML>\n'

What can I change in the Python code to get it working?

One more thing I want to add: the curl command above stops working if the Chrome version in the user-agent header is changed to 110 or 108.

HELP! Thanks!


Solution

  • I was able to fix this problem myself. As it turns out, lowes.com is served over TLS 1.3 (you can check this yourself at https://www.ssllabs.com/ssltest/analyze.html?d=www.lowes.com).

    So I forced the Python requests library to use TLS 1.3, and then I got a valid response.

    I used the following code to change the TLS version in requests:

    import requests
    import ssl
    from requests.adapters import HTTPAdapter
    from urllib3.poolmanager import PoolManager
    
    class CustomSSLAdapter(HTTPAdapter):
        """Transport adapter that only negotiates TLS 1.3."""
    
        # Colon-separated OpenSSL cipher string: TLS 1.3 suites, then ECDHE suites
        CIPHERS = (
            'TLS_AES_128_GCM_SHA256:'
            'TLS_AES_256_GCM_SHA384:'
            'TLS_CHACHA20_POLY1305_SHA256:'
            'ECDHE-RSA-AES128-GCM-SHA256:'
            'ECDHE-RSA-AES256-GCM-SHA384:'
            'ECDHE-RSA-CHACHA20-POLY1305'
        )
    
        def create_ssl_context(self):
            ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
            ctx.set_ciphers(self.CIPHERS)
            # Refuse anything older than TLS 1.3 (replaces the deprecated OP_NO_TLSv1* flags)
            ctx.minimum_version = ssl.TLSVersion.TLSv1_3
            return ctx
    
        def init_poolmanager(self, connections, maxsize, block=False, **pool_kwargs):
            # The custom context carries the TLS settings, so ssl_version is not needed
            self.poolmanager = PoolManager(
                num_pools=connections,
                maxsize=maxsize,
                block=block,
                ssl_context=self.create_ssl_context(),
                **pool_kwargs,
            )
    
    url = 'https://www.lowes.com/rnr/r/get-by-product/50413062/pdp/prod?sortMethod=SubmissionTime&sortDirection=desc&offset=0'
    headers = {
        'accept': 'application/json, text/plain, */*',
        'accept-language': 'en-US,en;q=0.9,hi;q=0.8,de;q=0.7,ur;q=0.6,pa;q=0.5,es;q=0.4',
        'dnt': '1',
        'referer': 'https://www.lowes.com/pd/KitchenAid-25-8-cu-ft-5-Door-French-Door-Refrigerator-with-Ice-Maker-Stainless-Steel/50413062',
        'sec-ch-ua': '"Google Chrome";v="109", "Chromium";v="109", "Not=A?Brand";v="99"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': '"macOS"',
        'sec-fetch-dest': 'empty',
        'sec-fetch-mode': 'cors',
        'sec-fetch-site': 'same-origin',
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'
    }
    
    session = requests.Session()
    session.mount('https://', CustomSSLAdapter())
    
    response = session.get(url, headers=headers)
    
    print(response.text)
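
If you want to confirm from Python which TLS version is actually negotiated, rather than relying on the SSL Labs report, something like the following sketch may help. It uses only the standard library; the helper names `make_tls13_context` and `negotiated_tls_version` are made up for this illustration and are not part of requests or urllib3.

```python
import socket
import ssl
from typing import Optional


def make_tls13_context() -> ssl.SSLContext:
    """Return a client context that refuses anything older than TLS 1.3."""
    # create_default_context() loads the system CA certificates and
    # enables hostname checking and certificate verification.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


def negotiated_tls_version(host: str, port: int = 443,
                           timeout: float = 10.0) -> Optional[str]:
    """Open a TLS connection to host and return the agreed protocol version."""
    ctx = make_tls13_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The handshake happens inside wrap_socket(); it raises ssl.SSLError
        # if the server cannot speak TLS 1.3.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Calling `negotiated_tls_version("www.lowes.com")` performs a real handshake, so it needs network access. Because the context sets TLS 1.3 as the minimum, a successful call means the server accepted TLS 1.3, and a failed handshake raises `ssl.SSLError` instead of silently falling back to an older version.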