python, ssl, python-requests, certificate, burp

Python requests over HTTPS: code 403 without a proxy but code 200 when using Burp Suite


I'm currently trying to scrape retailmenot.com. This is how my code looks so far:

import requests
from collections import OrderedDict

s = requests.session()

s.headers = OrderedDict()
s.headers["Connection"] = "close"
s.headers["Upgrade-Insecure-Requests"] = "1"
s.headers["User-Agent"] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36"
s.headers["Accept"] = "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
s.headers["Sec-Fetch-Site"] = "none"
s.headers["Sec-Fetch-Mode"] = "navigate"
s.headers["Sec-Fetch-Dest"] = "document"
s.headers["Accept-Encoding"] = "gzip, deflate"
s.headers["Accept-Language"] = "en-GB,en-US;q=0.9,en;q=0.8"

s.get("https://www.retailmenot.com/sitemap/A")

When I use this code I instantly get redirected to a Cloudflare page. However, whenever I pass my traffic through Burp Suite by replacing the last line of my code with this one:

s.get("https://www.retailmenot.com/sitemap/A", proxies = {"https":"https://127.0.0.1:8080"}, verify ="/Users/Downloads/cacert (1).pem")

I get straight to the website. I find this a bit strange and was wondering if anyone could explain why this is happening, and whether there's a way to get similar results using some different certificate (since in order to use the Burp Suite certificate I need to keep the app open). Many thanks in advance!


Solution

  • It looks like the problem is the underlying client-side TLS behavior.

    I have an older version of Python using OpenSSL 1.1.1b and a newer one using OpenSSL 1.1.1f. It fails with the first version but works with the second. This would also explain why it works through Burp: Burp's TLS stack behaves slightly differently.
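
    To check which case you are in, the standard ssl module shows which OpenSSL build your Python interpreter is linked against (just a quick sanity check, not part of the fix itself):

    import ssl

    # Prints the OpenSSL version string Python was linked against,
    # e.g. "OpenSSL 1.1.1f  31 Mar 2020".
    print(ssl.OPENSSL_VERSION)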

    I've tried to track the problem down: making the non-working version use the ciphers of the working version does not help. The other main difference is the set of supported signature algorithms, and indeed with the mentioned OpenSSL 1.1.1b (but also with newer versions shipped with Anaconda Python) the difference can be reduced to the sigalgs:

     $ openssl s_client -connect www.retailmenot.com:443 -crlf
     ...[various output]...
     <paste the expected HTTP request>
     ...
     HTTP/1.1 403 Forbidden
    
     $ openssl s_client -connect www.retailmenot.com:443 -crlf -sigalgs 'ECDSA+SHA256'
     ...[various output]...
     <paste the expected HTTP request>
     ...
     HTTP/1.1 200 OK
    

    Unfortunately I can see no way in Python requests to directly set the signature algorithms in the TLS stack. The API is not exposed, so requests simply uses the default and thus fails or succeeds depending on how OpenSSL was built.

    But it looks like it is possible to indirectly set the value by specifying a different security level:

    import requests
    from requests.adapters import HTTPAdapter
    from requests.packages.urllib3.util.ssl_ import create_urllib3_context

    # Changing the OpenSSL security level indirectly changes the signature
    # algorithms offered in the ClientHello.
    CIPHERS = 'DEFAULT:@SECLEVEL=2'

    class CipherAdapter(HTTPAdapter):
        def init_poolmanager(self, *args, **kwargs):
            # Build the SSL context with the changed security level.
            context = create_urllib3_context(ciphers=CIPHERS)
            kwargs['ssl_context'] = context
            return super(CipherAdapter, self).init_poolmanager(*args, **kwargs)

        def proxy_manager_for(self, *args, **kwargs):
            # Same thing when the request goes through a proxy.
            context = create_urllib3_context(ciphers=CIPHERS)
            kwargs['ssl_context'] = context
            return super(CipherAdapter, self).proxy_manager_for(*args, **kwargs)

    s = requests.session()
    s.mount('https://www.retailmenot.com/', CipherAdapter())
    ...
    print(s.get("https://www.retailmenot.com/sitemap/A"))


    This, together with the specific header settings, in my tests results in <Response [200]>, whereas with the same Python version but without the changed security level it results in <Response [403]>.
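
    For completeness, here is a self-contained sketch of the whole thing, combining the header settings from the question with the adapter above (header values and the sitemap URL are taken from the question; whether you actually get a 200 still depends on your OpenSSL build, as described):

    import requests
    from requests.adapters import HTTPAdapter
    from requests.packages.urllib3.util.ssl_ import create_urllib3_context

    CIPHERS = 'DEFAULT:@SECLEVEL=2'

    class CipherAdapter(HTTPAdapter):
        # Same adapter as above: builds the SSL context with a changed
        # security level, which indirectly changes the offered sigalgs.
        def init_poolmanager(self, *args, **kwargs):
            kwargs['ssl_context'] = create_urllib3_context(ciphers=CIPHERS)
            return super(CipherAdapter, self).init_poolmanager(*args, **kwargs)

        def proxy_manager_for(self, *args, **kwargs):
            kwargs['ssl_context'] = create_urllib3_context(ciphers=CIPHERS)
            return super(CipherAdapter, self).proxy_manager_for(*args, **kwargs)

    s = requests.session()
    s.mount('https://www.retailmenot.com/', CipherAdapter())
    # Browser-like headers, copied from the question.
    s.headers.update({
        "Connection": "close",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
        "Sec-Fetch-Site": "none",
        "Sec-Fetch-Mode": "navigate",
        "Sec-Fetch-Dest": "document",
        "Accept-Encoding": "gzip, deflate",
        "Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8",
    })

    r = s.get("https://www.retailmenot.com/sitemap/A")
    print(r)  # <Response [200]> in my tests with the changed security level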