Tags: python, html, web-scraping, post, python-requests

Imitate the POST request triggered by a button on a website with Python requests


I'm attempting to imitate the POST request triggered by the 'Download as CSV' button on the webpage https://www.bundesanzeiger.de/pub/en/nlp?2. However, I consistently get an error page back rather than the CSV file.

Inspecting the HTML for the button gives the following:

<form method="post" action="https://www.bundesanzeiger.de/pub/en/nlp?2-2.-top~csv~form~panel-form" id="idf2"><div id="idf2_hf_0" hidden="" class="hidden-fields"></div>

<a class="btn btn-green argus-A98" target="_blank" href="https://www.bundesanzeiger.de/pub/en/nlp?2--top~csv~form~panel-form-csv~resource~link" title="Download as CSV"><span>Download as CSV</span>&nbsp;&nbsp;<i class="fas fa-file-csv"></i></a>
</form>

I interpreted this as an HTTP request being sent to the 'action' URL with the payload id='idf2', and wrote the following snippet:

import requests

url = 'https://www.bundesanzeiger.de/pub/en/nlp?2--top~csv~form~panel-form-csv~resource~link'
payload = {'id': 'idf2'}

r = requests.post(url, json=payload)
r.content

However, this just returns a webpage saying that there has been an error. Is there something missing from my request? I also tried supplying a 'User-Agent' in the headers, but that didn't help either.


Solution

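  • The button is an <a> link, so clicking it sends a GET request and no POST payload is needed. Try opening the main page first within a session (so the session carries whatever cookies the site sets), then request the same URL with --top~csv~form~panel-form-csv~resource~link appended: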

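    import requests

    # URL of the main page with the CSV resource-link suffix appended
    url = "https://www.bundesanzeiger.de/pub/en/nlp?0--top~csv~form~panel-form-csv~resource~link"

    headers = {
        "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0",
        "Referer": "https://www.bundesanzeiger.de/",
    }

    with requests.Session() as s:
        s.headers.update(headers)
        # Open the main page first, in the same session as the download
        s.get("https://www.bundesanzeiger.de/pub/en/nlp?0")

        # Request the CSV with a plain GET and fail loudly on an HTTP error
        response = s.get(url)
        response.raise_for_status()

        with open("out.csv", "wb") as f_out:
            f_out.write(response.content)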
    

    Creates out.csv:

    "Positionsinhaber","Emittent","ISIN","Position","Datum"
    "Qube Research & Technologies Limited","Rheinmetall Aktiengesellschaft","DE0007030009","0,68","2023-10-19"
    "Caius Capital LLP","Deutsche Pfandbriefbank AG","DE0008019001","2,92","2023-10-19"
    "Millennium International Management LP","Deutsche Pfandbriefbank AG","DE0008019001","0,69","2023-10-19"
    "Marshall Wace LLP","TAG Immobilien AG","DE0008303504","0,50","2023-10-19"
    "SIH Partners, LLLP","Northern Data AG","DE000A0SMU87","1,06","2023-10-19"
    "Qube Research & Technologies Limited","VARTA AKTIENGESELLSCHAFT","DE000A0TGJ55","0,58","2023-10-19"
    "Marshall Wace LLP","adidas AG","DE000A1EWWW0","0,69","2023-10-19"
    "Marble Bar Asset Management LLP","SÜSS MicroTec SE","DE000A1K0235","1,17","2023-10-19"
    "Qube Research & Technologies Limited","flatexDEGIRO AG","DE000FTG1111","0,82","2023-10-19"
    
    ...
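
    Once the file is written, a quick sanity check and parse might look like the sketch below. It is not part of the original answer: the helper names parse_positions and looks_like_csv are made up for illustration, and it assumes the text is already decoded and that the Position column uses a decimal comma, as in the sample rows above.

```python
import csv
import io

# Sample rows copied from the out.csv output above; in practice, pass the
# decoded text of the downloaded file instead.
SAMPLE = '''"Positionsinhaber","Emittent","ISIN","Position","Datum"
"Qube Research & Technologies Limited","Rheinmetall Aktiengesellschaft","DE0007030009","0,68","2023-10-19"
"Caius Capital LLP","Deutsche Pfandbriefbank AG","DE0008019001","2,92","2023-10-19"
'''


def looks_like_csv(text):
    """Heuristic check (hypothetical helper): the expected header is present,
    so the text is probably not an HTML error page."""
    return text.lstrip().startswith('"Positionsinhaber"')


def parse_positions(text):
    """Parse the CSV text into dicts, converting Position to a float."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Assumption: Position uses a decimal comma, e.g. "0,68".
        row["Position"] = float(row["Position"].replace(",", "."))
        rows.append(row)
    return rows


if __name__ == "__main__":
    if looks_like_csv(SAMPLE):
        for row in parse_positions(SAMPLE):
            print(row["Emittent"], row["Position"])
```

    Checking for the header before parsing makes the failure from the question (an error page returned instead of the CSV) show up immediately rather than as a confusing parse error later.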