python, parsing, extract

Extracting data from an HTML response in Python


In response to a request, I'm getting a full 1600-line HTML document.

What I'm trying to do is find a way to extract a value from one specific line:

    <input type="hidden" id="form__token" name="form[_token]" data-parsley-errors-container="#form__token_error" value="tHV9QvBk9HEvZSP8S8bCkpC1vsSE4B4HthgXgk4V7FM" /></form>

This is line 1594 of my document, and I'm trying to get the contents of its value attribute. My first idea was to extract the value attribute together with its contents and then delete everything else, but value attributes also appear elsewhere in my HTML file, so that approach does not work.


Solution

  • You will need 'requests' and 'BeautifulSoup' (the bs4 package) to fetch the page and pull the value out of it.

    Try:

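    from bs4 import BeautifulSoup
    import requests

    url = 'link to url'  # placeholder: replace with the real URL

    page = requests.get(url, timeout=5)  # timeout is optional
    soup = BeautifulSoup(page.text, 'html.parser')

    # find() returns the whole <input> tag (or None if no tag has this id)
    token_input = soup.find(id='form__token')

    # the token itself is stored in the tag's "value" attribute
    value = token_input['value']

    print(value)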
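
If installing BeautifulSoup is not an option, the same lookup by id can probably be done with the standard library alone. The following is only a rough sketch using html.parser instead of BeautifulSoup, not the approach from the answer above. TokenFinder and extract_token are hypothetical names made up for this example, and SAMPLE_HTML is just the line quoted in the question (with an opening form tag added), standing in for the real response body.

```python
from html.parser import HTMLParser

# The <input> line quoted in the question, used here as offline sample data.
SAMPLE_HTML = (
    '<form><input type="hidden" id="form__token" name="form[_token]" '
    'data-parsley-errors-container="#form__token_error" '
    'value="tHV9QvBk9HEvZSP8S8bCkpC1vsSE4B4HthgXgk4V7FM" /></form>'
)


class TokenFinder(HTMLParser):
    """Hypothetical helper: remembers the value of the first tag with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.value = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        attributes = dict(attrs)
        if self.value is None and attributes.get("id") == self.target_id:
            self.value = attributes.get("value")


def extract_token(html, target_id="form__token"):
    """Return the value attribute of the tag with target_id, or None if absent."""
    finder = TokenFinder(target_id)
    finder.feed(html)
    return finder.value


token = extract_token(SAMPLE_HTML)
print(token)
```

With a real response you would pass page.text to extract_token instead of SAMPLE_HTML. Matching on the id rather than on the line number or on the value attribute avoids the problem described in the question, since the id should be unique in the document even though value attributes appear in many places.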