python html xml parsing

Parse elements from HTML


I want to create a CSV file of animal hospitals by state. I think my HTML selection is incorrect. I want to iterate over the elements, selecting the right tags to parse each hospital's state, name, address, and phone number.

    from lxml import html
    import requests

    link = "https://vcahospitals.com/find-a-hospital/location-directory"
    # Get the page from the server without following redirects
    response = requests.get(link, allow_redirects=False)
    # response.content is the raw page source as bytes (response.text is the decoded string)
    sourceCode = response.content
    # Parse the source into an lxml HTML element tree
    htmlElem = html.document_fromstring(sourceCode)

    print(sourceCode)

Example page HTML: I've tried selecting all the div elements by class.

I would expect this to grab the hospitals for every state, but it prints only one state's worth.


Solution

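  • The print statement in your code is indented incorrectly. It sits outside the for loop, so it runs only once, after the loop has finished, and prints only the last element's text. Indent it so that it is part of the loop body.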

    for el in state_hospitals:
        text = el.text_content()

        # print is indented inside the for block, so it runs once per element
        print(text)
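
To see the effect of the indentation, and to get from the loop to the CSV file the question asks for, here is a minimal sketch. It runs on a small inline snippet rather than the live page. The markup, the `state` class name, and the `h3`/`p` tags are hypothetical stand-ins, not the real structure of the VCA directory page, so the XPath will need adjusting to the actual HTML.

```python
import csv
import io

from lxml import html

# Hypothetical markup standing in for the real directory page.
SAMPLE = """
<div class="state"><h3>Alabama</h3><p>Hospital A</p></div>
<div class="state"><h3>Alaska</h3><p>Hospital B</p></div>
<div class="state"><h3>Arizona</h3><p>Hospital C</p></div>
"""

doc = html.document_fromstring(SAMPLE)
# Assumed selector: one div per state, marked with class="state".
state_hospitals = doc.xpath('//div[@class="state"]')

rows = []
for el in state_hospitals:
    state = el.findtext("h3")
    name = el.findtext("p")
    # Inside the loop body: this runs once per matched element.
    rows.append([state, name])

# Outside the loop: `el` still refers to the last element only,
# which is why a dedented print shows a single state's worth.
last_only = el.findtext("h3")

# Write the collected rows as CSV (to an in-memory buffer here;
# pass an open file object instead to write to disk).
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["state", "name"])
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text)
```

Collecting rows inside the loop and writing them afterwards keeps the parsing and the output separate, which makes it easier to add further columns such as address and phone number once the right tags are identified.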