I keep running into an issue when I scrape data with lxml using XPath. I want to scrape the Dow price, but when I print the result in Python it shows Element span at 0x448d6c0. I assume that is a memory address, but I just want the price. How can I print the price instead of the object's location in memory?
from lxml import html
import requests
page = requests.get('https://markets.businessinsider.com/index/realtime-chart/dow_jones')
content = html.fromstring(page.content)
#This will create a list of prices:
prices = content.xpath('//*[@id="site"]/div/div[3]/div/div[3]/div[2]/div/table/tbody/tr[1]/th[1]/div/div/div/span')
#This will print the list:
print(prices)
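xpath() returns a list of lxml Element objects, not the text inside them. Printing the list shows each element's default representation, which is the tag name plus its memory address. To get the price, read the element's .text attribute (it is an attribute, not a method, so no parentheses).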
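As a minimal offline sketch of the difference (the HTML snippet below is made up for illustration and is not the real page's markup):

```python
from lxml import html

# Hypothetical markup standing in for the real page
snippet = '<div id="site"><div class="price"><span class="push-data ">25,389.06</span></div></div>'
doc = html.fromstring(snippet)

spans = doc.xpath('//span')
print(spans)          # a list of Element objects, shown by tag and memory address
print(spans[0].text)  # the text inside the first span

# Alternatively, end the XPath with text() so it returns strings directly
texts = doc.xpath('//span/text()')
print(texts)
```

Either approach works; .text is convenient when you also need the element's attributes, while text() keeps everything in the XPath expression.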
Additionally, I would highly recommend changing your XPath. It is an absolute, position-based path, so it will break as soon as the page layout shifts. Selecting by id and class is more robust:
prices = content.xpath("//div[@id='site']//div[@class='price']//span[@class='push-data ']")
prices_holder = [i.text for i in prices]
prices_holder
['25,389.06',
'25,374.60',
'7,251.60',
'2,813.60',
'22,674.50',
'12,738.80',
'3,500.58',
'1.1669',
'111.7250',
'1.3119',
'1,219.58',
'15.43',
'6,162.55',
'67.55']
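If you need numbers rather than strings, something along these lines may help. This is only a sketch: the markup is hypothetical, and matching the class with contains() is my own suggestion so that the trailing space in class="push-data " does not have to be reproduced exactly.

```python
from lxml import html

# Hypothetical markup; the real page's structure may differ
snippet = (
    '<div id="site">'
    '<div class="price"><span class="push-data ">25,389.06</span></div>'
    '<div class="price"><span class="push-data ">1.1669</span></div>'
    '</div>'
)
doc = html.fromstring(snippet)

# Match on the class token instead of the exact attribute string,
# so stray whitespace inside the class attribute does not matter
query = (
    "//div[@id='site']//div[@class='price']"
    "//span[contains(concat(' ', normalize-space(@class), ' '), ' push-data ')]"
)
raw = [span.text for span in doc.xpath(query)]

# Strip the thousands separators, then convert to float
values = [float(text.replace(',', '')) for text in raw]
print(values)
```

Note that this assumes a comma is always a thousands separator, which holds for the values shown above but not for every locale.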
Also note that you will only get the values that were in the page when it was fetched. If you want the prices as they update, you'd likely need a tool that runs the page's JavaScript, such as Selenium.