I am using lxml to try to get the top 10 hits currently on Spotify (https://spotifycharts.com/regional). When I run the program, it prints an empty list [] instead of something like ['song 1', 'song 2', ...].
import requests
import lxml.html
html = requests.get("https://spotifycharts.com/regional")
doc = lxml.html.fromstring(html.content)
songs = doc.xpath('//div[@id="content"]')[0]
titles = songs.xpath('.//div[@class="chart-table-track"]/text()')
print(titles)
I'm not sure whether this is an XPath problem. When I looked for another id on the page, I couldn't find one, and the element with id "content" is the one that contains the text I need. The same goes for the class "chart-table-track". I'm not sure whether my syntax is wrong, but any help would be appreciated.
Thanks,
You can try the following to get the first ten hits (rank and name) from that webpage. I used BeautifulSoup (with lxml as its parser) instead of using the lxml library directly to parse the content.
import requests
from bs4 import BeautifulSoup

html = requests.get("https://spotifycharts.com/regional")
doc = BeautifulSoup(html.content, "lxml")
# Skip the header row, then take the next ten rows of the chart table.
for items in doc.select('table.chart-table tr')[1:11]:
    rank = items.select_one("td.chart-table-position").get_text(strip=True)
    name = items.select_one("td.chart-table-track > strong").get_text(strip=True)
    print(rank, name)
Output:
1 Blinding Lights
2 The Box
3 Dance Monkey
4 Don't Start Now
5 Roses - Imanbek Remix
6 In Your Eyes
7 death bed (coffee for your head) (feat. beabadoobee)
8 Say So
9 Intentions (feat. Quavo)
10 Falling
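If you would rather stay with lxml, here is a rough sketch of the same idea in XPath. Judging by the selectors used above, "chart-table-track" is a class on a td (not a div) and the title sits inside a strong child, which would explain why the original div[@class="chart-table-track"]/text() expression matched nothing. The SAMPLE markup and the top_tracks helper are my own illustrations, not the real page; I am assuming the live page has the structure those selectors imply.

```python
import lxml.html

# Hypothetical sample that mirrors the markup implied by the CSS selectors
# above (td.chart-table-position, td.chart-table-track > strong).
# The <span> with the artist is an assumption made for illustration.
SAMPLE = """
<div id="content">
  <table class="chart-table">
    <thead><tr><th>Position</th><th>Track</th></tr></thead>
    <tbody>
      <tr>
        <td class="chart-table-position">1</td>
        <td class="chart-table-track"><strong>Blinding Lights</strong> <span>by The Weeknd</span></td>
      </tr>
      <tr>
        <td class="chart-table-position">2</td>
        <td class="chart-table-track"><strong>The Box</strong> <span>by Roddy Ricch</span></td>
      </tr>
    </tbody>
  </table>
</div>
"""


def top_tracks(html_text, limit=10):
    """Return up to `limit` (rank, name) pairs from chart-table markup."""
    doc = lxml.html.fromstring(html_text)
    # tr[td] keeps only rows that have <td> cells, which skips the header row.
    rows = doc.xpath('//table[contains(@class, "chart-table")]//tr[td]')
    results = []
    for row in rows[:limit]:
        # string(...) returns the text content of the first matching node.
        rank = row.xpath('string(./td[contains(@class, "chart-table-position")])').strip()
        name = row.xpath('string(./td[contains(@class, "chart-table-track")]/strong)').strip()
        results.append((rank, name))
    return results


print(top_tracks(SAMPLE))
# [('1', 'Blinding Lights'), ('2', 'The Box')]
```

To try it on the live page, you could pass requests.get("https://spotifycharts.com/regional").content to top_tracks instead of SAMPLE. Whether that returns anything depends on the page still serving this markup.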