python beautifulsoup

Scrape span tag


I'm having trouble scraping this HTML (originally posted as an image).

This is my code (originally posted as an image).


Solution

  • You don't need a regex selector; just select the nearest class and then use .next_sibling to reach the text that follows it.

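    import re

    # `soup` is the BeautifulSoup object you already built from the page
    ranks = soup.select('.item_color_title')
    for rank in ranks:
        # the text node right after the matched element
        rank = rank.next_sibling
        # strip non-word characters: "     520th (" ==> "520th"
        rank = re.sub(r'\W', '', rank)
        print(rank)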
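
If you want to try the approach on its own, here is a minimal sketch. The HTML in the question was posted as an image, so the markup below is hypothetical: only the class name item_color_title and the "     520th (" text come from the answer above, and the surrounding tags and the second entry are assumptions made up for illustration.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: the real page is unknown, so this structure is a guess.
html = """
<div>
  <span class="item_color_title">Rank:</span>     520th (
  <span class="other">of 1000</span>)
</div>
<div>
  <span class="item_color_title">Rank:</span>     13th (
  <span class="other">of 1000</span>)
</div>
"""

soup = BeautifulSoup(html, "html.parser")

ranks = []
for label in soup.select(".item_color_title"):
    sibling = label.next_sibling
    # next_sibling can be None or another tag, so only clean plain text nodes
    if isinstance(sibling, str):
        ranks.append(re.sub(r"\W", "", sibling))

print(ranks)
```

The isinstance check is there because .next_sibling returns whatever node comes next, which is not always text. If your real page has another tag directly after the label, you would likely need to adjust the selector or step through further siblings.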