Trouble writing Scrapy selector


Very new to Python. I am exploring the possibility of porting a long-developed project from another language, and a buddy swears that Python is my answer. I have the IDE up and running, and Scrapy is working properly, conveniently writing the 'name' and 'rank' listed on the website to a .csv.

The problem is that I have spent the last hour trying to figure out how to extract the 'team position' field on the website. It is a span, and it is the first instance I have encountered with Scrapy that has a space in the class name, which seems ill-advised.

Below is my code. Everything works fine aside from the last line, which is meant to pull the "team position" field. The code shown is just one of the many iterations I have tried. Any help would be greatly appreciated.

import scrapy


class CBS200Spider(scrapy.Spider):
    name = "expr"
    start_urls = [
        'https://www.cbssports.com/fantasy/football/rankings/ppr/top200/',
        #'https://www.cbssports.com/fantasy/football/rankings/standard/top200/',
    ]

    def parse(self, response):
        for plyr in response.css('div.player-row'):
            yield {
                'name': plyr.css('.player-name::text').get(),
                'rank': plyr.css('.rank::text').get(),
                'team': plyr.css('team position::text').get(),
            }

Solution

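  • In CSS, team and position are two separate classes on the same element, so each one needs its own dot, with no space between them. Written with a space and no dots, the selector looks for a <position> element nested inside a <team> element and finds nothing.

     '.team.position::text'

    BTW: XPath compares the class attribute as a single string, so there "team position" is treated as one value.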