javascript python web-scraping scrapy scrapy-splash

CSS selector returns blank list


Hi there, I am new to Scrapy and web scraping in general, and I am having a hard time scraping this website: https://www.webuycars.co.za/buy-a-car

My goal is to scrape car data such as the name and price from the page.

I started with

scrapy shell "https://www.webuycars.co.za/buy-a-car"

I then did

fetch("http://localhost:8050/render.html?url=https://www.webuycars.co.za/buy-a-car")

I am using Splash with Scrapy because I have concluded that the page is built with JavaScript. I then tried some selectors, but past a certain point in the page's HTML I start getting blanks (I assume this is the part generated by JavaScript). For example:

response.css("div.col-lg-3.col-md-4.col-sm-6.mt-3").getall()
[]
response.css("div.result-item-title").getall() 
[]
response.css("div.result-item-title").get()
response.css(".result-item-title").getall()
[]

Getting the title works, but nothing else I have tried does:

response.css("title::text").get()
'WeBuyCars | Sell Cars For Cash | Free Online Vehicle Valuations'

I have been running these queries in the shell to make sure I get results before I write the spider and integrate it into my program. I set my user agent in the settings file. I have looked through all the source files for a JSON file containing what I need, but there isn't one. I am not sure what else I can do. I have been stuck on this problem for quite a while and would appreciate any help.
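One thing that may be worth checking with the Splash approach: the `render.html` endpoint accepts a `wait` argument (in seconds) that delays the HTML snapshot so scripts have time to run, and the target URL should be percent-encoded when passed as a query parameter. The sketch below only builds such a URL; the helper name `splash_url` and the 2-second wait are assumptions for illustration, and a longer wait is not guaranteed to make the listings appear on this particular site.

```python
from urllib.parse import urlencode

# Local Splash instance, as used in the question.
SPLASH_ENDPOINT = "http://localhost:8050/render.html"


def splash_url(target: str, wait: float = 2.0) -> str:
    """Build a render.html URL with the target percent-encoded.

    `wait` is the number of seconds Splash pauses after page load before
    returning the HTML; 2.0 is an arbitrary starting value, not a tuned one.
    """
    query = urlencode({"url": target, "wait": wait})
    return f"{SPLASH_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(splash_url("https://www.webuycars.co.za/buy-a-car"))
```

The resulting string can be passed to `fetch(...)` in the Scrapy shell in place of the hand-written URL.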


Solution

  • You can get all of the data from the site's API response instead of parsing the rendered HTML:

    import json
    import scrapy
    
    class CarsSpider(scrapy.Spider):
    
        name = 'car'
        body = {"to":24,"size":24,"type":"All","filter_type":"all","subcategory":None,"q":"","Make":None,"Roadworthy":None,"Auctions":[],"Model":None,"Variant":None,"DealerKey":None,"FuelType":None,"BodyType":None,"Gearbox":None,"AxleConfiguration":None,"Colour":None,"FinanceGrade":None,"Priced_Amount_Gte":0,"Priced_Amount_Lte":0,"MonthlyInstallment_Amount_Gte":0,"MonthlyInstallment_Amount_Lte":0,"auctionDate":None,"auctionEndDate":None,"auctionDurationInSeconds":None,"Kilometers_Gte":0,"Kilometers_Lte":0,"Priced_Amount_Sort":"","Bid_Amount_Sort":"","Kilometers_Sort":"","Year_Sort":"","Auction_Date_Sort":"","Auction_Lot_Sort":"","Year":[],"Price_Update_Date_Sort":"","Online_Auction_Date_Sort":"","Online_Auction_In_Progress":""}
    
        def start_requests(self):
            yield scrapy.Request(
                url='https://website-elastic-api.webuycars.co.za/api/search',
                callback=self.parse,
                body=json.dumps(self.body),
                method="POST")
    
        def parse(self, response):
            # Decode the JSON body; the listings are under the 'data' key.
            payload = json.loads(response.body)

            for car in payload['data']:
                yield {
                    'Title': car['OnlineDescription']
                }
    

    Output:

    {'Title': '2022 Citroen C3 Aircross 1.2T Puretech Sine Auto'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2022 Toyota Hilux 2.4 Gd-6 RB Raider Pick Up Double Cab'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2020 Datsun GO 1.2 MID'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2013 Hyundai i10 1.25 Gls/fluid Auto'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2020 Suzuki S-Presso 1.0 GL+ AMT'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2019 SYM Symphony JET 14 200'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2019 Nissan Micra 1.2 Active Visia'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2021 Suzuki Super Carry 1.2i Pick Up Single Cab'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2022 Suzuki AN UB 125 (burgman)'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2022 Honda XRL XR 125l'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2022 Toyota Hilux 2.4 Gd-6 RB Raider Pick Up Double Cab'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2022 Land Rover Defender 110 D300 SE X-Dynamic (221 KW)'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2020 Suzuki S-Presso 1.0 GL'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2022 Big Boy TSR 250'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2022 Hyundai Atos/Atoz 1.1 Motion AMT'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2019 Fiat Panda 900t Lounge'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2017 Chevrolet Spark 1.2 Campus/curve 5-Door'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2020 Crosby Adventure Bike 400cc'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2022 Renault Kwid 1.0 Climber 5-Door'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2019 Suzuki Swift 1.2 GLX'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2022 Volkswagen Polo Classic GP 1.4 Comfortline'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2020 Renault Kwid 1.0 Climber 5-Door Auto'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2022 SYM Crox X-Pro 125'}
    2022-05-01 08:15:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://website-elastic-api.webuycars.co.za/api/search>
    {'Title': '2019 Yamaha YZ 450 FX'}
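
Since the goal was to collect several fields (name, price, and so on), the same loop can be generalised with a small mapping from output names to JSON keys. This is a minimal sketch, not the API's documented schema: only `OnlineDescription` is confirmed by the spider above, while `Priced_Amount` is a hypothetical key name that should be checked against a real response. Using `.get()` means a missing key yields `None` instead of raising `KeyError`.

```python
from typing import Any, Dict, Iterator, Mapping

# Output field name -> key in each API record.
# 'OnlineDescription' comes from the spider above; 'Priced_Amount' is a
# hypothetical key used for illustration only.
FIELD_MAP = {
    "Title": "OnlineDescription",
    "Price": "Priced_Amount",  # assumption: verify the actual key name
}


def extract_items(
    payload: Mapping[str, Any],
    field_map: Mapping[str, str] = FIELD_MAP,
) -> Iterator[Dict[str, Any]]:
    """Yield one dict per record in payload['data'], with None for missing keys."""
    for record in payload.get("data", []):
        yield {name: record.get(key) for name, key in field_map.items()}


if __name__ == "__main__":
    sample = {"data": [{"OnlineDescription": "2020 Datsun GO 1.2 MID"}]}
    for item in extract_items(sample):
        print(item)
```

Inside the spider, `parse` could then be reduced to `yield from extract_items(json.loads(response.body))`.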