python, scrapy, scraper

Python: failing to log in to a website using Scrapy


I am trying to log in to Indeed using Scrapy.

I wrote the code below to attempt the login, following examples from the Scrapy documentation and a Code Review topic.

import scrapy


class IndeedSpider(scrapy.Spider):
    name = 'indeed'
    allowed_domains = ['indeed.com']
    start_urls = ['https://secure.indeed.com/account/login']

    def parse(self, response):
        # Fill in and submit the login form found in the response
        return scrapy.FormRequest.from_response(
            response,
            formxpath='//form[@id="signin_email"]',  # XPath to the login form
            formdata={
                'password': 'mypassword',  # password input field
                'email': 'mymail',  # email input field
                'Action': '/account/login',  # form action field
            },
            callback=self.after_login,  # do something after logging in
        )

The program exits with the following error:

File "/usr/local/lib/python2.7/dist-packages/scrapy/http/request/form.py", line 77, in _get_form
    raise ValueError("No element found in %s" % response)
ValueError: No element found in <200 https://secure.indeed.com/account/login>

It seems that Scrapy can't find the form. I tried changing the FormRequest parameters to the value fields, but I still can't get it to log in.


Solution

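  • As @eLRuLL said, this happens because the form depends on JavaScript. Scrapy does not execute JavaScript, so the form is not present in the HTML it downloads, and FormRequest.from_response has nothing to match.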
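
One way to confirm this kind of problem before calling FormRequest.from_response is to check whether the form exists in the raw HTML at all. The following is only a minimal sketch using the Python 3 standard library (the question itself ran on Python 2.7). The names FormFinder and has_form and both sample page bodies are made up for illustration; they are not taken from Indeed's real markup.

```python
from html.parser import HTMLParser


class FormFinder(HTMLParser):
    """Collects the id attribute of every <form> tag in static HTML."""

    def __init__(self):
        super().__init__()
        self.form_ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased
        if tag == "form":
            self.form_ids.append(dict(attrs).get("id"))


def has_form(html, form_id):
    """Return True if a <form> with the given id exists in the raw markup."""
    finder = FormFinder()
    finder.feed(html)
    return form_id in finder.form_ids


# Hypothetical page bodies, invented for this example.
# A server-rendered page ships the form in its HTML:
STATIC_PAGE = (
    '<html><body><form id="signin_email">'
    '<input name="email"></form></body></html>'
)
# A JavaScript-rendered page ships only a placeholder and a script:
JS_PAGE = (
    '<html><body><div id="root"></div>'
    '<script src="app.js"></script></body></html>'
)

print(has_form(STATIC_PAGE, "signin_email"))  # True
print(has_form(JS_PAGE, "signin_email"))      # False
```

Inside a spider, the equivalent check would be to test whether response.xpath('//form[@id="signin_email"]') returns any match. If it does not, the form is probably built in the browser, and the page would need to be rendered by something that runs JavaScript, or the login request the page sends would need to be reproduced directly.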