python web-scraping password-protection

What Python tools can I use to write a scraper of a password-protected webpage?


Suppose there is a password-protected website that I want to access to scrape some info from it and put it into a spreadsheet. For example, it could be my personal credit card account page and I would be scraping info about the latest transactions.

A variation of this would be a site that lets me download the transaction info as a CSV file, in which case I would want to download that file.

If I want to write such a scraper in Python, what packages should I use for the task? Does it depend on how a specific website is implemented, i.e. might I need one tool to scrape one site and another tool to scrape another?

Thank you


Solution

  • I actually did something very similar to this, but in Node.js. Do you definitely want to do this in Python?

    If you want to stick to Python, take a look at these modules:

    BeautifulSoup

    requests

    Someone wrote a really awesome module combining the above two modules:

    RoboBrowser

    If you would like to go down the Node.js route, take a look at this:

    nightmarejs
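
To make the requests plus BeautifulSoup suggestion concrete, here is a minimal sketch of how the two could fit together for a simple form-based login. The URLs, the `username`/`password` field names, and the assumption that transactions appear in a plain HTML table are all hypothetical; a real site will likely differ, so inspect its login form first.

```python
import csv
import io

import requests
from bs4 import BeautifulSoup

# Hypothetical URLs, used only for illustration.
LOGIN_URL = "https://example.com/login"
TRANSACTIONS_URL = "https://example.com/transactions"


def hidden_fields(html):
    """Collect hidden <input> values (e.g. a CSRF token) from a login page."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        tag["name"]: tag.get("value", "")
        for tag in soup.find_all("input", attrs={"type": "hidden"})
        if tag.get("name")
    }


def table_rows(html):
    """Return every table row on the page as a list of cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def scrape(username, password):
    """Log in, fetch the transactions page, and return its table as CSV text."""
    # A Session keeps the login cookies across requests.
    with requests.Session() as session:
        login_page = session.get(LOGIN_URL, timeout=30)
        login_page.raise_for_status()

        # Resend any hidden fields along with the credentials.
        payload = hidden_fields(login_page.text)
        # Assumed field names; check the real form's <input name="..."> values.
        payload.update({"username": username, "password": password})
        session.post(LOGIN_URL, data=payload, timeout=30).raise_for_status()

        page = session.get(TRANSACTIONS_URL, timeout=30)
        page.raise_for_status()
        return rows_to_csv(table_rows(page.text))
```

If the site offers a CSV download instead, the same logged-in session can fetch that URL directly and write `response.content` to a file, with no HTML parsing needed. Note that this approach only works when the login is an ordinary HTML form post; pages that build the login or the data with JavaScript will not expose them to requests alone.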