pythonweb-scrapingurl

How to click the 'download as pdf' button on a website with python


Looking to click the download as pdf button on this site: https://www.goffs.com/sales-results/sales/december-nh-sale-2021/1

The reason I can't just scrape the download link or just manually download it is that there are multiple of these sites like:

https://www.goffs.com/sales-results/sales/december-nh-sale-2021/2

https://www.goffs.com/sales-results/sales/december-nh-sale-2021/3

And I want to loop through all of them and download each as a pdf.

Current code: import urllib.request from requests import get from bs4 import BeautifulSoup

url = "https://www.goffs.com/sales-results/sales/december-nh-sale-2021/1"

request = urllib.request.Request(url)
response = urllib.request.urlopen(request)

Solution

  • This code should get the link to the pdf:

    from urllib.request import *
    url = "https://www.goffs.com/sales-results/sales/december-nh-sale-2021/{}".format("1")
    
    request = Request(url)
    response = urlopen(request)
    content = response.read().decode().split('<a href="https://www.goffs.com/GoffsCMS/_Sales/')
    content = content[1].split('"')
    content = content[0]
    output = 'https://www.goffs.com/GoffsCMS/_Sales/'+content
    print(output)