python web-scraping beautifulsoup mechanicalsoup

BeautifulSoup and MechanicalSoup won't read website


I am working with BeautifulSoup and have also tried MechanicalSoup. Both load other websites fine, but when I request this website, the request takes a long time and never returns the page. Any ideas would be super helpful.

Here is the BeautifulSoup code that I am writing:

import urllib3
from bs4 import BeautifulSoup as soup

url = 'https://www.apartments.com/apartments/saratoga-springs-ut/1-bedrooms/?bb=hy89sjv-mN24znkgE'

http = urllib3.PoolManager()

r = http.request('GET', url)

Here is the MechanicalSoup code:

import mechanicalsoup

browser = mechanicalsoup.Browser()

url = 'https://www.apartments.com/apartments/saratoga-springs-ut/1-bedrooms/'
page = browser.get(url)
page

What I am trying to do is gather data on apartments in different cities. The URL will change to 2-bedrooms, then 3-bedrooms, and then move on to a different city and repeat the same steps there, so I really need this part to work.

Any help would be appreciated.


Solution

  • import requests
    from bs4 import BeautifulSoup as soup
    
    headers = requests.utils.default_headers()
    headers.update({
        'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36'
    })
    
    url = 'https://www.apartments.com/apartments/saratoga-springs-ut/1-bedrooms/'
    
    r = requests.get(url, headers=headers)
    
    rContent = soup(r.content, 'lxml')
    
    rContent
    

    Just as Tim said, I needed to add headers to my request, specifically a browser-like User-Agent, so that the site would not treat it as coming from a bot.
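
Since the goal is to repeat the request for several bedroom counts and cities, the URLs could be generated up front. This is only a minimal sketch: it assumes every search page follows the same `/<city>/<n>-bedrooms/` pattern as the URL in the question. `build_search_urls` and `BASE_URL` are names made up for this example, and `lehi-ut` is a placeholder city slug that has not been checked against the site.

```python
from itertools import product

# Assumed common prefix, taken from the URL in the question.
BASE_URL = "https://www.apartments.com/apartments"


def build_search_urls(city_slugs, bedroom_counts=(1, 2, 3)):
    """Return one search URL per (city, bedroom count) pair.

    Assumes the site uses the pattern /<city-slug>/<n>-bedrooms/.
    """
    return [
        f"{BASE_URL}/{city}/{bedrooms}-bedrooms/"
        for city, bedrooms in product(city_slugs, bedroom_counts)
    ]


# "saratoga-springs-ut" comes from the question; "lehi-ut" is a hypothetical slug.
urls = build_search_urls(["saratoga-springs-ut", "lehi-ut"])
for url in urls:
    print(url)
```

Each URL can then be passed to `requests.get` together with the `User-Agent` header. Passing a `timeout` as well (for example `timeout=10`) makes a blocked request raise an exception instead of hanging, and pausing between requests with `time.sleep` keeps the load on the site low.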