I'm very new to Python and am trying to scrape today's soccer matches from the Fox Sports website: https://www.foxsports.com/scores/soccer. Unfortunately, I keep running into
'AttributeError: 'NoneType' object has no attribute 'find_all''
and can't seem to get the teams for that day. This is what I have so far:
import bs4
import requests
res = requests.get('https://www.foxsports.com/scores/soccer')
soup = bs4.BeautifulSoup(res.text, 'html.parser')
results = soup.find("div", class_="scores-date")
games = results.find("div", class_="scores")
print(games)
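The content is not static. The website serves it dynamically, so requests does not receive the information you can see in your browser's dev tools. That is why soup.find() returns None and the next call on it raises the AttributeError.
Use an API, if the site provides one, or Selenium,
which handles content like a browser and can provide the page_source
you are looking for.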
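To make the failure easier to diagnose, the lookup can be guarded against a missing element. This is only a minimal sketch: `find_games` and `static_html` are hypothetical names, and the sample markup is an assumption standing in for a response that lacks the dynamically rendered content.

```python
from bs4 import BeautifulSoup


def find_games(html):
    """Return the 'scores' div inside 'scores-date', or None if either is missing."""
    soup = BeautifulSoup(html, "html.parser")
    results = soup.find("div", class_="scores-date")
    if results is None:
        # find() returns None when nothing matches; calling a method on it
        # would raise the AttributeError from the question.
        return None
    return results.find("div", class_="scores")


# Assumed stand-in for a page whose scores are filled in later by JavaScript.
static_html = "<html><body><div id='app'></div></body></html>"
print(find_games(static_html))  # prints None
```

If this returns None for res.text from the question, the element is most likely not in the raw HTML at all, which points to dynamic rendering rather than a wrong selector.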
Because not all of the content is available immediately, you have to use a Selenium wait that checks for the presence of the <span>
with class "title-text".
Note: the example uses Selenium 4, so check your version and either update or adapt the required dependencies to a lower version yourself.
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
service = ChromeService(executable_path='ENTER YOUR PATH TO CHROMEDRIVER')
driver = webdriver.Chrome(service=service)
driver.get('https://www.foxsports.com/scores/soccer')
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, '//span[contains(@class, "title-text") and text() = "Today"]')))
soup = BeautifulSoup(driver.page_source, 'lxml')
for g in soup.select('.scores-date:not(:has(div)) + div .score-chip-content'):
    print(list(g.stripped_strings))
['SERIE A', 'JUVENTUS', '9-4-5', 'JUV', '9-4-5', 'CAGLIARI', '1-7-10', 'CAG', '1-7-10', '8:45PM', 'Paramount+', 'JUV -455', 'CAG +1100']
['LG CUP', 'ARSENAL', '0-0-0', 'ARS', '0-0-0', 'SUNDERLAND', '0-0-0', 'SUN', '0-0-0', '8:45PM', 'ARS -454', 'SUN +1243']
['LA LIGA', 'SEVILLA', '11-4-2', 'SEV', '11-4-2', 'BARCELONA', '7-6-4', 'BAR', '7-6-4', '9:30PM', 'ESPN+', 'SEV +155', 'BAR +180']
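If you want named fields instead of raw lists, the strings can be mapped by position. This is a rough sketch that assumes the layout seen in the output above (league first, team names at positions 1 and 5, kickoff time at position 9); `parse_chip` is a hypothetical helper, and the layout may change whenever the site changes its markup.

```python
def parse_chip(strings):
    """Map one list of stripped strings to a dict (assumed positional layout)."""
    if len(strings) < 10:
        # Not enough fields to match the assumed layout.
        return None
    return {
        "league": strings[0],
        "first_team": strings[1],
        "first_record": strings[2],
        "second_team": strings[5],
        "second_record": strings[6],
        "kickoff": strings[9],
        # Broadcaster and odds are optional, so keep whatever is left over.
        "extras": strings[10:],
    }


sample = ['LG CUP', 'ARSENAL', '0-0-0', 'ARS', '0-0-0', 'SUNDERLAND',
          '0-0-0', 'SUN', '0-0-0', '8:45PM', 'ARS -454', 'SUN +1243']
match = parse_chip(sample)
print(match["first_team"], "vs", match["second_team"], "at", match["kickoff"])
```

Inside the loop you would call parse_chip(list(g.stripped_strings)) and skip any result that is None.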