
HTML tag not showing up in soup.prettify?


I'm trying to scrape the table that contains the four factors data from the NBA website. However, after obtaining the data, the tag for the table is nowhere to be found (it is supposed to be a <table> element). This is what I am doing so far. What can I do differently?

import requests
from bs4 import BeautifulSoup 
import pandas as pd

url = "https://www.nba.com/stats/teams/four-factors"
data = requests.get(url).text
soup = BeautifulSoup(data, 'html.parser')

tables = soup.find_all('table')

table = soup.find('table', class_="Crom_table__p1iZz")
for t in tables:  # check which classes the tables on the page have
    print(t.get('class'))

Solution

  • The table is rendered in the browser with JavaScript, using data returned by an API call. The static HTML that requests pulls down doesn't contain the table tags you are looking for, so BeautifulSoup has nothing to find. Get the data straight from the API instead.

    import requests
    import pandas as pd
    
    url = "https://stats.nba.com/stats/leaguedashteamstats"
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
        'Referer': 'https://www.nba.com/',
    }
    
    payload = {
        'Conference': '',
        'DateFrom': '',
        'DateTo': '',
        'Division': '',
        'GameScope': '',
        'GameSegment': '',
        'Height': '',
        'ISTRound': '',
        'LastNGames': '0',
        'LeagueID': '00',
        'Location': '',
        'MeasureType': 'Four Factors',
        'Month': '0',
        'OpponentTeamID': '0',
        'Outcome': '',
        'PORound': '0',
        'PaceAdjust': 'N',
        'PerMode': 'PerGame',
        'Period': '0',
        'PlayerExperience': '',
        'PlayerPosition': '',
        'PlusMinus': 'N',
        'Rank': 'N',
        'Season': '2023-24',
        'SeasonSegment': '',
        'SeasonType': 'Regular Season',
        'ShotClockRange': '',
        'StarterBench': '',
        'TeamID': '0',
        'TwoWay': '0',
        'VsConference': '',
        'VsDivision': '',
    }
    
    
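    # Query the API directly; the response body is JSON, not HTML.
    data = requests.get(url, headers=headers, params=payload).json()
    
    # The first result set holds the rows ('rowSet') and their column names ('headers').
    table = pd.DataFrame(data['resultSets'][0]['rowSet'], columns=data['resultSets'][0]['headers'])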
    

    Output:

    print(table.head(5).to_string())
          TEAM_ID          TEAM_NAME  GP   W   L  W_PCT     MIN  EFG_PCT  FTA_RATE  TM_TOV_PCT  OREB_PCT  OPP_EFG_PCT  OPP_FTA_RATE  OPP_TOV_PCT  OPP_OREB_PCT  GP_RANK  W_RANK  L_RANK  W_PCT_RANK  MIN_RANK  EFG_PCT_RANK  FTA_RATE_RANK  TM_TOV_PCT_RANK  OREB_PCT_RANK  OPP_EFG_PCT_RANK  OPP_FTA_RATE_RANK  OPP_TOV_PCT_RANK  OPP_OREB_PCT_RANK
    0  1610612737      Atlanta Hawks  46  19  27  0.413  2223.0    0.536     0.270       0.134     0.317        0.578         0.252        0.142         0.297       19      21      22          22        20            21              9               16              4                29                 12                10                 24
    1  1610612738     Boston Celtics  47  36  11  0.766  2281.0    0.566     0.248       0.126     0.289        0.522         0.206        0.119         0.272        4       1       1           1         5             5             19                6             15                 2                  1                28                  4
    2  1610612751      Brooklyn Nets  46  19  27  0.413  2223.0    0.536     0.237       0.129     0.301        0.540         0.251        0.124         0.285       19      21      22          22        20            20             25                8              8                12                 11                27                 17
    3  1610612766  Charlotte Hornets  45  10  35  0.222  2170.0    0.525     0.216       0.140     0.274        0.581         0.264        0.137         0.292       25      27      27          27        26            28             30               20             23                30                 21                16                 19
    4  1610612741      Chicago Bulls  47  22  25  0.468  2286.0    0.529     0.232       0.125     0.289        0.551         0.257        0.146         0.284        4      19      21          20         4            27             28                4             14                18                 15                 6                 16
    ...
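
If you want to see how the `headers` and `rowSet` pieces fit together without hitting the network, here is a minimal sketch that pairs each row with its column names using only the standard library. The helper name `result_set_to_records` and the `sample` dict are hypothetical: the sample is hand-built to mimic the response shape used above (a `resultSets` list whose entries carry `headers` and `rowSet`), with three columns and two rows copied from the output shown. It is not a real API response.

```python
from typing import Any, Dict, List


def result_set_to_records(data: Dict[str, Any], index: int = 0) -> List[Dict[str, Any]]:
    """Pair every row of one result set with that result set's column headers."""
    result_set = data["resultSets"][index]
    headers = result_set["headers"]
    records = []
    for row in result_set["rowSet"]:
        # Guard against a row that does not line up with the header list.
        if len(row) != len(headers):
            raise ValueError(
                f"row has {len(row)} values but there are {len(headers)} headers"
            )
        records.append(dict(zip(headers, row)))
    return records


# Hand-built sample (an assumption) shaped like the JSON used in the answer.
sample = {
    "resultSets": [
        {
            "headers": ["TEAM_ID", "TEAM_NAME", "EFG_PCT"],
            "rowSet": [
                [1610612737, "Atlanta Hawks", 0.536],
                [1610612738, "Boston Celtics", 0.566],
            ],
        }
    ]
}

records = result_set_to_records(sample)
print(records[0]["TEAM_NAME"], records[0]["EFG_PCT"])
```

The resulting list of dicts can be passed straight to `pd.DataFrame(records)` if you still want a DataFrame, and the length check makes a mismatched response fail loudly instead of producing misaligned columns.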