FIBA used to have a page that was much more accessible to grab play-by-play and box score data from (example: scrape fiba stats box score).
They have a new game page that looks like this https://www.fiba.basketball/en/events/fiba-americup-2025-qualifiers/games/120186-DOM-MEX#shotChart
I used dev tools to examine the elements and the XHR requests to see if there was an obvious place the underlying data was being held, but I can't find one.
There is one area nested in a script node that seems to have what I would expect to be the underlying play-by-play data. xpath = /html/body/script[47]/text()
If I pull that specific xpath, I can't seem to parse what comes back because there are so many extra backslashes it seems to ruin the structure.
library(rvest)
page <- 'https://www.fiba.basketball/en/events/fiba-americup-2025-qualifiers/games/120186-DOM-MEX#shotChart'
my_session <- session(url = page)
my_session %>% html_nodes(xpath = '/html/body/script[47]/text()')
Hoping to get guidance on 1 of 2 things.
Yes, it looks like the information is stored in that location as a JavaScript string that holds JSON nested inside JSON. The extra backslashes are the escaping from that outer layer of encoding.
This is a matter of reading the string, removing the extra characters at the beginning and end, and converting from JSON. With how the data is structured, it took some trial and error and two decoding steps to get the desired information and store it in the "game" variable.
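To show why two decoding steps are needed, here is a minimal sketch on toy data that runs without touching the site. It assumes the script body has the shape `self.__next_f.push([1,"<id>:<json>"])`; that wrapper, the `1b:` prefix, and the toy fields are assumptions for illustration, not FIBA's actual schema. Instead of fixed character offsets, it finds the first `[` and strips everything up to the first colon.

```r
library(jsonlite)

# Toy inner payload standing in for the real game data (hypothetical fields).
inner <- '{"playByPlay":["tip-off","2pt made"],"status":"final"}'

# Rebuild the assumed wrapping: prefix an id, JSON-encode the result as a
# string (this is where the backslashes come from), and put it inside a
# JavaScript call.
encoded <- toJSON(paste0("1b:", inner), auto_unbox = TRUE)
script_text <- paste0("self.__next_f.push([1,", encoded, "])")

# Step 1: keep only the [...] argument of the call and parse it.
start <- regexpr("[", script_text, fixed = TRUE)
outer <- fromJSON(substr(script_text, start, nchar(script_text) - 1))

# Step 2: the second item is itself JSON behind an "<id>:" prefix.
payload <- sub("^[^:]*:", "", outer[[2]])
game <- fromJSON(payload)

game$playByPlay
```

The first `fromJSON()` call removes the backslash escaping, so the second call sees ordinary JSON.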
library(rvest)
page <- 'https://www.fiba.basketball/en/events/fiba-americup-2025-qualifiers/games/120186-DOM-MEX#shotChart'
my_session <- session(url = page)
text <- my_session %>% html_elements(xpath = '/html/body/script[47]/text()') %>% html_text()
# remove the extra characters at the start and end, then parse the outer JSON
temp <- substr(text, 20, nchar(text) - 1) %>% jsonlite::fromJSON()
# repeat on the second list item, which holds the inner JSON
data <- jsonlite::fromJSON(substr(temp[[2]], 4, nchar(temp[[2]])))
# the desired information is stored in the fourth list item
game <- data[[4]]
game$playersTeamA
game$playersTeamB
game$sidebar
game$minimal
game$gameData
game$status
game$teamColors
game$game
game$playByPlay
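The position `script[47]` may shift if the page changes. One possible alternative, sketched here on a toy document, is to pick the script node by a keyword it contains. `find_script()` is a hypothetical helper, and using "playByPlay" as the keyword is an assumption about what the real script text contains.

```r
library(rvest)

# Hypothetical helper: return the text of the first <script> node whose
# content contains `keyword` (matched literally, not as a regex).
find_script <- function(doc, keyword) {
  scripts <- html_text(html_elements(doc, "script"))
  hits <- scripts[grepl(keyword, scripts, fixed = TRUE)]
  if (length(hits) == 0) stop("no script node contains: ", keyword)
  hits[[1]]
}

# Toy document standing in for the real game page.
doc <- minimal_html('<script>var a = 1;</script><script>var pbp = {"playByPlay": []};</script>')
hit <- find_script(doc, "playByPlay")
hit
```

On the real page, `find_script(my_session, "playByPlay")` would take the place of the xpath lookup. If more than one script matches, the first hit may not be the right one, so check the result before parsing.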