I've been reading through a couple of the other useful guides on pulling player and match data from ESPN using R, but I have run into a problem with tabbed tables. As shown here on the player stats for a recent rugby game, the player statistics table is split into four tabs: 'Scoring', 'Attacking', 'Defending' and 'Discipline'.
Using the following code (with the help of two lovely packages, RCurl and htmltab), I can pull out the first tab ('Scoring') from that page ...
# install & attach RCurl
if (!base::require(package="RCurl")) utils::install.packages("RCurl")
library(RCurl)
# install & attach htmltab
if (!base::require(package="htmltab")) utils::install.packages("htmltab")
library(htmltab)
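# download the page (theurl holds the HTML text, not the URL itself)
# note: ssl.verifypeer = FALSE turns off SSL certificate verification
theurl <- RCurl::getURL("https://www.espn.co.uk/rugby/playerstats?gameId=294854&league=270557", .opts = list(ssl.verifypeer = FALSE))
# pull the first three tables from the downloaded HTML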
team1 <- htmltab::htmltab(theurl,which=1)
team2 <- htmltab::htmltab(theurl,which=2)
league <- htmltab::htmltab(theurl,which=3)
... in the following format, which is exactly what I wanted ...
team1
rowID LEINS Tx TA CG PG PTS
2 J LarmourFB 0 0 0 0 0 0
3 H KeenanW 0 0 0 0 0 0
4 G RingroseC 0 0 0 0 0 0
5 R HenshawC 1 0 0 0 0 5
6 J LoweW 1 0 0 0 0 5
7 R ByrneFH 0 0 2 2 0 10
8 J Gibson-ParkSH 0 1 0 0 0 0
9 C HealyP 0 0 0 0 0 0
10 R KelleherH 0 0 0 0 0 0
11 A PorterP 0 0 0 0 0 0
... however I seem unable to pull out any tab other than 'Scoring'. I'm sure I'm missing something really obvious, so would appreciate someone pointing out where I'm going wrong!
Thanks in advance!
If you check the page's HTML source, you will see that the data for the other tabs is not there to begin with. The markup carries data-reactid attributes, which indicate that the page is rendered with React and that the data is only loaded once you click on a new tab. So you will need to find a way to perform that click on the second tab.
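One rough way to confirm this from R is to count the table elements in the HTML you already downloaded. The helper below is only a sketch: count_tables is a made-up name, and it uses a plain regular expression rather than a real HTML parser, so treat the result as a sanity check.

```r
# Hypothetical helper: count opening <table> tags in a string of HTML.
# A regex is a crude stand-in for a parser, but it is enough for a sanity check.
count_tables <- function(html) {
  hits <- gregexpr("<table\\b", html, ignore.case = TRUE, perl = TRUE)[[1]]
  # gregexpr returns -1 when nothing matches, so count only positive positions
  sum(hits > 0)
}

# Usage with the object from the question (theurl holds the page HTML):
# count_tables(theurl)
```

If the count only covers the tables you can already extract, the other tabs are not in the static HTML and no choice of which= will reach them.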
One option might be to use Selenium through the RSelenium package: https://www.rdocumentation.org/packages/RSelenium/versions/1.7.7. It drives a real browser, which would let you make the necessary tab click before reading the table.
A sample can be found here: https://www.r-bloggers.com/2014/12/scraping-with-selenium/
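To make the idea concrete, here is a minimal, untested sketch of the click-then-scrape approach with RSelenium. The function name scrape_tab, the XPath used to find the tab, the port and the two-second wait are all assumptions of mine. I have not checked them against the ESPN page, so inspect the tab element in your browser and adjust the selector.

```r
# Sketch only: scrape_tab is a hypothetical helper, not part of any package.
scrape_tab <- function(url, tab_label, which = 1, port = 4545L) {
  # start a Selenium server and a Firefox session
  rD <- RSelenium::rsDriver(browser = "firefox", port = port, verbose = FALSE)
  remDr <- rD$client
  # shut the browser and server down even if a later step fails
  on.exit({
    remDr$close()
    rD$server$stop()
  }, add = TRUE)

  remDr$navigate(url)

  # ASSUMPTION: the tab is an element whose visible text equals tab_label
  xpath <- sprintf("//*[normalize-space(text())='%s']", tab_label)
  tab <- remDr$findElement(using = "xpath", value = xpath)
  tab$clickElement()
  Sys.sleep(2)  # crude fixed wait for the table to re-render

  # hand the rendered HTML to htmltab as a parsed document
  html <- remDr$getPageSource()[[1]]
  doc <- XML::htmlParse(html, asText = TRUE)
  htmltab::htmltab(doc = doc, which = which)
}

# Example call (needs Firefox and a working Selenium setup):
# attacking <- scrape_tab(
#   "https://www.espn.co.uk/rugby/playerstats?gameId=294854&league=270557",
#   tab_label = "Attacking", which = 1)
```

The fixed Sys.sleep is the simplest thing that could work; polling until the table header changes would be more robust. The table positions after the click may also differ from the ones in the question, so check which= again.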