Tags: r, web-scraping, lapply, rvest

R - Scrape multiple URLs and write each URL's data to a different Excel sheet


I'm trying to scrape several URLs and write the data to the same Excel file, with a separate sheet for each URL.

My code is this:

#install.packages("rvest")

library(XLConnect)
library(rvest)
{
 for(i in c("2086","2167","2204")) {
   url<-paste0("https://www.silversanz.com/producto/",i,)

}
 dades<-read_html(url)

 nom<-dades %>% html_nodes("h1.title") %>% html_text() %>% trimws()
 preu<-dades %>% html_nodes("p.price--current") %>% html_text() %>% trimws()

 info<-as.data.frame(cbind(nom,preu))

 writeWorksheetToFile(file="C:/xxx.xxx.xlsx",
                   data=info,
                   sheet= "test",
                   clearSheets=TRUE
 )
}

I have two problems: only the last URL actually gets scraped, and I don't know how to write each URL's data to its own sheet.

Thanks in advance :-)


Solution

  • You have used the braces in the wrong way. The for-loop as written only iterates over the IDs and saves the last URL in url; everything after the closing brace then runs just once. The loop body should instead contain all of your code:

    library(XLConnect)
    library(rvest)
    
    for(i in c("2086","2167","2204")) {
    
       # Build the product URL for this ID
       url <- paste0("https://www.silversanz.com/producto/", i)
    
       dades <- read_html(url)
    
       # Extract the product name and current price
       nom  <- dades %>% html_nodes("h1.title") %>% html_text() %>% trimws()
       preu <- dades %>% html_nodes("p.price--current") %>% html_text() %>% trimws()
    
       info <- data.frame(nom, preu)
    
       # One sheet per product, named after the ID
       writeWorksheetToFile(file="C:/xxx.xxx.xlsx",
                            data=info,
                            sheet=i,
                            clearSheets=TRUE)
    }
    
    

    As for the sheet: now that everything is inside the loop, just pass i as the sheet name, so you get one sheet per URL.
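
    As a side note (a sketch, not part of the fix itself): paste0() is vectorized, so you can also build all the URLs up front and then loop over the resulting vector. The IDs below are the ones from the question:

    ```r
    ids  <- c("2086", "2167", "2204")
    # paste0() recycles the prefix across the whole vector,
    # producing one URL per product ID in a single call
    urls <- paste0("https://www.silversanz.com/producto/", ids)
    print(urls)
    ```

    This keeps the URL construction separate from the scraping loop, which makes it easy to check the URLs before firing off any requests.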