Hello, I'm new to the fascinating world of R. I have not been able to skip URLs that do not exist. How can I handle them so that they are not raised as an error? Thanks for your help.
knitr::opts_chunk$set(echo = TRUE)
library(xml2)
library(rvest)
library(tidyverse)
library(lubridate)

zora_core <- read_html("https://zora.medium.com/the-zora-music-canon-5a29296c6112")

Los_100 <- data.frame(
  album = html_nodes(zora_core, "h1:not(#96c9)") %>%
    html_text() %>%
    str_trim(side = "both"),
  interprete = html_nodes(zora_core, "strong em , p#73e0 strong") %>%
    html_text() %>%
    str_remove_all("^by") %>%
    str_extract("[a-zA-Z].+(?=[(])") %>%
    str_trim(side = "both"),
  año = html_nodes(zora_core, "strong em , p#73e0 strong") %>%
    html_text() %>%
    str_extract("([[:digit:]]){4}"),
  liga = paste0(
    "https://en.wikipedia.org/wiki/",
    html_nodes(zora_core, "strong em , p#73e0 strong") %>%
      html_text() %>%
      str_remove_all("^by") %>%
      str_extract("[a-zA-Z].+(?=[(])") %>%
      str_trim(side = "both") %>%
      str_replace_all(" ", "_")
  )
)
carga <- function(url) {
  perfil_raw <- read_html(url)
  data.frame(
    interprete = html_node(perfil_raw, "h1#firstHeading") %>%
      html_text() %>%
      str_trim(side = "both")
  )
}
lista <- Los_100$liga[1:16]
# The URL at position 16 does not exist. How can I avoid the error?
datos_personales <- map_df(lista, carga)
It's useful to learn about error handling in R, but when working with HTTP requests it becomes essential.
In your case, it is best to wrap carga in a tryCatch. This evaluates the expression that you pass as the first argument, and if an error is thrown, it is caught and passed to the handler function that you supply as the error argument of tryCatch. Whatever the handler returns becomes the result of the tryCatch call.
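As a minimal sketch of how this works, independent of any web request: the helper risky_sqrt below is a hypothetical function that is not part of the question's code, and it simply stands in for a call that sometimes fails. The handler receives the error condition and its return value replaces the failed result.

```r
# Hypothetical helper (an assumption for illustration): fails for some
# inputs, standing in for a request to a page that may not exist.
risky_sqrt <- function(x) {
  if (x < 0) stop("negative input: ", x)
  sqrt(x)
}

# tryCatch() evaluates its first argument. If that expression signals an
# error, the function given as `error` is called with the condition
# object, and its return value becomes the value of tryCatch().
safe_sqrt <- function(x) {
  tryCatch(
    risky_sqrt(x),
    error = function(e) {
      message("Caught: ", conditionMessage(e))
      NA_real_
    }
  )
}

# The failing input no longer stops the loop; it yields NA instead.
results <- vapply(c(4, -1, 9), safe_sqrt, numeric(1))
```

Here the second element of results is NA while the other two are computed normally, which is the same behaviour we want for the URL that does not exist.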
If an error is thrown, we need to return a data frame with a single column called interprete so that map_df can bind it together with the other results:
carga_catch <- function(x) {
  tryCatch(
    carga(x),
    error = function(e) data.frame(interprete = "**inexistente**")
  )
}
map_df(lista, carga_catch)
#> interprete
#> 1 Ella Fitzgerald
#> 2 Sarah Vaughan
#> 3 Billie Holiday
#> 4 Sister Rosetta Tharpe
#> 5 Lena Horne
#> 6 Mahalia Jackson
#> 7 Abbey Lincoln
#> 8 Etta James
#> 9 Leontyne Price
#> 10 Marian Anderson
#> 11 Dinah Washington
#> 12 Odetta
#> 13 Dionne Warwick
#> 14 The Supremes
#> 15 Nina Simone
#> 16 **inexistente**
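If you also want to know which URLs failed, one possible variation is to carry the URL and a success flag in every row. The sketch below is only an illustration under stated assumptions: carga_falsa is a hypothetical stand-in for carga so that the example runs without network access, carga_segura and the example.org URLs are made up, and the rows are combined with base R rather than map_df.

```r
# Hypothetical stand-in for carga() (an assumption): it "fails" for URLs
# containing "missing" so the example does not need a real web request.
carga_falsa <- function(url) {
  if (grepl("missing", url)) stop("HTTP error 404")
  data.frame(interprete = basename(url), stringsAsFactors = FALSE)
}

# Both branches return the same columns, so the rows can always be bound.
carga_segura <- function(url) {
  tryCatch(
    {
      perfil <- carga_falsa(url)
      data.frame(interprete = perfil$interprete, url = url, ok = TRUE,
                 stringsAsFactors = FALSE)
    },
    error = function(e) {
      data.frame(interprete = NA_character_, url = url, ok = FALSE,
                 stringsAsFactors = FALSE)
    }
  )
}

# Made-up URLs for illustration only.
urls <- c("https://example.org/wiki/Odetta",
          "https://example.org/wiki/missing_page")

# Bind one row per URL, then list the URLs that could not be read.
resultado <- do.call(rbind, lapply(urls, carga_segura))
fallidas <- resultado$url[!resultado$ok]
```

With the real carga in place of carga_falsa, map_df(lista, carga_segura) should bind the rows in the same way. The purrr package also provides possibly(), which wraps a function with a fallback value and may be a shorter alternative to writing the tryCatch by hand.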
Apart from error handling, I think your code is very good for someone just beginning in R. It achieves a lot in a few lines of code and is perfectly readable. Good work!