My R code (see below) generates these errors in some cases:
[1] "2023-08-12 16:47:37.463"
Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host: api.abc.com
Request failed [ERROR]. Retrying in 1.3 seconds...
Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host: api.abc.com
Request failed [ERROR]. Retrying in 1 seconds...
Error in curl::curl_fetch_memory(url, handle = handle):
Could not resolve host: api.abc.com
api.abc.com is not the original API I use. I use a commercial API, and its provider informed me that their server was not down at the particular moment above. In some cases when the server was down, it returned HTTP code 503.
I have two questions:
RETRY in my code with GET. My code below is called every 10 seconds with the scheduler tclTaskSchedule (see end of code). In this example code I have used a free API (universities.hipolabs.com) as an example.
library(httr) # accessing APIs
library(jsonlite) # JSON parsing
library(dplyr)
library(readr)
library(purrr)
library(tidyr)
library(stringr)
library(tibble)
library(tcltk2)
library(lubridate)
run_api_once <- function() {
  mydatalist <- list() # create an empty list
  my_next_page_with_number <- "http://universities.hipolabs.com/search?country=United+States"
  mydata1 <- RETRY("GET", my_next_page_with_number)
  if (mydata1$status_code != 200) {
    print(mydata1$status_code)
    http_responses <<- append(http_responses, paste(mydata1$status_code, Sys.time()))
    has_more_pages <- FALSE
  } else {
    rawdata <- rawToChar(mydata1$content)
    mydata2 <- fromJSON(rawdata, flatten = FALSE, simplifyVector = FALSE)
    mydata <- mydata2
    mydatalist <- c(mydatalist, mydata)
  }
  y <- Sys.time()
  y <- format(y, "%Y-%m-%d %H:%M")
  print(y)
  users <- tibble(user = mydatalist)
  myvar <<- users %>% unnest_wider(user)
  return(myvar)
}
# call function every 10 seconds:
tclTaskSchedule(10000, run_api_once(), id = "run_api_once", redo = TRUE)
# end session:
tclTaskDelete(NULL)
It is probably irrelevant, but for completeness: I stream the content of myvar to a local server on my PC with Plumber. See the code below:
# stream df myvar to local api at port 8405:
library(plumber)
pr("D:/plumber_universities2test.R") %>%
  # pr("C:/plumber_universities2test.R") %>%
  pr_run(port = 8405)
Which calls this script:
library(plumber)
library(dplyr)
#* @param symbol Ticker symbol (just to input something in the function)
#* @get /return
#* @serializer json list(na="string")
universities_data <- function(symbol) {
  data <- myvar
  data
}
Thanks a lot!
To answer your questions:
httr; or you are making a request to an invalid URL. I can't be sure without seeing the actual URL you are making the request to, but I would guess the third option is the most likely. You should check whether you are making a mistake while pasting together a particular URL, for example "google.comsearch" instead of "google.com/search".
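A URL like "google.comsearch" typically appears when a base and a path are glued together without a separator. As a minimal sketch in base R, assuming a hypothetical helper called join_url (it is not part of httr), the join could be made robust like this:

```r
# Hypothetical helper (not part of httr): join a base URL and a path
# with exactly one "/" between them, however they were written.
join_url <- function(base, path) {
  base <- sub("/+$", "", base)  # drop trailing slashes from the base
  path <- sub("^/+", "", path)  # drop leading slashes from the path
  paste0(base, "/", path)
}

join_url("google.com", "search")    # "google.com/search"
join_url("google.com/", "/search")  # "google.com/search"
```

Printing the final URL right before the request is an equally simple way to spot this kind of mistake.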
RETRY is not acting in the way you expect because this is not an HTTP error status returned by the server; your request simply can't be executed. To demonstrate the difference, let's have a look at the behaviour of a simple function that makes a request to a URL that automatically returns an HTTP error and to one that does not exist at all:
library(httr)
test_fun <- function(u) {
  RETRY("GET", u, times = 2)
  print("still running")
}
# response contains error
test_fun("https://httpbin.org/status/429")
#> Request failed [429]. Retrying in 1 seconds...
#> [1] "still running"
# no response since there is no server at `test.coms`
test_fun("test.coms")
#> Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host: test.coms
#> Request failed [ERROR]. Retrying in 1 seconds...
#> Error in curl::curl_fetch_memory(url, handle = handle): Could not resolve host: test.coms
Created on 2023-08-13 with reprex v2.0.2
As you can see, the first example still executes the remaining code of the function, while the second one stops with an error. I would suggest carefully checking why the requests are not getting to the server. If you are certain that there is no better way, you can wrap try around RETRY:
mydata1 <- try(RETRY("GET", my_next_page_with_number))
if (is(mydata1, "try-error")) mydata1 <- list(status_code = 404)
if(mydata1$status_code != 200){
  # your code ...
}
But the behaviour of RETRY is correct in my opinion, as it does not simply ignore what is probably a mistake in the code or in your internet configuration (not a server-side issue).
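If you do decide to swallow the error, tryCatch lets you keep a connection failure distinct from an HTTP status instead of disguising it as a 404. The following is only a sketch: safe_fetch and its fetch argument are hypothetical names, not part of httr. The default fetch calls httr::RETRY as in the question, and the two stub functions stand in for real requests so the example runs offline.

```r
# Hypothetical wrapper (not part of httr). `fetch` is any function that takes
# a URL and either returns an object with a `status_code` or throws an error.
safe_fetch <- function(url,
                       fetch = function(u) httr::RETRY("GET", u, times = 3)) {
  tryCatch(
    {
      resp <- fetch(url)
      list(ok = resp$status_code == 200, status = resp$status_code,
           error = NA_character_, response = resp)
    },
    error = function(e) {
      # Transport-level failure (e.g. DNS lookup): there is no HTTP status.
      list(ok = FALSE, status = NA_integer_,
           error = conditionMessage(e), response = NULL)
    }
  )
}

# Stubs standing in for a real request (assumptions for illustration only).
dns_failure <- function(u) stop("Could not resolve host: ", u)
server_down <- function(u) list(status_code = 503L)

safe_fetch("api.abc.com", fetch = dns_failure)$error   # the curl-style message
safe_fetch("api.abc.com", fetch = server_down)$status  # 503
```

Inside run_api_once you could then log res$error or res$status and return early when res$ok is FALSE, so the scheduled task keeps running and myvar keeps its previous value.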