r json rjsonio

Parse a list of JSON URLs through fromJSON


I have a table of URLs, each pointing to a JSON output. I would like to parse them with fromJSON (or any other JSON parser) to extract data from these outputs and combine the results in a list of lists.

My code is set up as follows:

pages <- list()
for (i in 1:length(urltable))
{
mydata<-fromJSON(urltable[i], flatten=TRUE)
pages[[i]] <- mydata$entries
}

which produces the error:

Error in (function (classes, fdef, mtable) :
unable to find inherited method for function 'fromJSON' for signature '"list", "missing"'

If I test it by pasting a single URL into fromJSON(), it works, so I suppose the problem is that fromJSON cannot read the table?

Does anyone have a suggestion for how to do this?

Addition: urltable is a table with 1 column and 326 rows. The head of the table is:

    url
1     http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aalzum&start=10
2     http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aalzum&start=20
3 http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aardenburg&start=10
4 http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aardenburg&start=20
5 http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aardenburg&start=30
6 http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aardenburg&start=40

Addition 2: a subset of urltable, from dput(subset_urltable):

structure(list(url = c("http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aalzum&start=10","http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aalzum&start=20","http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aardenburg&start=10","http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aardenburg&start=20","http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aardenburg&start=30","http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aardenburg&start=40","http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aardenburg&start=50","http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aardenburg&start=60","http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aardenburg&start=70","http://zoeken.kvk.nl/Jsonsearch.ashx?site=handelsregister&partialfields=&q=Aardenburg&start=80")), row.names = c(NA, -10L), class = "data.frame", .Names = "url")

Solution

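  • fromJSON expects JSON content (a string or a connection). Here it receives urltable[i], which for a data frame is a one-column data frame (that is, a list), hence the "list" in the error signature. You are also trying to retrieve the JSON data and convert it in a single step. Feed the data from your URL to fromJSON instead, with something like this:

    mydata<-fromJSON(url(as.character(urltable$url[i])), flatten=TRUE)
    

    url() opens a connection to the address, and fromJSON reads the JSON data from it.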

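    The complete solution should look like:

    pages <- list()
    # iterate over the rows of the url column, not over the data frame itself
    for (i in 1:length(urltable$url))
    {
    # as.character() guards against the column being stored as a factor
    mydata<-fromJSON(url(as.character(urltable$url[i])), flatten=TRUE)
    pages[[i]] <- mydata$entries
    }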
    

    With the curl package installed, you can skip the explicit url() call and pass the URL string directly. Also, to iterate over all rows of urltable, do not use length(urltable): for a data frame it returns the number of columns, which is 1 here. Use length(urltable$url) instead:

    pages <- list()
    for (i in 1:length(urltable$url))
    {
        mydata<-fromJSON(as.character(urltable$url[i]), flatten=TRUE)
        pages[[i]] <- mydata$entries
    }
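
To see why indexing matters here, the sketch below builds a small stand-in for urltable and compares the two ways of indexing it. The example.invalid addresses are placeholders, and fetch_entries is a hypothetical helper (not part of any package) that wraps the same loop in lapply. Its default parser assumes jsonlite is installed, since flatten=TRUE is a jsonlite argument; any other parser can be passed in instead.

```r
# Toy stand-in for the question's urltable: one column named "url".
# The example.invalid addresses are placeholders, not real endpoints.
urltable <- data.frame(
  url = c("http://example.invalid/a.json",
          "http://example.invalid/b.json",
          "http://example.invalid/c.json"),
  stringsAsFactors = FALSE
)

length(urltable)        # 1: the number of columns
nrow(urltable)          # 3: the number of rows
class(urltable[1])      # "data.frame": a single index keeps the data frame
class(urltable$url[1])  # "character": one URL string, what a parser needs

# Hypothetical helper: apply a parser to each URL and keep its `entries`.
# The default parser assumes jsonlite is installed; it is only looked up
# when the default is actually used.
fetch_entries <- function(urls,
                          parse = function(u) jsonlite::fromJSON(u, flatten = TRUE)) {
  # as.character() also covers a url column stored as a factor
  lapply(as.character(urls), function(u) parse(u)$entries)
}
```

With real URLs, pages <- fetch_entries(urltable$url) would give the same list of lists as the loop above, one element per row.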