R: getting Google Finance JSON data into a data frame


I am trying to get Google Finance JSON data into a data frame. I tried:

    library(jsonlite)
    dat1 <- fromJSON("http://www.google.com/finance/info?q=NSE:%20AAPL,MSFT,TSLA,AMZN,IBM")
    dat1

However, I get an error:

    Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) : parse error: trailing garbage

Thank you for any help.


Solution

  • I could not reproduce your error with fromJSON because of proxy issues on my side, but the following works using httr:

    library(jsonlite)
    library(httr)
    
    # Set your proxy settings if needed
    # set_config(use_proxy(url='hostname', port=port, username="", password=""))
    
    url.name = "http://www.google.com/finance/info?q=NSE:%20AAPL,MSFT,TSLA,AMZN,IBM"
    
    url.get = GET(url.name)
    
    
    # Parsing the content directly as JSON fails with an error similar to yours:
    # url.content = content(url.get, type="application/json")
    # Error in parseJSON(txt) : parse error: trailing garbage
    #          " : "0.57" ,"yld" : "2.46" } ,{ "id": "358464" ,"t" : "MSFT"
    #                     (right here) ------^
    
    # Read the content as text instead
    url.content = content(url.get, as="text")
    
    # Remove any HTML tags
    clean.text = gsub("<.*?>", "", url.content)
    
    # Remove newlines and the "//" marker so that only valid JSON remains
    clean.text = gsub("\\n|\\//", "", clean.text)
    
    DF = fromJSON(clean.text)
    
    head(DF[,1:10],5)
    
    #        id    t      e      l  l_fix  l_cur s        ltt                 lt               lt_dts
    #1    22144 AAPL NASDAQ  92.51  92.51  92.51 1 4:00PM EDT May 11, 4:00PM EDT 2016-05-11T16:00:02Z
    #2   358464 MSFT NASDAQ  51.05  51.05  51.05 1 4:00PM EDT May 11, 4:00PM EDT 2016-05-11T16:00:02Z
    #3 12607212 TSLA NASDAQ 208.96 208.96 208.96 1 4:00PM EDT May 11, 4:00PM EDT 2016-05-11T16:00:02Z
    #4   660463 AMZN NASDAQ 713.23 713.23 713.23 1 4:00PM EDT May 11, 4:00PM EDT 2016-05-11T16:00:02Z
    #5    18241  IBM   NYSE 148.95 148.95 148.95 2 6:59PM EDT May 11, 6:59PM EDT 2016-05-11T18:59:12Z
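
A more targeted variant of the cleaning step can be sketched offline. It assumes the response body is a JSON array preceded by a "//" marker, which is inferred from the gsub call above rather than confirmed. The sample string and the helper name parse_prefixed_json are hypothetical; the values are copied from the printed output. Only the leading marker is removed, so any "//" inside a field value (a URL, for example) is left alone.

```r
library(jsonlite)

# Hypothetical sample shaped like the assumed response: a "//" marker
# followed by a JSON array. Values are taken from the output shown above.
raw.text <- '\n// [\n{ "id": "22144", "t": "AAPL", "e": "NASDAQ", "l": "92.51" }\n,{ "id": "358464", "t": "MSFT", "e": "NASDAQ", "l": "51.05" }\n]\n'

# Hypothetical helper: strip only the leading "//" marker (and the
# whitespace around it), then parse what remains as JSON.
parse_prefixed_json <- function(txt) {
  json.text <- sub("^\\s*//\\s*", "", txt)
  fromJSON(json.text)
}

DF <- parse_prefixed_json(raw.text)

# Every field in the sample is a JSON string, so numeric columns
# have to be converted explicitly before doing arithmetic on them.
DF$l <- as.numeric(DF$l)
str(DF)
```

With the live response, url.content from the answer above would take the place of raw.text, provided the body really has this shape.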