Google Places API and R -- calling 2nd column in a data frame returns six separate columns


I'm trying to store the results of a data frame I retrieved from a list via the Google Places API. My call to the API...

library(googleway)

HAVE_PLACES <- google_places(search_string = "grocery store",
                           location = c(35.4168, -80.5883),
                           radius = 10000, key = key)

...returns a list object HAVE_PLACES:

The third object in this list - results - is a data frame with one observation for each location retrieved in the API call. When I call View(HAVE_PLACES$results), I get what looks like a set of vectors - as I expect when looking at a data frame...

...But it looks like the data frame includes data frames:

WHAT IS GOING ON HERE?

More specifically:

  1. How can a data frame contain data frames, and why does View() show the nested data frames as it would vectors?
  2. When working with data of this type, where you want the columns you're seeing in View() to simply be vectors - for manipulation and exporting purposes - are there any best practices? I'm about to convert each vector of this alleged data frame called geometry into separate objects, and cbind() the results to the HAVE_PLACES$results. But this feels insane.

Solution

  • Akrun is right (as usual!). A data.frame can have lists as 'columns'. This is normal behaviour.
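
    To make that concrete, here is a minimal base R sketch using made-up data (the `places` object and its values are hypothetical, not API output). It assumes nothing beyond base R and shows that a column only needs one entry per row, so it can just as well be a data frame or a list.

    ```r
    # Hypothetical toy data; not real API output.
    places <- data.frame(name = c("Shop A", "Shop B"), stringsAsFactors = FALSE)

    # A column can itself be a data frame, as long as it has one row per observation.
    places$geometry <- data.frame(lat = c(35.1, 35.2), lng = c(-80.1, -80.2))

    # A column can also be a list, holding a different-length vector for each row.
    places$types <- list(c("store", "food"), "store")

    ncol(places)           # counts top-level columns only
    sapply(places, class)  # class of each top-level column
    places$geometry$lat    # reach into the nested data frame with a second `$`
    str(places)
    ```

    The nested values stay reachable with chained `$`, so nothing has to be copied out just to read it.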

    Your question seems to be a more general one about how to extract nested list data in R, using Google's API response as an example. Given you're using googleway (I'm the author of the package), I'm answering it in the context of Google's response. However, there are numerous other answers and examples online about how to work with lists in R.

    Explanation

    You're seeing the nested lists in your results because the data returned from Google's API is actually JSON. The google_places() function 'simplifies' this to a data.frame using jsonlite::fromJSON() internally.

    If you set simplify = FALSE in the function call, you can see the raw JSON output:

    library(googleway)
    
    set_key("GOOGLE_API_KEY")
    
    HAVE_PLACES_JSON <- google_places(search_string = "grocery store",
                                 location = c(35.4168, -80.5883),
                                 radius = 10000, 
                                 simplify = FALSE)
    
    ## run this to view the JSON.
    jsonlite::prettify(paste0(HAVE_PLACES_JSON))
    

    You'll see that the JSON can contain many nested objects. When converted to an R data.frame, nested objects become nested data.frame columns and arrays of varying length become list columns.
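
    As a rough illustration of that conversion, the sketch below parses a small hand-written JSON string with jsonlite::fromJSON(). The JSON is hypothetical and heavily trimmed; it only mimics the general shape of a places response and is not real output from Google.

    ```r
    library(jsonlite)

    # Hypothetical, heavily trimmed JSON in the general shape of a places response.
    json <- '{"results": [
      {"name": "Shop A",
       "geometry": {"location": {"lat": 35.1, "lng": -80.1}},
       "types": ["store", "food"]},
      {"name": "Shop B",
       "geometry": {"location": {"lat": 35.2, "lng": -80.2}},
       "types": ["store"]}
    ]}'

    res <- fromJSON(json)$results

    class(res$geometry)           # nested JSON object -> data.frame column
    class(res$geometry$location)  # object inside an object -> data.frame again
    class(res$types)              # JSON arrays of varying length -> list column
    res$geometry$location$lat     # the plain numeric vector at the bottom
    ```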

    If you're not familiar with JSON it may be worth a bit of research to see what it's all about.


    Extracting Data

    I've written some functions to extract useful pieces of information from the API responses, which may help here:

    locations <- place_location(HAVE_PLACES)
    head(locations)
    #        lat       lng
    # 1 35.38690 -80.55993
    # 2 35.42111 -80.57277
    # 3 35.37006 -80.66360
    # 4 35.39793 -80.60813
    # 5 35.44328 -80.62367
    # 6 35.37034 -80.54748
    
    placenames  <- place_name(HAVE_PLACES)
    head(placenames)
    # "Food Lion" "Food Lion" "Food Lion" "Food Lion" "Food Lion" "Food Lion"
    

    Note, however, that some results are still returned as lists because, in this case, a single place can have many 'types':

    placetypes <- place_type(HAVE_PLACES)
    str(placetypes)
    
    # List of 20
    # $ : chr [1:5] "grocery_or_supermarket" "store" "food" "point_of_interest" ...
    # $ : chr [1:5] "grocery_or_supermarket" "store" "food" "point_of_interest" ...
    # $ : chr [1:5] "grocery_or_supermarket" "store" "food" "point_of_interest" ...
    # $ : chr [1:5] "grocery_or_supermarket" "store" "food" "point_of_interest" ...
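
    If you need one value per place from a list like this, two common options are to keep the first element or to collapse all elements into a single string. The sketch below uses a hypothetical stand-in list rather than a live API result, so the values are made up.

    ```r
    # Hypothetical stand-in for the list returned by place_type().
    placetypes <- list(
      c("grocery_or_supermarket", "store", "food"),
      c("store", "point_of_interest"),
      "food"
    )

    lengths(placetypes)  # how many types each place has

    # Option 1: keep only the first type of each place.
    first_type <- vapply(placetypes, function(x) x[1], character(1))

    # Option 2: keep every type, collapsed into one delimited string per place.
    all_types <- vapply(placetypes, paste, character(1), collapse = "|")
    ```

    Either result is an ordinary character vector with one entry per place, so it can sit in a data.frame column and be written to a CSV.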
    

    Summary

    With Google's API responses you will have to extract the specific data elements you want and construct them into your required object:

    df <- cbind(
      name = place_name(HAVE_PLACES)
      , place_location(HAVE_PLACES)
      ## keep only the 1st 'type' of each place; [[1]] alone would return
      ## every type of the first place and recycle them down the rows
      , type = vapply(place_type(HAVE_PLACES), function(x) x[1], character(1))
    )
    
    head(df, 4)
    
    #        name      lat       lng                   type
    # 1 Food Lion 35.38690 -80.55993 grocery_or_supermarket
    # 2 Food Lion 35.42111 -80.57277 grocery_or_supermarket
    # 3 Food Lion 35.37006 -80.66360 grocery_or_supermarket
    # 4 Food Lion 35.39793 -80.60813 grocery_or_supermarket
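
    A more generic alternative, if you would rather not pick fields one by one, is jsonlite::flatten(), which promotes nested data.frame columns to ordinary top-level columns. This is only a sketch on hypothetical toy data (the `res` object below is made up to mimic the nested shape); list columns are not touched by flatten(), so they are collapsed separately before exporting.

    ```r
    library(jsonlite)

    # Hypothetical toy data in the shape of a nested results table.
    res <- data.frame(name = c("Shop A", "Shop B"), stringsAsFactors = FALSE)
    res$geometry <- data.frame(lat = c(35.1, 35.2), lng = c(-80.1, -80.2))
    res$types <- list(c("store", "food"), "store")

    # flatten() promotes nested data-frame columns to top-level columns.
    flat <- jsonlite::flatten(res)
    names(flat)  # includes "geometry.lat" and "geometry.lng"

    # List columns survive flatten(), so collapse them before exporting.
    flat$types <- vapply(flat$types, paste, character(1), collapse = "|")

    # Every column is now a plain vector, so the table can be written out.
    out <- tempfile(fileext = ".csv")
    write.csv(flat, out, row.names = FALSE)
    ```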