rgoogle-bigquerybigrquery

bigrquery: Create a BigQuery table with geoJSON files doesn't work


I'd like to create a BigQuery table with geoJSON files, despite the geoJSONis an accepted format in BQ (NEWLINE_DELIMITED_JSON) and bq_fields specification, or something coercible to it (like a data frame) the function bq_table_create() of the bigrquery package doesn't work. In my example below the output error is Erro: Unsupported type: list:

library(sf)
library(bigrquery)
library(DBI)
library(googleAuthR)
library(geojsonsf)
library(geojsonR)


# Convert shapefile to geoJSON 
stands_sel <- st_read(
  "D:/Dropbox/Stinkbug_Ml_detection_CMPC/dashboard/v_08_CMPC/sel_stands_CMPC.shp")

# Open as geoJSON
geo <- sf_geojson(stands_sel)

# Convert geoJSON to data frame 
geo_js_df <- as.data.frame(geojson_wkt(geo))
str(geo_js_df)
# 'data.frame': 2 obs. of  17 variables:
#  $ SISTEMA_PR: chr  "MACRO ESTACA - EUCALIPTO" "SEMENTE - EUCALIPTO"
#  $ ESPECIE   : chr  "SALIGNA" "DUNNI"
#  $ ID_UNIQUE : chr  "BARBANEGRA159A" "CAMPOSECO016A"
#  $ CICLO     : num  2 1
#  $ LOCALIDADE: chr  "BARRA DO RIBEIRO" "DOM FELICIANO"
#  $ ROTACAO   : num  1 1
#  $ CARACTER_1: chr  "Produtivo" "Produtivo"
#  $ VLR_AREA  : num  8.53 28.07
#  $ ID_REGIAO : num  11 11
#  $ CD_USO_SOL: num  2433 9053
#  $ DATA_PLANT: chr  "2008/04/15" "2010/04/15"
#  $ ID_PROJETO: chr  "002" "344"
#  $ CARACTERIS: chr  "Plantio Comercial" "Plantio Comercial"
#  $ PROJETO   : chr  "BARBA NEGRA" "CAMPO SECO"
#  $ ESPACAMENT: chr  "3.00 x 2.50" "3.5 x 2.14"
#  $ CD_TALHAO : chr  "159A" "016A"
#  $ geometry  :List of 2
#   ..$ : 'wkt' chr "MULTIPOLYGON (((-51.2142 -30.3517,-51.2143 -30.3518,-51.2143 -30.3518,-51.2143 -30.3519,-51.2143 -30.3519,-51.2"| __truncated__
#   ..$ : 'wkt' chr "MULTIPOLYGON (((-52.3214 -30.4271,-52.3214 -30.4272,-52.3214 -30.4272,-52.3215 -30.4272,-52.3215 -30.4272,-52.3"| __truncated__
#  - attr(*, "wkt_column")= chr "geometry"

# Insert information inside BQ
bq_conn <-  dbConnect(bigquery(),
                       project = "my-project",
                       use_legacy_sql = FALSE
)

# First create the table
players_table = bq_table(project = "my-project", dataset = "stands_ROI_2021", table = "CF_2021")
bq_table_create(x = players_table, fields = as_bq_fields(geo_js_df))
Erro: Unsupported type: list

Solution

  • You can upload data frame with a list-type column on BigQuery by using bq_table_upload() syntax. Try this on your script instead of bq_table_create(),

    bq_table_upload(players_table, geo_js_df)
    

    For your reference, I tried this on my end using this sample data with a list-type column:

    d <- data.frame(id = 1:2,
                       name = c("Jon", "Mark"),
                       children = I(list(c("Mary", "James"),
                                         c("Greta", "Sally")))
                    )
    

    R console:

    enter image description here

    Created BQ table:

    enter image description here

    enter image description here

    EDIT:

    As per this documentation, FeatureCollection is not yet supported in BigQuery, however there is an ongoing feature request you can find here. Workaround is to convert the GeoJson file to BigQuery new-line-delimited JSON before converting it to dataframe.

    To convert GeoJson file to BigQuery new-line-delimited JSON, follow these steps:

    1. Install node.js.
    2. Add packages:
    npm install fs JSONStream line-input-stream yargs
    
    1. Clone the github repository:
    git clone https://github.com/mentin/geoscripts.git
    
    1. Change directory:
    cd geoscripts/geojson2bq/
    
    1. Convert GeoJson file to BigQuery new-line-delimited JSON:
    node geojson2bqjson.js sel_stands.geojson > out.json
    

    Using the new-line-delimited JSON file, convert this to dataframe in the R console, then use bq_table_upload() to upload the data to BigQuery.

    library(bigrquery)
    library(dplyr)
    library(tidyverse)
    library(jsonlite)
    
    out <- stream_in(file('out.json'))
    
    projectid<-"my-project"
    datasetid<-"my-dataset"
    
    bq_conn <-  dbConnect(bigquery(), 
                          project = projectid,
                          dataset = datasetid, 
                          use_legacy_sql = FALSE)
    
    players_table = bq_table(project = "my-project", dataset = "my-dataset", table = "CF_2021_test5")
    
    bq_table_upload(players_table, out)
    
    bq_table_download(players_table)
    

    R console: enter image description here

    BigQuery table:

    enter image description here

    enter image description here