Tags: r, databricks, impala, sparkr

Error when using SparkR insertInto via Databricks


I am trying to insert values from a data frame into a database table (Impala) using SparkR in a Databricks notebook:

require(SparkR)
test_df <- data.frame(row_no = c(2,3,4,5,6,7,8)
                    ,row_dat = c('dat_2','dat_3','dat_4','dat_5','dat_6','dat_7','dat_8')
                    )
test_df <- as.data.frame(test_df)

sparkR.session()
insertInto(test_df,"db_name.table_name",overwrite = false)

I get the error: "unable to find an inherited method for function ‘insertInto’ for signature ‘"data.frame", "character"’"

I have checked the connection to this table and using SparkR::collect I can return the data from it no problem. So why isn't the insert working?


Solution

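  • as.data.frame returns a local R data.frame, but insertInto only has a method for a Spark DataFrame (class SparkDataFrame), which is why the call fails with "unable to find an inherited method". Use as.DataFrame instead: it converts the local data.frame into a SparkDataFrame that insertInto accepts (see doc). Note also that R's logical constant is FALSE, not false. Change the code to: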

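    library(SparkR)
    sparkR.session()  # start (or reuse) the Spark session before creating Spark DataFrames
    
    # Build a local R data.frame first
    test_df <- data.frame(row_no = c(2, 3, 4, 5, 6, 7, 8),
                          row_dat = c('dat_2', 'dat_3', 'dat_4', 'dat_5', 'dat_6', 'dat_7', 'dat_8'))
    # as.DataFrame (capital D and F) converts it to a SparkDataFrame
    test_df <- as.DataFrame(test_df)
    
    insertInto(test_df, "db_name.table_name", overwrite = FALSE)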
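To see why the original call failed, it may help to reproduce the same kind of dispatch error without Spark at all. The sketch below is only an illustration: `FakeSparkDataFrame` and `insertIntoDemo` are hypothetical stand-ins invented here, not part of SparkR. It assumes only that insertInto behaves like an S4 generic whose method is registered for a Spark-specific class, so a plain data.frame has no matching method.

```r
library(methods)

# Hypothetical stand-in for SparkR's SparkDataFrame class (not the real one).
setClass("FakeSparkDataFrame", representation(rows = "data.frame"))

# Hypothetical generic that mimics the shape of insertInto(x, tableName, ...).
setGeneric("insertIntoDemo",
           function(x, tableName, ...) standardGeneric("insertIntoDemo"))

# A method is registered only for the Spark-like class, not for data.frame.
setMethod("insertIntoDemo", signature("FakeSparkDataFrame", "character"),
          function(x, tableName, ...) {
            sprintf("would insert %d rows into %s", nrow(x@rows), tableName)
          })

local_df <- data.frame(row_no = c(2, 3), row_dat = c("dat_2", "dat_3"))

# Calling the generic on a plain data.frame fails during method dispatch,
# which is the same kind of error reported in the question.
err <- tryCatch(insertIntoDemo(local_df, "db_name.table_name"),
                error = function(e) conditionMessage(e))
print(err)

# Wrapping the data in the class the method expects lets dispatch succeed.
wrapped <- new("FakeSparkDataFrame", rows = local_df)
msg <- insertIntoDemo(wrapped, "db_name.table_name")
print(msg)
```

In the real code, as.DataFrame plays the role of the wrapping step: it turns the local data.frame into the class that insertInto has a method for. Running class(test_df) before the insert is a quick way to check which kind of object you are holding.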