I am running the code below in Databricks to save a table using sparklyr:
library(sparklyr)
library(dplyr)
sc <- sparklyr::spark_connect(method = "databricks")
dat <- sparklyr::spark_read_table(sc, "products.output")
dat <- dat %>% dplyr::mutate(x = as.character(x), y = as.character(y))
Then I drop the existing table in a separate SQL cell and try to write the new version back:

%sql
drop table products.output

sparklyr::spark_write_table(x = dat, name = "products.output")
This fails with:

org.apache.spark.sql.AnalysisException:
The schema of your Delta table has changed in an incompatible way since your DataFrame or
DeltaTable object was created. Please redefine your DataFrame or DeltaTable object
Is there any way I can overwrite the schema?
Use the same approach as the answer to this question. Following the docs for sparklyr::spark_write_table, add the argument options = list(overwriteSchema = "true") along with mode = "overwrite". This Databricks doc may help: https://docs.databricks.com/en/delta/update-schema.html#explicitly-update-schema-to-change-column-type-or-name
sparklyr::spark_write_table(x = dat, name = "products.output",
mode = "overwrite",
options = list(overwriteSchema = "true"))
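With overwriteSchema enabled you also shouldn't need the separate %sql drop step: mode = "overwrite" replaces both the data and the schema of the Delta table in place, and dropping the source table out from under the lazily evaluated dat is likely what triggered the AnalysisException in the first place. As a quick sanity check (a sketch, assuming the same connection sc from the question), you can read the table back and inspect its schema; x and y should now show as string columns:

# Re-read the rewritten table and print its column names and types
dat2 <- sparklyr::spark_read_table(sc, "products.output")
sparklyr::sdf_schema(dat2)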