I create a Spark DataFrame from a CSV file and try to insert it into an RDBMS table that has integer and varchar columns. Since my DataFrame columns are all string type, the insert fails in "append" mode. If I use "overwrite" mode, the RDBMS table is recreated with all varchar columns. How can I insert data into the RDBMS table in append mode while handling the datatypes in the Spark DataFrame?
While reading the CSV you can either let Spark infer the schema (inferSchema) or specify the schema programmatically.
val diamonds = spark.read.format("csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv")
import org.apache.spark.sql.types._

val customSchema = new StructType()
  .add("_c0", IntegerType, true)
  .add("carat", DoubleType, true)
  .add("cut", StringType, true)
  .add("color", StringType, true)
  .add("clarity", StringType, true)
  .add("depth", DoubleType, true)
  .add("table", DoubleType, true)
  .add("price", IntegerType, true)
  .add("x", DoubleType, true)
  .add("y", DoubleType, true)
  .add("z", DoubleType, true)

val diamonds_with_schema = spark.read.format("csv")
  .option("header", "true")
  .schema(customSchema)
  .load("/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv")
Once you have a DataFrame with the required schema, you can append it to the existing table.
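As a sketch, the JDBC append could look like the following. The URL, table name, and credentials here are placeholders, not values from your setup; replace them with your own:

val jdbcUrl = "jdbc:postgresql://dbhost:5432/mydb"  // hypothetical connection URL

diamonds_with_schema.write
  .format("jdbc")
  .option("url", jdbcUrl)
  .option("dbtable", "diamonds")   // existing table with integer/varchar columns
  .option("user", "dbuser")
  .option("password", "dbpassword")
  .mode("append")                  // insert rows without recreating the table
  .save()

Because the DataFrame columns are now properly typed (IntegerType, DoubleType, StringType), append mode inserts into the table's existing columns instead of recreating them all as varchar, which is what happens with overwrite.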
Please check: https://docs.databricks.com/data/data-sources/read-csv.html