azure scala apache-spark databricks

Spark SQL write.parquet overwrite issue


I am running a Spark Scala JAR application on Databricks runtime version 13.3 LTS (Scala 2.12, Spark 3.4.1), and in my application I have a line like below

incomingDF.write
  .mode("overwrite")
  .option("overwriteSchema", "true")
  .partitionBy("partition_date")
  .parquet(tablePath)

This line of code does not achieve its purpose when run as a JAR file: it does not throw an error and finishes successfully, but it does not update the target tablePath.

But whenever I run exactly the same code inside a Databricks notebook, it runs successfully and updates the tablePath. I cannot find the solution. Can you please help me with this?


Solution

  • After talking with Databricks support, I learned that there is a configuration parameter called spark.sql.sources.partitionOverwriteMode, which takes two values: dynamic and static. If you set it to dynamic, only the partitions that have new data in your incoming DataFrame are overwritten. If you set it to static, the whole folder under the table path is overwritten. See the sketch below.
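
A minimal sketch of how this setting could be applied before the write, assuming spark is the active SparkSession and incomingDF / tablePath are the same as in the question:

// Overwrite only the partitions present in incomingDF instead of the whole tablePath.
// Assumes `spark` is the active SparkSession; `incomingDF` and `tablePath` are as in the question.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

incomingDF.write
  .mode("overwrite")
  .partitionBy("partition_date")
  .parquet(tablePath)

Setting this explicitly in the application removes the dependence on whatever default the cluster or notebook session happens to use.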