Tags: python, pyspark, databricks, delta-lake

Truncate a Delta table in Databricks using Python


The Delta Lake documentation describes the delete operation for both Python and SQL, and truncate for SQL. But I cannot find any documentation for truncating a table from Python.

How can I truncate a Delta table in Databricks using Python?


Solution

  • Not every operation is exposed as a function in the Python or Java/Scala APIs. Some operations are SQL-only, OPTIMIZE for example. If you want to truncate a table, you have two choices:

    1. Run the SQL statement through spark.sql:
    spark.sql("TRUNCATE TABLE <name>")
    

    or

    spark.sql("TRUNCATE TABLE delta.`<path>`")
    
    2. Emulate truncate by reading the table and writing an empty DataFrame back in overwrite mode:
    df = spark.read.format("delta").load("<path>")
    df.limit(0).write.mode("overwrite").format("delta").save("<path>")
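
To make the SQL route easier to call from Python, the two statement forms above can be wrapped in a small helper. This is only a sketch: the function names `build_truncate_sql` and `truncate_delta_table` are hypothetical, as are the example table name and path in the usage comments, and it assumes you already have a SparkSession available as `spark` (as you do in a Databricks notebook).

```python
def build_truncate_sql(table_name=None, path=None):
    """Return a TRUNCATE TABLE statement for a table name or a Delta path.

    Hypothetical helper: exactly one of the two arguments must be given.
    """
    if (table_name is None) == (path is None):
        raise ValueError("pass exactly one of table_name or path")
    if table_name is not None:
        return f"TRUNCATE TABLE {table_name}"
    # Path-based Delta tables are addressed as delta.`<path>` in Spark SQL.
    return f"TRUNCATE TABLE delta.`{path}`"


def truncate_delta_table(spark, table_name=None, path=None):
    """Run the generated statement through an existing SparkSession."""
    spark.sql(build_truncate_sql(table_name=table_name, path=path))


# Usage (placeholder names, not taken from the question):
# truncate_delta_table(spark, table_name="db.events")
# truncate_delta_table(spark, path="/mnt/data/events")
```

Keeping the string construction separate from the `spark.sql` call means the generated statement can be inspected or logged before anything is executed.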