I need to delete certain data from a Delta Lake table before I load it. The delete works when the Delta table exists, but it fails when the table does not exist.
Databricks Scala code below:
import io.delta.tables.DeltaTable

// create delete statement
val del_ID = "Check_ID = " + "123"
// get delta table from path where data exists
val deltaTable = DeltaTable.forPath(spark, path)
// delete data from delta table
deltaTable.delete(del_ID)
The above code works only if Delta data exists at that path; otherwise it fails.
Can someone share an approach where the delete statement is executed if the Delta data exists and is skipped otherwise?
According to the DeltaTable Javadoc, you can check whether there is a Delta table at the specified path with the following call:
DeltaTable.isDeltaTable(spark, "path/to/table")
If the path does not contain a Delta table, or does not exist at all, it returns false. So your code would be:
val del_ID = "Check_ID = " + "123"
if (DeltaTable.isDeltaTable(spark, path)) {
  DeltaTable.forPath(spark, path).delete(del_ID)
}
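If you need this check in more than one place, the same guard can be wrapped in a small helper. The following is only a sketch: the function name deleteIfDeltaTableExists and its Boolean return value are my own additions for illustration, not part of the Delta Lake API. It uses only DeltaTable.isDeltaTable, DeltaTable.forPath and delete, as above.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

// Hypothetical helper (not part of Delta Lake): runs the delete only when
// `path` holds a Delta table. Returns true if the delete was executed and
// false if it was skipped because no Delta table was found at `path`.
def deleteIfDeltaTableExists(spark: SparkSession, path: String, condition: String): Boolean = {
  if (DeltaTable.isDeltaTable(spark, path)) {
    DeltaTable.forPath(spark, path).delete(condition)
    true
  } else {
    false
  }
}
```

You would then call it as deleteIfDeltaTableExists(spark, path, "Check_ID = 123") before the load, and the returned flag tells you whether anything was actually attempted.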