apache-spark, databricks, azure-databricks, azure-synapse, delta-lake

What is the maximum number of days for which we can keep versions in a Delta table?


What is the maximum number of days for which we can keep versions in a Delta table?

I know that by default a Delta table keeps 7 days of versions. However, my team wants to keep the entire version history of a Delta table (something like versions going back 999 years).

I understand there would be storage cost and performance considerations, but we are ready to bear the required costs. It's a small Delta table (with 1 million records in the current version).

I tried to find this information in the Databricks documentation, but I couldn't find anything relevant.

I just want to know if it is possible to have 999 years of version history in a Delta table.


Solution

  • You can configure the retention period for Delta table versions using the delta.logRetentionDuration configuration property.

    As per the Retrieve Delta table history documentation, table history retention is 30 days by default.

    You can use the history command to retrieve information about each write operation to a Delta table, including details such as the operations performed, the user who made the changes, and the timestamp of each operation.
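
    As a quick illustration of the history command, the sketch below assumes a hypothetical table named my_table (not from the question) and lists its recorded versions:

    ```sql
    -- my_table is a hypothetical table name; replace it with your own.
    -- Returns one row per commit, including version, timestamp, userName and operation.
    DESCRIBE HISTORY my_table;

    -- Only the 5 most recent commits.
    DESCRIBE HISTORY my_table LIMIT 5;
    ```

    The oldest version listed here is the furthest back the transaction log still reaches, so it is a simple way to check how much history is currently retained.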

    As you mentioned, you want the versions to last 999 years.

    I tried the following approach, which sets the delta.deletedFileRetentionDuration property (how long data files removed from the table are kept) to a value in days; 365000 days is roughly 999 years:

    %sql
    set delta.deletedFileRetentionDuration = "365000"
    

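
    Time travel needs both the transaction log entries and the underlying data files, so a table-level setting would typically cover delta.logRetentionDuration and delta.deletedFileRetentionDuration together. The following is a minimal sketch, assuming a hypothetical table named my_table. Whether a given runtime accepts an interval this large is an assumption worth verifying on your own cluster.

    ```sql
    -- my_table is a hypothetical table name; replace it with your own.
    -- 365000 days is roughly 999 years (999 * 365.25 is about 364,885 days).
    ALTER TABLE my_table SET TBLPROPERTIES (
      'delta.logRetentionDuration' = 'interval 365000 days',
      'delta.deletedFileRetentionDuration' = 'interval 365000 days'
    );

    -- Confirm that both properties were recorded on the table.
    SHOW TBLPROPERTIES my_table;
    ```

    Data files for old versions are only physically removed when VACUUM runs, so keeping long-term history also depends on never running VACUUM with a shorter retention window.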