I have been using the following code with the Databricks Time Travel feature to determine the latest version of a table, and it has worked for the past few years without any issues. I recently added a new row to the table, but now I'm getting the error:
AnalysisException: Cannot time travel Delta table to version 1. Available versions: [3, 23].
It's very strange that this has only started happening now.
The code is as follows:
from delta.tables import DeltaTable
from pyspark.sql.functions import col, max

dt = DeltaTable.forPath(spark, saveloc)
latest_version = int(dt.history().select(max(col("version"))).collect()[0][0])
latest_table_dropped = spark.read.format("delta").option("versionAsOf", latest_version).load(saveloc).createOrReplaceTempView('maxversion')
start_table_dropped = spark.read.format("delta").option("versionAsOf", 1).load(saveloc).createOrReplaceTempView('allprior')
I appreciate that Databricks has determined that the earliest available version is now 3, but I don't understand why it's no longer possible to read version 1.
My dt.history() output is as follows:
If predictive optimization is enabled on your account, then VACUUM operations could be happening automatically. VACUUM removes the data files backing older table versions, which is why time travel to those versions now fails. You can check the system table to see which vacuums are occurring due to predictive optimization:
SELECT *
FROM system.storage.predictive_optimization_operations_history
WHERE operation_type = "VACUUM"
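If older versions have indeed been vacuumed away, one way to make the load more robust is to stop hardcoding version 1 and instead read the earliest version that still appears in the table's history. This is a minimal sketch, assuming the same `spark` session and `saveloc` path as in the question; note that history entries can outlive the vacuumed data files, so the oldest listed version is not guaranteed to still be readable and the read may need a try/except:

```python
def version_bounds(versions):
    # Pure helper: earliest and latest version numbers from any iterable.
    vs = sorted(int(v) for v in versions)
    return vs[0], vs[-1]

def read_earliest_available(spark, saveloc):
    # Requires a SparkSession with Delta Lake support (e.g. on Databricks),
    # so the imports are kept local to this function.
    from delta.tables import DeltaTable
    from pyspark.sql.functions import col

    dt = DeltaTable.forPath(spark, saveloc)
    history_versions = [row["version"]
                        for row in dt.history().select(col("version")).collect()]
    earliest, latest = version_bounds(history_versions)

    # Read the earliest version listed in the history instead of a
    # hardcoded 1; this can still raise if that version was vacuumed.
    return (spark.read.format("delta")
                 .option("versionAsOf", earliest)
                 .load(saveloc))
```

For the error in the question, `version_bounds` over the available versions [3, 23] would give 3 as the earliest readable candidate instead of the hardcoded 1.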