I am trying to create a data asset on ADLS Gen2 and read a Delta table stored in an ADLS Gen2 folder that looks something like this:
/
└── my-data
    ├── _delta_log
    ├── part-0000-xxx.parquet
    └── part-0001-xxx.parquet
Currently, I create the data asset with the file dataset type from the ML v1 APIs, but when I read the table it returns all the rows (even the deleted ones) rather than the most recent version.
I have also tried all the other data asset types in Azure ML v1/v2. Ideally, I want to read the most recent version of the Delta table and also have the option to select a different version.
No success. How can I resolve this?
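For the code below to work, you need to create an MLTable data asset with the correct folder path. Reading the folder as plain files loads every Parquet part file and ignores the transaction log in _delta_log, which is why the deleted rows show up; mltable.from_delta_lake reads the table through that log instead.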
import time

import mltable
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Current UTC time in ISO 8601 format; reading "as of now" gives the latest table state
current_timestamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())

# Connect to the workspace described by the local config.json
ml_client = MLClient.from_config(credential=DefaultAzureCredential())
data_asset = ml_client.data.get("<enter your ml table name>", version="1")

# Load the Delta table through its transaction log, not as raw Parquet files
tbl = mltable.from_delta_lake(delta_table_uri=data_asset.path,
                              timestamp_as_of=current_timestamp)
df = tbl.to_pandas_dataframe()
df
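Since you also want the option to change the version, here is a minimal sketch of how that could look. It relies on the version_as_of parameter of mltable.from_delta_lake as an alternative to timestamp_as_of. The helper names delta_read_options and read_delta are hypothetical, made up for this illustration, and rejecting the case where both a version and a timestamp are passed is a conservative assumption on my part rather than documented behaviour.

```python
import time
from typing import Any, Dict, Optional


def delta_read_options(version: Optional[int] = None,
                       timestamp: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for mltable.from_delta_lake (hypothetical helper)."""
    # Assumption: treat version and timestamp as mutually exclusive.
    if version is not None and timestamp is not None:
        raise ValueError("pass either version or timestamp, not both")
    if version is not None:
        return {"version_as_of": version}
    if timestamp is None:
        # No pin requested: read the table as of the current UTC time.
        timestamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
    return {"timestamp_as_of": timestamp}


def read_delta(delta_table_uri: str,
               version: Optional[int] = None,
               timestamp: Optional[str] = None):
    """Load a Delta table into a pandas DataFrame (hypothetical helper)."""
    import mltable  # imported lazily so the helper above works without the package

    options = delta_read_options(version, timestamp)
    tbl = mltable.from_delta_lake(delta_table_uri=delta_table_uri, **options)
    return tbl.to_pandas_dataframe()


# Usage, with data_asset obtained as in the code above:
#   latest = read_delta(data_asset.path)             # most recent state
#   older = read_delta(data_asset.path, version=3)   # a specific Delta version
```

The version number passed here is the Delta table's own commit version (from _delta_log), not the version of the Azure ML data asset.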