azure-container-registry azureml-python-sdk

How to download a data asset from Azure registry


As the title says.

I can use azure.ai.ml.MLClient to connect to an Azure registry. The registry contains a data asset (container) named train, which holds a few JSONL files.

How can I download all the files from this container to my local machine using the Azure Python SDK?

Thank you very much!


Solution

  • I have a data asset in my workspace; you can download it with the code below.

    from azure.ai.ml import MLClient
    # Private (underscore-prefixed) SDK module, so it may change between releases.
    import azure.ai.ml._artifacts._artifact_utilities as artifact_utils
    from azure.identity import InteractiveBrowserCredential

    subscription_id = "your_subscription_id"
    resource_group = "resource_group_name"
    workspace = "workspace_name"

    # Connect to the workspace; a browser window opens for sign-in.
    ml_client = MLClient(
        InteractiveBrowserCredential(), subscription_id, resource_group, workspace
    )

    # Look up the data asset, then download everything under its path to ./mlasset/.
    data_info = ml_client.data.get(name="train", version="1")
    artifact_utils.download_artifact_from_aml_uri(
        uri=data_info.path,
        destination="./mlasset/",
        datastore_operation=ml_client.datastores,
    )
    

    After it runs, the files are in the local ./mlasset/ folder.
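To check what actually landed on disk, a minimal sketch along these lines could walk the destination folder and parse each JSONL file. It assumes the files were written under ./mlasset/ (the destination used above) and that every non-blank line is one JSON object. `load_jsonl_dir` is a hypothetical helper name, not part of the Azure SDK.

```python
import json
from pathlib import Path


def load_jsonl_dir(root):
    """Return {relative file path: list of parsed records} for every .jsonl file under root."""
    records = {}
    for path in sorted(Path(root).rglob("*.jsonl")):
        with path.open(encoding="utf-8") as fh:
            # Skip blank lines; every other line is assumed to be one JSON object.
            records[path.relative_to(root).as_posix()] = [
                json.loads(line) for line in fh if line.strip()
            ]
    return records


if __name__ == "__main__":
    # "./mlasset" is the download destination from the answer above.
    for name, rows in load_jsonl_dir("./mlasset").items():
        print(f"{name}: {len(rows)} records")
```

An empty result usually means the download went to a different folder or the asset holds files with another extension.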

    You can also list the data assets in your workspace.

    
    for asset in ml_client.data.list():
        print(f"Name: {asset.name}")
        print(f"Path: {asset.path}")
        print(f"Version: {asset.latest_version}")
    

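The listing can also help pick the version to pass to `ml_client.data.get` instead of hard-coding "1". The sketch below is one possible way to do that: it accepts any iterable of objects with `name` and `latest_version` attributes, which is what the loop above reads from `ml_client.data.list()`. `latest_versions` is a hypothetical helper, and the stand-in objects and their values are made up for illustration.

```python
from types import SimpleNamespace


def latest_versions(assets):
    """Map each asset name to its latest version, skipping entries without one."""
    return {
        a.name: a.latest_version
        for a in assets
        if getattr(a, "latest_version", None) is not None
    }


if __name__ == "__main__":
    # Stand-in objects with made-up values; with the SDK this would be
    # latest_versions(ml_client.data.list()).
    fake_assets = [
        SimpleNamespace(name="train", latest_version="1"),
        SimpleNamespace(name="test", latest_version="3"),
    ]
    print(latest_versions(fake_assets))
```

With a real client, the looked-up value for "train" could then be passed as the `version` argument of `ml_client.data.get`.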