I would like to parse a text file in my Blob Storage container, but I get an error message: permission denied when accessing the stream.
I guess I have to use an access key for my storage account, but how do I do this in the code?
My program works when I use a public blob container from Microsoft Learn.
```python
# Auth
from azure.identity import DefaultAzureCredential
from azure.ai.ml.entities import AmlCompute
from azure.ai.ml import UserIdentityConfiguration
from azure.ai.ml import MLClient, command, Input
from azure.ai.ml.constants import AssetTypes, InputOutputModes
import pandas as pd
from azure.storage.blob import BlobServiceClient

ml_client = MLClient.from_config(credential=DefaultAzureCredential())

gpu_compute_target = "gpu-cluster"
curated_env_name = "TensorFlow_Train_Env:2"
blob_csv_path = "wasbs://mycontainername@mystorageaccount.blob.core.windows.net/read1.txt"

job = command(
    inputs={
        "csv_file": Input(type=AssetTypes.URI_FILE, path=blob_csv_path, mode=InputOutputModes.RO_MOUNT),
    },
    compute=gpu_compute_target,
    environment=curated_env_name,
    code="./src/",
    command="python test2.py --data-file ${{inputs.csv_file}}",
    experiment_name="tf-test-expname",
    display_name="tensorflow-test_displayname",
)

ml_client.jobs.create_or_update(job)
```
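If you just need to read the file from plain Python (outside an AzureML job), a minimal sketch using the storage account's access key; the account name, key, container, and blob name below are all placeholders for your own values:

```python
# Placeholders: replace with your own storage account name and access key
# (found under Storage account > Access keys in the Azure portal).
account_name = "mystorageaccount"
account_key = "<your-access-key>"
account_url = f"https://{account_name}.blob.core.windows.net"


def read_blob_text(container: str, blob_name: str) -> str:
    """Download a blob and return its contents as text."""
    # Imported here so the snippet also loads without azure-storage-blob installed.
    from azure.storob.blob import BlobServiceClient  # noqa: this line is replaced below

    service = BlobServiceClient(account_url=account_url, credential=account_key)
    blob = service.get_blob_client(container=container, blob=blob_name)
    return blob.download_blob().readall().decode("utf-8")


# text = read_blob_text("mycontainername", "read1.txt")
```

For the AzureML job itself, note that the script already imports `UserIdentityConfiguration` but never uses it; passing `identity=UserIdentityConfiguration()` to `command(...)` runs the job under your own identity, which works if your account has the Storage Blob Data Reader role on the storage account.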
To read a CSV from Blob Storage, you can also give a path using either the https:// protocol or the azureml:// protocol.
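For illustration, assuming the workspace's default datastore `workspaceblobstore`, the two styles of path look like this (account, container, and file names are placeholders):

```python
# Direct https:// path to the blob (storage account and container are placeholders):
https_path = "https://mystorageaccount.blob.core.windows.net/mycontainername/read1.txt"

# azureml:// path through a registered datastore (short, workspace-relative form):
azureml_path = "azureml://datastores/workspaceblobstore/paths/read1.txt"
```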
First, create a datastore backed by the storage account where your CSV is located: go to Data > Datastores > Create.
Give the datastore a name, provide the access key, and click Create.
After creating it, browse to your CSV file and click Copy URI; you will be offered two kinds of URIs.
Select either one and pass that path to your job. The datastore can then be used for any data in your blob storage, with no need to configure credentials every time.
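As a sketch, the workspace-relative azureml:// URI follows a fixed pattern, so you can also build it in code; the datastore and file names below are hypothetical:

```python
def datastore_uri(datastore: str, relative_path: str) -> str:
    """Build the workspace-relative azureml:// URI for a file in a datastore."""
    return f"azureml://datastores/{datastore}/paths/{relative_path}"


# Placeholder names; pass the result as the path of a URI_FILE Input,
# in place of the wasbs:// URL used in the question.
blob_csv_path = datastore_uri("my_datastore", "read1.txt")
```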