I am trying to run a command job that reads in a Data Asset and performs some preprocessing for further ML tasks.
To read the data in the .py file, I used the code from the Consume section, as below:
def get_data():
    ml_client = MLClient.from_config(credential=DefaultAzureCredential())
    data_asset = ml_client.data.get("--data--", version="2")
    tbl = mltable.load(f'azureml:/{data_asset.id}')
    return tbl.to_pandas_dataframe()
For the above code to work, I imported mltable as below:
import mltable
When I run the above code in a cell of my .ipynb notebook, it works fine.
But when I try to execute the same .py file using a command job:
from azure.ai.ml import command

job = command(
    code="./folder",
    command="python --script--.py",
    environment="AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest",
    compute="--computename--",
    display_name="....",
    experiment_name="....",
)
and when I execute it, I am getting:
No module named 'mltable'
I am working with Python 3.10 and SDK v2.
Can someone please help me with this?
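Yes, the job runs inside the environment you give it, not inside your notebook kernel, and the curated environment AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest does not have mltable installed. That is why the import works in the notebook but fails in the job.
Instead of passing that curated environment, create a custom environment from a conda file, something like below.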
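from azure.ai.ml.entities import Environment

# Base Docker image plus a conda file listing the packages the script needs
env_docker_conda = Environment(
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
    conda_file="conda-yamls/pydata.yml",
    name="docker-image-plus-conda-example",
    description="Environment created from a Docker image plus Conda environment.",
)
# Register the environment in the workspace (ml_client is your MLClient)
ml_client.environments.create_or_update(env_docker_conda)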
In the .yml file, specify the packages you need (including mltable), and then pass env_docker_conda to your command job.
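As a rough illustration of what that conda file could contain, here is a minimal sketch that writes one to the path used above. The environment name, channel, Python version, and package list are assumptions chosen to match the question (Python 3.10, mltable, SDK v2), and write_conda_file is a hypothetical helper, not part of the Azure ML SDK. Adjust the packages to whatever your script actually imports.

```python
from pathlib import Path

# Assumed conda spec: the name, channel, and package list are illustrative only.
CONDA_YAML = """\
name: pydata-example
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip
  - pip:
      - mltable
      - azure-ai-ml
      - azure-identity
      - pandas
"""


def write_conda_file(path="conda-yamls/pydata.yml"):
    """Write the conda spec to `path`, creating parent folders if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(CONDA_YAML)
    return target
```

Calling write_conda_file() once before creating the Environment gives conda_file="conda-yamls/pydata.yml" something to point at.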
Refer to these GitHub samples for more about creating environments.
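To tie this back to the job in the question, here is a hedged sketch of passing the custom environment to command(). The helper name build_preprocessing_job and its parameters are hypothetical. The code folder and the placeholder script and compute names come from the question, and the sketch assumes env_docker_conda and ml_client already exist as shown above.

```python
def build_preprocessing_job(environment, compute_name, script_name):
    """Return a command job that runs `script_name` inside `environment`.

    `environment` can be an Environment object such as env_docker_conda,
    or a registered environment reference given as a string.
    """
    # Imported lazily so this helper can be defined without the SDK installed.
    from azure.ai.ml import command

    return command(
        code="./folder",
        command=f"python {script_name}",
        environment=environment,
        compute=compute_name,
    )


# Usage (not executed here, since it needs a workspace and compute target):
# job = build_preprocessing_job(env_docker_conda, "--computename--", "--script--.py")
# returned_job = ml_client.jobs.create_or_update(job)
```

Once the environment has been registered, it should also be possible to refer to it by name, for example "docker-image-plus-conda-example@latest", instead of passing the object.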