python azure azure-machine-learning-service azureml-python-sdk

Getting "ModuleNotFoundError: No module named 'mltable'" while executing a command job in Azure ML SDK v2


I am trying to run a command job that reads in a Data Asset and performs some preprocessing for further ML tasks.

To read the data in the .py file, I used the code from the data asset's Consume section, as below:

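from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

def get_data():
    # Connect to the workspace described by the local config file
    ml_client = MLClient.from_config(credential=DefaultAzureCredential())
    data_asset = ml_client.data.get("--data--", version="2")

    # Load the data asset as an MLTable and materialize it as a DataFrame
    tbl = mltable.load(f'azureml:/{data_asset.id}')

    return tbl.to_pandas_dataframe()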

For the above code to work, I import mltable as follows:

import mltable

When I run the above code in a cell of my .ipynb notebook, it works fine.

But when I try to execute the same .py file as a command job:

from azure.ai.ml import command

# configure job

job = command(
    code="./folder",
    command="python --script--.py",
    environment="AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest",
    compute="--computename--",
    display_name="....",
    experiment_name="...."
    )

and when I execute it, I am getting:

No module named 'mltable'

I am working on Python 3.10 - SDK v2

Can someone please help me with this?


Solution

  • Yes, the error suggests that the curated environment AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest does not include the mltable package, so you need to supply your own environment instead of using it.

    Create a custom environment from a conda file, something like below.

    from azure.ai.ml.entities import Environment

    env_docker_conda = Environment(
        image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
        conda_file="conda-yamls/pydata.yml",
        name="docker-image-plus-conda-example",
        description="Environment created from a Docker image plus Conda environment.",
    )
    ml_client.environments.create_or_update(env_docker_conda)
    

    In the .yml file, list the packages you need (including mltable), and then pass env_docker_conda as the environment of your command job.

    Refer to the GitHub samples for more about creating environments.
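
To make the steps above concrete, here is a minimal sketch of what the conda file and the updated job could look like. The package list, the environment name mltable-env, the conda-forge channel, and the Python version are assumptions for illustration, not taken from the question; adjust them to what your script actually imports. The helper names write_conda_file and build_job are hypothetical.

```python
from pathlib import Path

# Hypothetical conda spec. The package list and the Python version are
# assumptions; the key point is that mltable is listed under pip.
CONDA_YAML = """\
name: mltable-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip
  - pip:
      - mltable
      - azure-ai-ml
      - azure-identity
      - pandas
"""


def write_conda_file(path="conda-yamls/pydata.yml"):
    """Write the conda spec to disk and return its path."""
    conda_path = Path(path)
    conda_path.parent.mkdir(parents=True, exist_ok=True)
    conda_path.write_text(CONDA_YAML)
    return conda_path


def build_job(conda_path):
    """Build a command job that runs in a custom environment with mltable."""
    # Imported here so the conda file can be generated without the SDK installed.
    from azure.ai.ml import command
    from azure.ai.ml.entities import Environment

    env = Environment(
        image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
        conda_file=str(conda_path),
        name="docker-image-plus-conda-example",
    )
    # Same job as in the question, but with the custom environment
    # passed in place of the curated environment string.
    return command(
        code="./folder",
        command="python --script--.py",
        environment=env,
        compute="--computename--",
    )


conda_path = write_conda_file()
```

With an authenticated ml_client, the job could then be submitted with ml_client.jobs.create_or_update(build_job(conda_path)). Passing the Environment object inline lets the job build the environment on submission; registering it first with ml_client.environments.create_or_update, as shown in the answer, works as well.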