azure, azure-machine-learning-service, azureml-python-sdk

AzureML command job: 'Your file exceeds 100 MB.'


I am trying to submit a job to AzureML using the Python SDK v2, developing locally in VS Code. As my repository has grown over time, I now get the following message when submitting the job:

Your file exceeds 100 MB. If you experience low speeds, latency, or broken connections, we recommend using the AzCopyv10 tool for this file transfer.

A similar question was posted a few months ago, but it received no responses.

I added an .amlignore file, but it didn't help much.

I couldn't find anything on how to use azcopy to submit the job to AzureML. My code is hosted in an Azure DevOps repository, in case that helps with finding a workaround.

Any help is highly appreciated.


Solution

  • The warning appears only the first time you submit the job; you won't see it again until you change your source code, because the code has already been uploaded by the previous submission.

    To avoid the upload, you can copy the code manually with azcopy to the storage account associated with your Machine Learning workspace, and then point the job at that location when submitting it.

    Here are the steps:

    Log in using azcopy:

    azcopy login

    Alternatively, use a SAS URL as the destination path. See the AzCopy authentication documentation for details.

    I have used a SAS URL with read, write, list, and create permissions.

    Here is the command to copy:

    azcopy copy '<source_code_path>/*' 'https://<storage_account>.blob.core.windows.net/<container_name>/code?<sas_token>' --recursive
    

    Use the storage account associated with your ML workspace, together with the SAS token, in the destination URL. Pass the same storage path as the code argument when submitting the job.
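Because the azcopy destination and the job's code argument must point at the same blob folder, it can help to build both from one set of values. The following is a minimal sketch under stated assumptions: `build_code_urls` is a hypothetical helper (not part of azcopy or the AzureML SDK), the folder name `code` mirrors the example above, and the SAS token is assumed to be passed as a query string.

```python
def build_code_urls(storage_account, container, prefix="code", sas_token=""):
    """Return (azcopy_destination, job_code_url) for the same blob folder.

    Hypothetical helper for illustration; it only formats strings.
    """
    base = (
        f"https://{storage_account}.blob.core.windows.net/"
        f"{container}/{prefix.strip('/')}"
    )
    # Accept the token with or without a leading '?'
    token = sas_token.lstrip("?")
    # azcopy needs the SAS token appended as the query string
    azcopy_destination = f"{base}?{token}" if token else base
    # The job's code argument uses the plain folder URL, without the token
    job_code_url = base + "/"
    return azcopy_destination, job_code_url


# Placeholder values; substitute your own account, container, and SAS token.
destination, code_url = build_code_urls(
    "<storage_account>", "<container_name>", "code", "<sas_token>"
)
print("azcopy destination:", destination)
print("job code URL:", code_url)
```

The first value would go into the azcopy copy command, and the second would be passed as code when creating the job.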

    Job submission:

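    from azure.ai.ml import command

    # Point code at the blob folder populated by azcopy instead of a local path
    command_job = command(
        code="https://jgsml0114685361.blob.core.windows.net/job1-data/code/",
        command="echo hi from command",
        environment="azureml:docker-context-example-10:1",
        timeout=10800,  # in seconds (3 hours)
    )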
    

    With this approach, the SDK does not upload the code on submission; the job reads it directly from the storage location.
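Since the question mentions that an .amlignore file did not help much, it may be worth checking which files actually make the code folder large before deciding what to exclude or copy with azcopy. The following is a minimal sketch using only the standard library. `find_large_files` and its default threshold are illustrative assumptions, and the function does not read or apply .amlignore rules.

```python
import os


def find_large_files(root, threshold_bytes=100 * 1024 * 1024):
    """Return (total_bytes, large_files) for regular files under root.

    large_files is a list of (relative_path, size) tuples, largest first,
    for files whose size is at least threshold_bytes.
    """
    total = 0
    large = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks, which may be broken
            size = os.path.getsize(path)
            total += size
            if size >= threshold_bytes:
                large.append((os.path.relpath(path, root), size))
    large.sort(key=lambda item: item[1], reverse=True)
    return total, large


if __name__ == "__main__":
    # Scan the current directory; adjust the path to your code folder.
    total, large = find_large_files(".")
    print(f"total size: {total / 1024 / 1024:.1f} MB")
    for rel_path, size in large:
        print(f"{size / 1024 / 1024:.1f} MB  {rel_path}")
```

Files reported here that the job does not need (datasets, model checkpoints, build output) are candidates for .amlignore entries.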