batch-processing azure-batch azure-machine-learning-service azuremlsdk

Unable to get image details : Environment version Autosave_(date)T(time)Z_******** provided in request doesn't match environ


On an AzureML batch endpoint, I have recently started hitting the following error:

Unable to get image details : Environment version Autosave_(date)T(time)Z_******** provided in request doesn't match environ.

This happens when I set up the batch endpoint with a YAML config:

environment: azureml:env-name:env-version

AzureML then creates and builds the environment with the version I specified as env-version, which is just a number (3 in my case).

Then, for some reason, AzureML creates an extra environment version called Autosave_(date)T(time)Z_********. It is not built, but is based on the version just created, and it becomes the latest version of that environment.

In summary, instead of looking for the version I specified (env-name:3), AzureML seems to look for env-name:Autosave_(date)T(time)Z_******** and then throws the error message above.
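To confirm which versions actually exist for the environment, including any Autosave_... entry, the versions can be listed from Python. This is only a sketch: it assumes the v2 SDK (azure-ai-ml), where `MLClient.environments.list(name=...)` yields objects with a `version` attribute. The helper names `list_environment_versions` and `autosave_versions` are made up for illustration.

```python
def list_environment_versions(ml_client, env_name):
    """Return every registered version string for one environment.

    Assumes ml_client behaves like azure.ai.ml.MLClient, i.e. it exposes
    environments.list(name=...) returning objects with a .version attribute.
    """
    return [str(env.version) for env in ml_client.environments.list(name=env_name)]


def autosave_versions(versions):
    """Keep only the automatically created versions (prefix 'Autosave_')."""
    return [v for v in versions if v.startswith("Autosave_")]


# Hypothetical usage (needs azure-ai-ml, azure-identity and workspace access):
#
#   from azure.ai.ml import MLClient
#   from azure.identity import DefaultAzureCredential
#
#   ml_client = MLClient(DefaultAzureCredential(), "<subscription-id>",
#                        "<resource-group>", "<workspace-name>")
#   versions = list_environment_versions(ml_client, "env-name")
#   print(versions, autosave_versions(versions))
```

If an Autosave_ version shows up next to the numbered one, that matches the behaviour described above.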


Solution

  • I found that the problem arose when creating the environment from a YAML specification file: one of my conda dependencies was cmake, which I needed in order to install another Python module. The Docker image was exactly the same as that of a previously created environment.

    Removing the cmake dependency from the YAML file eliminated the issue. The workaround is to install cmake through a Dockerfile instead.

    The error message was very misleading at first, but I got there in the end after understanding that AzureML reuses a cached image, keyed on a hash value computed from the environment definition, according to this

    For that reason, the automatically created Autosave Docker image refers to that same build, which happens only once, when the first job is submitted.
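The caching behaviour described above can be pictured with a small toy model. This is not AzureML's actual hashing scheme, only a sketch of the general idea: two environment versions with an identical definition hash to the same key and share one build. The class `ImageCache`, the function `definition_hash` and the base image name are all hypothetical.

```python
import hashlib
import json


def definition_hash(base_image, conda_dependencies):
    """Toy hash of an environment definition (not AzureML's real algorithm)."""
    payload = json.dumps(
        {"image": base_image, "conda": list(conda_dependencies)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ImageCache:
    """Builds an image only the first time a given definition hash is seen."""

    def __init__(self):
        self._images = {}
        self.builds = 0

    def resolve(self, base_image, conda_dependencies):
        key = definition_hash(base_image, conda_dependencies)
        if key not in self._images:
            self.builds += 1  # a real build would happen here
            self._images[key] = f"image-{key[:12]}"
        return self._images[key]


cache = ImageCache()
base = "example.azurecr.io/base:latest"  # hypothetical base image name

# The numbered version and the Autosave version share one definition,
# so they resolve to the same cached image after a single build.
numbered = cache.resolve(base, ["python=3.10", "pip"])
autosave = cache.resolve(base, ["python=3.10", "pip"])

# Changing the definition (for example adding cmake) produces a new hash
# and therefore a separate build.
with_cmake = cache.resolve(base, ["python=3.10", "pip", "cmake"])
```

In this model the Autosave entry never triggers a build of its own, which is consistent with it pointing at the build made for the first job.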