docker amazon-sagemaker env-file

Is there a way to pass arguments to our own Docker container in SageMaker?


I am trying to train my model in SageMaker using the bring-your-own-container technique. Training runs locally without any issues, but my Docker image takes an env-file as input, and its contents can change from run to run. When I pass the ECR image to SageMaker, I don't know how to supply this env-file. As a workaround, I added export KEY=value statements to the train script that SageMaker calls, but that did not expose my variables either. I also tried executing RUN source file.env while building the image, which failed with the error /bin/sh: 1: source: not found.

I could use ENV while building the image, and that would probably work, but it would not be flexible because my variables can change between runs. Is there any way to pass docker run arguments from a SageMaker estimator or notebook? I checked the documentation but couldn't find anything.


Solution

  • I pass environment variables along with the Docker image URI when creating the training job with the SageMaker Python SDK. The documentation of the train method states:

    environment (dict[str, str]) : Environment variables to be set for
                use during training job (default: ``None``): 
    

    For reference, the SDK source.

    Because the SDK is a wrapper around Boto3, I'm pretty sure the same can be done with Boto3 alone (the CreateTrainingJob API accepts an Environment map), and that there is an equivalent in every other AWS SDK.
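
To make this concrete, here is a minimal sketch of how the env-file from the question could be turned into the environment dict described above. It assumes version 2 of the SageMaker Python SDK, whose Estimator constructor accepts an environment argument. The helper names (parse_env_text, make_estimator, training_job_request), the instance type, the volume size, and the runtime limit are placeholders invented for illustration and are not part of the original answer. The parser is deliberately simple: it skips blank lines and comments, tolerates a leading "export ", and keeps quotes literally.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=value lines into a dict (hypothetical helper)."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("export "):
            line = line[len("export "):].lstrip()  # tolerate shell-style files
        key, sep, value = line.partition("=")
        if not sep:
            continue  # no '=' on this line, so there is nothing to assign
        env[key.strip()] = value.strip()
    return env


def make_estimator(image_uri: str, role_arn: str, env: Dict[str, str]):
    """Build an Estimator for a custom image (assumes SageMaker Python SDK v2)."""
    from sagemaker.estimator import Estimator  # imported lazily; needs AWS setup

    return Estimator(
        image_uri=image_uri,
        role=role_arn,
        instance_count=1,
        instance_type="ml.m5.large",  # placeholder instance type
        environment=env,  # exposed as environment variables in the container
    )


def training_job_request(job_name: str, image_uri: str, role_arn: str,
                         output_s3_uri: str, env: Dict[str, str]) -> dict:
    """Build keyword arguments for the Boto3 create_training_job call."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {  # placeholder sizing
            "InstanceType": "ml.m5.large",
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        "Environment": env,
    }
```

With the SDK, you would read the file with open("file.env").read(), pass the result through parse_env_text, and call make_estimator(...).fit(). With Boto3 alone, the equivalent would be boto3.client("sagemaker").create_training_job(**training_job_request(...)). In both cases the train script inside the container can read the values from os.environ, and since the dict is built at launch time it can differ on every run without rebuilding the image.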