python-3.x, tensorflow, amazon-ec2, amazon-sagemaker, docker-pull

SageMaker deploying to EIA from TF Script Mode Python3


I've fitted a TensorFlow Estimator in SageMaker using Script Mode with framework_version='1.12.0' and python_version='py3', on a GPU instance.

Calling deploy directly on this estimator works if I also select a GPU deployment instance type. However, if I select a CPU instance type and/or try to attach an Elastic Inference accelerator, it fails with an error saying Docker cannot find a corresponding image to pull.

Does anybody know how to train a py3 model on a GPU with Script Mode and then deploy it to a CPU+EIA instance?


I've found a partial workaround: as an intermediate step, I create a TensorFlowModel from the estimator's training artifacts and then deploy from that model. But this does not seem to support Python 3 either (again, it can't find a corresponding container). If I switch to python_version='py2', it finds the container, but the endpoint fails its health checks because all my code is written for Python 3.
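For reference, the workaround I'm describing looks roughly like this (SageMaker Python SDK v1 API; the S3 path, IAM role, and entry-point script names are placeholders, not my actual values):

```python
from sagemaker.tensorflow import TensorFlowModel

# Build a model directly from the training job's artifacts.
model = TensorFlowModel(
    model_data='s3://my-bucket/path/model.tar.gz',  # placeholder: training artifacts
    role='MySageMakerRole',                          # placeholder IAM role
    entry_point='inference.py',                      # placeholder serving script
    framework_version='1.12.0',
    py_version='py2',  # 'py3' fails to resolve a serving container here
)

# Deploy to a CPU instance with an Elastic Inference accelerator attached.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.m4.xlarge',        # CPU instance
    accelerator_type='ml.eia1.medium',   # EI accelerator
)
```

This is a deployment-configuration sketch only; it requires live AWS credentials and a completed training job to actually run.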


Solution

  • Unfortunately, there are no TF + Python 3 + EI serving images at this time. If you would like to use TF + EI, you'll need to make sure your code is compatible with Python 2.

    Edit: after I originally wrote this, support for TF + Python 3 + EI has been released. At the time of this writing, I believe TF 1.12.0, 1.13.1, and 1.14.0 all have Python 3 + EI support. For the full list, see https://github.com/aws/sagemaker-python-sdk#tensorflow-sagemaker-estimators.
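    With the Python 3 + EI serving images available, deploying straight from the fitted estimator should now work along these lines (a sketch against the SageMaker Python SDK v1 API; the instance and accelerator types are examples, not requirements):

    ```python
    # 'estimator' is the fitted TensorFlow estimator from the question
    # (framework_version='1.12.0', py_version='py3', trained on a GPU instance).
    predictor = estimator.deploy(
        initial_instance_count=1,
        instance_type='ml.c5.large',         # CPU instance for serving
        accelerator_type='ml.eia1.medium',   # Elastic Inference accelerator
    )
    ```

    As above, this needs live AWS credentials and a completed training job to execute.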