I've fitted a TensorFlow Estimator in SageMaker using Script Mode with framework_version='1.12.0' and python_version='py3', using a GPU instance.
Calling deploy directly on this estimator works if I also select a GPU instance type for deployment. However, if I select a CPU instance type and/or try to add an accelerator, it fails with an error saying that Docker cannot find a corresponding image to pull.
Does anybody know how to train a py3 model on a GPU with Script Mode and then deploy it to a CPU+EIA instance?
I've found a partial workaround by taking the intermediate step of creating a TensorFlowModel from the estimator's training artifacts and then deploying from the model, but this does not seem to support Python 3 (again, it doesn't find a corresponding container). If I switch to python_version='py2', it finds the container, but then fails the health checks because all my code is written for Python 3.
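A minimal sketch of the two deployment attempts described above, assuming SageMaker Python SDK 1.x. The instance type, the accelerator type, and the helper names (deploy_direct, artifact_model_kwargs, deploy_from_artifacts) are hypothetical and chosen only for illustration. The import path for TensorFlowModel is also an assumption about SDK 1.x, and the SDK keyword is believed to be py_version rather than python_version.

```python
CPU_INSTANCE = "ml.c5.xlarge"    # assumed CPU instance type
ACCELERATOR = "ml.eia1.medium"   # assumed Elastic Inference accelerator type


def deploy_direct(estimator):
    """Attempt 1: deploy the fitted estimator straight to CPU + EI."""
    return estimator.deploy(
        initial_instance_count=1,
        instance_type=CPU_INSTANCE,
        accelerator_type=ACCELERATOR,
    )


def artifact_model_kwargs(estimator, entry_point, py_version="py2"):
    """Build the arguments for a TensorFlowModel from the training artifacts."""
    return {
        "model_data": estimator.model_data,  # S3 URI of the trained model archive
        "role": estimator.role,
        "entry_point": entry_point,
        "framework_version": "1.12.0",
        "py_version": py_version,
    }


def deploy_from_artifacts(estimator, entry_point, py_version="py2"):
    """Attempt 2: go through an intermediate TensorFlowModel."""
    # Imported lazily so the helpers above work without the SDK installed.
    # Import path as in SDK 1.x (assumption).
    from sagemaker.tensorflow import TensorFlowModel

    model = TensorFlowModel(
        **artifact_model_kwargs(estimator, entry_point, py_version)
    )
    return model.deploy(
        initial_instance_count=1,
        instance_type=CPU_INSTANCE,
        accelerator_type=ACCELERATOR,
    )
```

In this sketch, deploy_direct corresponds to the call that fails to find an image, and deploy_from_artifacts with py_version="py2" corresponds to the variant that finds a container but fails health checks.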
Unfortunately there are no TF + Python 3 + EI serving images at this time. If you would like to use TF + EI, you'll need to make sure your code is compatible with Python 2.
Edit: after I originally wrote this, support for TF + Python 3 + EI has been released. At the time of this writing, I believe TF 1.12.0, 1.13.1, and 1.14.0 all have Python 3 + EI support. For the full list, see https://github.com/aws/sagemaker-python-sdk#tensorflow-sagemaker-estimators.
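As a rough illustration of the edit above, here is a small guard that checks the TensorFlow version before requesting an accelerator. The version set simply mirrors the versions listed in this answer and may be out of date. The helper names, instance type, and accelerator type are hypothetical, and the deploy call assumes the SageMaker Python SDK 1.x signature.

```python
# Versions this answer believes have Python 3 + EI serving images.
# Not authoritative; check the linked SDK README for the current list.
PY3_EI_TF_VERSIONS = {"1.12.0", "1.13.1", "1.14.0"}


def py3_ei_supported(framework_version):
    """Return True if the version is in the list quoted in the answer."""
    return framework_version in PY3_EI_TF_VERSIONS


def deploy_py3_with_ei(estimator, framework_version,
                       instance_type="ml.c5.xlarge",       # assumed CPU type
                       accelerator_type="ml.eia1.medium"):  # assumed EI type
    """Deploy a py3 estimator to CPU + EI, failing early on unlisted versions."""
    if not py3_ei_supported(framework_version):
        raise ValueError(
            "No known Python 3 + EI image for TF %s" % framework_version
        )
    return estimator.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        accelerator_type=accelerator_type,
    )
```

Failing early in client code gives a clearer message than waiting for the image pull to fail at endpoint creation.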