
Authenticating standalone gsutil in containers in Cloud ML Engine on Kubernetes with Workload Identity


I'm launching container images on Google Cloud AI Training (Cloud ML Engine).

Inside those containers I need to use gsutil. Some containers have gsutil. In that case I can use it right away without any authentication steps.

Some containers do not have gsutil, so I have to install it. The problem is that the installed gsutil does not work.

When I'm using the official cloud-sdk image, gsutil works without any auth steps.

When I use the python:3.7 image and install gsutil from PyPI it does not work:

python -m pip install gsutil --quiet
gsutil cp a gs://b/c

ServiceException: 401 Anonymous caller does not have storage.objects.get access to ...

How can I make it so that the standalone gsutil obtains the needed credentials?

Most guides focus on manually calling gcloud auth, copying a URL, and pasting back the token. That is not the solution I'm looking for; it needs to be automated. I know an automated solution is possible, since in some images gsutil works out of the box.


Solution

  • This happens because pip install gsutil alone does not configure any credentials, so gsutil sends requests as an anonymous caller, as the error says. You need to configure credentials to access protected data.

    Put the following line in your Dockerfile and it should work:

    RUN printf '[GoogleCompute]\nservice_account = default\n' > /etc/boto.cfg

    This configures gsutil to use the default service account of the environment the container runs in.
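If you cannot rebuild the image, the same configuration can probably be written at container start instead. The following is a minimal sketch, not a tested recipe for AI Training: BOTO_CFG_PATH is a hypothetical variable introduced here for illustration, and the sketch assumes that gsutil honors the BOTO_CONFIG environment variable as a pointer to its config file.

```shell
#!/bin/sh
# Sketch: create a boto config that tells gsutil to use the default
# service account, then point gsutil at it.
set -eu

# BOTO_CFG_PATH is a hypothetical name used only in this sketch.
# Inside an image built as root you would typically use /etc/boto.cfg.
BOTO_CFG_PATH="${BOTO_CFG_PATH:-./boto.cfg}"

# printf expands \n in every POSIX shell, unlike echo, whose handling
# of backslash escapes differs between sh implementations.
printf '[GoogleCompute]\nservice_account = default\n' > "$BOTO_CFG_PATH"

# Assumption: gsutil reads the file named by BOTO_CONFIG.
export BOTO_CONFIG="$BOTO_CFG_PATH"

# Show what was written so it appears in the job logs.
cat "$BOTO_CFG_PATH"
```

After this runs, a call such as gsutil cp a gs://b/c in the same shell should pick up the service account instead of running anonymously, provided the job's service account has access to the bucket.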