We're switching our scripts over from gsutil to the reportedly faster gcloud storage. However, we access a significant amount of public data, for example from gs://gcp-public-data--broad-references.
We do NOT want to pay to download this public data. However, it appears that gcloud storage automatically sets the X-Goog-User-Project header for public transfers, while gsutil does not.
Is my understanding of the various documentation correct that gcloud storage is instructing GCS to bill us, and not the public bucket, for transfers?
gcloud version: Google Cloud SDK 407.0.0 and gsutil 5.15
gcloud init
gcloud config list
gsutil -d ls gs://gcp-public-data--broad-references
Headers: do NOT contain X-Goog-User-Project
gcloud --log-http storage ls gs://gcp-public-data--broad-references
== headers start ==: your default project has been included as the X-Goog-User-Project header
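The header comparison in the two listing steps can be scripted instead of read by eye. This is a minimal sketch, assuming the --log-http output goes to stderr and that the header name appears verbatim in the log; has_user_project_header is a hypothetical helper name, not part of either tool:

```shell
# Hypothetical helper: succeeds if the text on stdin mentions the
# X-Goog-User-Project header (case-insensitive match).
has_user_project_header() {
  grep -qi 'x-goog-user-project'
}

# Usage (assumes gcloud is installed and --log-http logs to stderr):
#   gcloud --log-http storage ls gs://gcp-public-data--broad-references 2>&1 \
#     | has_user_project_header && echo "billing project header was sent"
```

The helper only inspects text, so it can be pointed at the gsutil -d output in the same way.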
According to all the documentation I've been able to find, that header should not be set by default.
Via https://cloud.google.com/storage/docs/requester-pays:
Important: Buckets that have Requester Pays disabled still accept requests that include a billing project, and charges are applied to the billing project supplied in the request. Consider any billing implications prior to including a billing project in all of your requests.
Via https://cloud.google.com/storage/docs/xml-api/reference-headers#xgooguserproject:
The project specified in the header is billed for charges associated with the request. This header is used, for example, when making requests to buckets that have Requester Pays enabled.
Bonus:
gsutil ls gs://gnomad-public-requester-pays
BadRequestException: 400 Bucket is a requester pays bucket but no user project provided.
gcloud storage ls gs://gnomad-public-requester-pays
The latter doesn't seem correct to me, as I never intentionally told gcloud storage which project to bill for the request.
Update: This behavior seems to have been fixed as of Google Cloud SDK 411.0.0, released 2022-12-06. As of that version, running the setup specified in the original question no longer sends the X-Goog-User-Project header.
Thanks @carbocation for the heads up about the fix!
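For scripts that may run on older installs, a version guard is one way to act on the update above. This is a rough sketch, assuming the first line of gcloud version has the form "Google Cloud SDK 407.0.0" as shown in the question, and taking 411 as the threshold from the update; sdk_has_fix is a hypothetical helper name:

```shell
# Hypothetical helper: succeeds if the given SDK version string
# (e.g. "411.0.0") has a major version of 411 or later.
sdk_has_fix() {
  major=${1%%.*}          # strip everything after the first dot
  [ "$major" -ge 411 ]
}

# Usage (assumes the "Google Cloud SDK X.Y.Z" line format):
#   sdk_has_fix "$(gcloud version | awk '/^Google Cloud SDK/ {print $4}')" \
#     || echo "SDK older than 411.0.0: the header may still be sent"
```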
Heard back from a support member after this was reposted to the Google Cloud Community Forums.
The default behavior of the Cloud CLI gcloud is to use the current project for all quota and billing operations. This is why you automatically see your project ID passed in X-Goog-User-Project. This behavior can be overridden though by adding the global --billing-project flag to any command.
If you set this flag to an empty string, no project is passed in the request. I tested this with gcloud storage and confirmed that requester pays buckets return the expected error message (“400: Bucket is a requester pays bucket but no user project provided.”). Non-requester pays buckets allow operations as well.
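The override described above can be wrapped so that scripts do not have to repeat the flag. This is a minimal sketch based only on the support reply, which says that an empty --billing-project value sends no project; gcloud_no_billing is a hypothetical wrapper name:

```shell
# Hypothetical wrapper: run any gcloud command with an empty billing
# project, so that (per the support reply) no project is passed in
# the request.
gcloud_no_billing() {
  gcloud --billing-project="" "$@"
}

# Usage:
#   gcloud_no_billing storage ls gs://gcp-public-data--broad-references
#   gcloud_no_billing storage ls gs://gnomad-public-requester-pays   # expected to fail with the 400 error
```

Combining this with --log-http is a reasonable way to confirm that the header is absent before relying on it for large transfers.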