amazon-s3 google-bigquery google-cloud-storage

Exporting data from Google Cloud Storage to Amazon S3


I would like to transfer data from a table in BigQuery into a table in Redshift. My planned data flow is as follows:

BigQuery -> Google Cloud Storage -> Amazon S3 -> Redshift

I know about Google Cloud Storage Transfer Service, but I'm not sure it can help me. From Google Cloud documentation:

Cloud Storage Transfer Service

This page describes Cloud Storage Transfer Service, which you can use to quickly import online data into Google Cloud Storage.

As I understand it, this service can only import data into Google Cloud Storage, not export data from it.

Is there a way I can export data from Google Cloud Storage to Amazon S3?


Solution

  • You can use gsutil to copy data from a Google Cloud Storage bucket to an Amazon S3 bucket, using a command such as:

    gsutil -m rsync -rd gs://your-gcs-bucket s3://your-s3-bucket
    

    Note that the -d option above causes gsutil rsync to delete objects from your S3 bucket that aren't present in your GCS bucket (in addition to adding new objects). Omit that option if you only want to add new objects from your GCS bucket to your S3 bucket.
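
Because -d deletes data on the S3 side, it can help to preview a sync before running it. The following is a minimal sketch, not a definitive procedure: it assumes gsutil is installed and that your AWS keys are set in the [Credentials] section of your ~/.boto file. The function name sync_gcs_to_s3 and the GSUTIL variable are hypothetical helpers introduced for illustration, and the bucket names are placeholders.

```shell
#!/bin/sh
# Sketch: preview, then run, a one-way GCS -> S3 sync with gsutil.
# Assumes AWS credentials are present in ~/.boto, for example:
#   [Credentials]
#   aws_access_key_id = <your-access-key-id>
#   aws_secret_access_key = <your-secret-access-key>

# Hypothetical override hook: set GSUTIL=echo to print the commands
# instead of running them.
GSUTIL="${GSUTIL:-gsutil}"

# Hypothetical helper, not part of gsutil.
sync_gcs_to_s3() {
    src="$1"   # e.g. gs://your-gcs-bucket
    dst="$2"   # e.g. s3://your-s3-bucket

    # -n: dry run; reports what would be copied without changing anything.
    "$GSUTIL" -m rsync -n -r "$src" "$dst" || return 1

    # Real run. No -d here, so nothing is deleted from the S3 bucket.
    "$GSUTIL" -m rsync -r "$src" "$dst"
}
```

After sourcing the snippet, a call such as `sync_gcs_to_s3 gs://your-gcs-bucket s3://your-s3-bucket` first prints the planned copies and then performs them. Add -d to both rsync invocations only if you want the S3 bucket to mirror the GCS bucket exactly.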