amazon-s3, amazon-glacier

Permanently restore Glacier to S3


I'm wondering whether there is an easy way to permanently restore Glacier objects to S3. When you restore a Glacier object, it only becomes available in S3 for the number of days you specify in the restore request. For example, we now have thousands of files restored to S3 that will revert to Glacier in 90 days, but we do not want them to go back to Glacier at all.


Solution

  • First, restore from Glacier (as you have done). This makes the file available so that you can copy it.

    Then, once the file is available, you can copy/overwrite it using the AWS CLI:

    aws s3 cp --metadata-directive "COPY" --storage-class "STANDARD" s3://my-bucket/my-image.png s3://my-bucket/my-image.png
    

    Notes

    In the above command:

      • --metadata-directive "COPY" copies the object's existing metadata rather than replacing it.
      • --storage-class "STANDARD" writes the new copy in the STANDARD storage class, which is what permanently moves the object out of Glacier.
      • The source and destination are the same key, so the object is copied over itself with only its storage class changed.

    This procedure is based on the info from the AWS docs here.
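If you are scripting this rather than using the CLI, the same copy can be issued through boto3's copy_object. As a small sketch (the helper name is mine, not from the answer), this builds the matching arguments without needing AWS credentials:

```python
def upgrade_copy_args(bucket, key):
    """Arguments for boto3 copy_object that mirror the CLI command:
    copy the object onto itself, keep its metadata, set STANDARD class."""
    return {
        "Bucket": bucket,
        "Key": key,
        "CopySource": {"Bucket": bucket, "Key": key},
        "MetadataDirective": "COPY",
        "StorageClass": "STANDARD",
    }

# With credentials configured (bucket/key are illustrative):
#   import boto3
#   boto3.client("s3").copy_object(**upgrade_copy_args("my-bucket", "my-image.png"))
```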


    Bulk

    If you want to do it in bulk (over many files/objects), you can use the commands below:

    Dry Run

    This command lists the Glacier-class objects under the given bucket and prefix:

    aws s3api list-objects --bucket my-bucket --prefix some/path --query 'Contents[?StorageClass==`GLACIER`][Key]' --output text | xargs -I {} echo 'Would be copying {} to {}'
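The JMESPath filter in --query (Contents[?StorageClass==`GLACIER`][Key]) can be mirrored in plain Python, which is handy for checking what the dry run will select. This sketch uses illustrative sample data shaped like a list-objects response:

```python
# Sample shaped like an aws s3api list-objects response (illustrative data).
sample_response = {
    "Contents": [
        {"Key": "some/path/a.png", "StorageClass": "GLACIER"},
        {"Key": "some/path/b.png", "StorageClass": "STANDARD"},
        {"Key": "some/path/c.png", "StorageClass": "GLACIER"},
    ]
}

def glacier_keys(response):
    """Return the keys of objects still stored in the GLACIER class,
    matching the JMESPath filter Contents[?StorageClass==`GLACIER`][Key]."""
    return [obj["Key"] for obj in response.get("Contents", [])
            if obj.get("StorageClass") == "GLACier".upper()]

print(glacier_keys(sample_response))  # ['some/path/a.png', 'some/path/c.png']
```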
    

    Bulk Upgrade

    Once you are comfortable with the list of files that will be upgraded, run the command below to upgrade them.

    Before running, make sure that the bucket and prefix match what you were using in the dry run. Also make sure that you've already run the standard S3/Glacier "restore" operation on all of the files (as described above).

    This combines the single-object upgrade command with the list-objects command from the dry run, using xargs.

    aws s3api list-objects --bucket my-bucket --prefix some/path --query 'Contents[?StorageClass==`GLACIER`][Key]' --output text | xargs -I {} aws s3 cp --metadata-directive "COPY" --storage-class "STANDARD" s3://my-bucket/{} s3://my-bucket/{}
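The same pipeline can be sketched in Python: filter each list-objects page for GLACIER objects and produce the copy arguments for each one. The function name and data shapes are illustrative; only the boto3 calls in the comment are real API:

```python
def upgrade_plan(pages, bucket):
    """Given list-objects response pages, yield copy_object-style kwargs
    for every object still in the GLACIER class. Assumes the objects
    have already been restored, as described above."""
    for page in pages:
        for obj in page.get("Contents", []):
            if obj.get("StorageClass") == "GLACIER":
                yield {
                    "Bucket": bucket,
                    "Key": obj["Key"],
                    "CopySource": {"Bucket": bucket, "Key": obj["Key"]},
                    "MetadataDirective": "COPY",
                    "StorageClass": "STANDARD",
                }

# With boto3 (requires credentials), the pages would come from a paginator:
#   s3 = boto3.client("s3")
#   pages = s3.get_paginator("list_objects_v2").paginate(
#       Bucket="my-bucket", Prefix="some/path")
#   for args in upgrade_plan(pages, "my-bucket"):
#       s3.copy_object(**args)
```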