I was downloading a file using awscli:
$ aws s3 cp s3://mybucket/myfile myfile
But the download was interrupted (my computer went to sleep). How can I continue the download? S3 supports the Range header, but awscli s3 cp doesn't let me specify it.
The file is not publicly accessible so I can't use curl to specify the header manually.
There is a "hidden" command in the awscli tool which allows lower-level access to S3: s3api.† It is less user-friendly (no s3:// URLs and no progress bar), but it does support the range specifier on get-object:
--range (string) Downloads the specified range bytes of an object. For
more information about the HTTP range header, go to
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35.
Here's how to continue the download:
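# GNU and BSD stat use different options for getting the size of a file in bytes.
# Each option is kept as a single word so it also works in shells that don't word-split (e.g. zsh).
$ if stat -f%z /dev/null >/dev/null 2>&1 ; then arg="-f%z" ; else arg="--format=%s" ; fi
$ size=$(stat "$arg" myfile)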
$ aws s3api get-object \
    --bucket mybucket \
    --key myfile \
    --range "bytes=$size-" \
    /dev/fd/3 3>>myfile
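If juggling the two stat dialects feels fragile, the size can also be read with wc -c, which POSIX specifies. A minimal sketch, assuming a hypothetical helper named resume_range (it is not part of awscli):

```shell
# Hypothetical helper: print a Range value that resumes after the bytes
# already present in the file given as $1.
resume_range() {
    # Reading from stdin makes wc print only the count; the arithmetic
    # expansion strips the padding that some wc implementations add.
    printf 'bytes=%d-\n' "$(( $(wc -c < "$1") ))"
}
```

It would then slot into the command above as --range "$(resume_range myfile)".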
You can use pv for a rudimentary progress bar:
$ aws s3api get-object \
    --bucket mybucket \
    --key myfile \
    --range "bytes=$size-" \
    /dev/fd/3 3>&1 >&2 | pv >> myfile
(The reason for this file-descriptor rigmarole is that s3api writes a debug message (the JSON response metadata) to stdout at the end of the operation, which would pollute your file. This solution rebinds stdout to stderr and frees up the original stdout for the file contents by aliasing it as file descriptor 3. The version without pv could technically write to stderr (/dev/fd/2 and 2>>), but if an error occurs s3api writes to stderr, and that output would then get appended to your file. Thus, it is safer to use a dedicated file descriptor there as well.)
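To see the redirection trick in isolation, here is a small sketch that needs no AWS access. fake_get_object, workdir and the log file are hypothetical stand-ins invented for this illustration: the function mimics a tool that writes its payload to the path it is given and prints a status message on stdout.

```shell
# Hypothetical stand-in for `aws s3api get-object`: writes the payload to
# the path given as $1 and prints a status message on stdout.
fake_get_object() {
    printf 'WORLD' > "$1"
    echo 'status: done'
}

workdir=$(mktemp -d)
printf 'HELLO' > "$workdir/myfile"   # pretend this is the partial download

# fd 3 becomes the pipe to cat, then stdout is pointed at stderr (captured
# in a log file here), so only the payload travels through the pipe.
{ fake_get_object /dev/fd/3 3>&1 >&2; } 2> "$workdir/log" | cat >> "$workdir/myfile"

cat "$workdir/myfile"; echo   # HELLOWORLD
cat "$workdir/log"            # status: done
```

The stand-in uses the pipe form on purpose. On Linux, /dev/fd/3 is a symlink into /proc, so a program that opens it with truncation reopens the underlying file and can reset it instead of appending, whereas a pipe cannot be truncated. Keeping a copy of the partial file before resuming is cheap insurance.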
† In git speak, s3 is porcelain, and s3api is plumbing.