I have a Rails app currently using ActiveRecord with Active Storage to attach many images to a few different models via has_many_attached :images. The app is beginning to outgrow storing these images in the local ./storage directory, so I've begun migrating to an S3-compatible service.
New records with attached images are now saving to the new S3 service, and their images are recalled correctly when viewing those records. I used s3cmd to sync all files from the local directory to the new S3 service, which seemed to complete successfully, and then renamed the ./storage directory to ./storage.bak for testing.
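For reference, the setup is along these lines (a sketch: the :digitalocean_s3 service name comes from the update_all call below, the model name in the console check is hypothetical, and the matching entry in config/storage.yml is omitted here):

# config/environments/production.rb -- point Active Storage at the new service
config.active_storage.service = :digitalocean_s3

# Rails console sanity check on a newly created record (model name hypothetical)
Post.last.images.first.blob.service_name  # => "digitalocean_s3"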
All preexisting records now fail to fetch their associated images from the S3 service, though. I've issued ActiveStorage::Blob.update_all(service_name: 'digitalocean_s3') from the console, which did update every record's service_name, but the images still fail to load. Interestingly, when clicking a link to one of these images, DigitalOcean (host of the S3-compatible bucket) displays the following (0's and x's replaced by me):
<Error>
<Code>NoSuchKey</Code>
<Message/>
<BucketName>attachments-production</BucketName>
<RequestId>tx000000000000000000000-0000000000-00000000-nyc3d</RequestId>
<HostId>xxxxxxxx-nyc3d-nyc3-xxxx</HostId>
</Error>
I'm guessing I need to alter something else about these Blobs, but what? What am I missing?
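For what it's worth, a quick console check (a sketch; it assumes the blobs' service_name already points at the S3 service, as above) suggests the keys simply aren't present in the bucket:

blob = ActiveStorage::Blob.order(:created_at).first  # an old, pre-migration blob
blob.service_name                # => "digitalocean_s3" after the update_all
blob.key                         # => e.g. "vaj4us85jdi33jdiw48"
blob.service.exist?(blob.key)    # => false, consistent with the NoSuchKey error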
Actually, the placement of the attachment files is different when using the local ./storage directory than it is when using an S3 bucket. As stated in my reply above, when stored in the local storage directory, Active Storage uses a subdirectory structure like this:
./storage/va/j4/vaj4us85jdi33jdiw48
But when configured to use an S3 bucket, it stores all files directly in the root of the bucket, with no simulated directory structure, like this:
/vaj4us85jdi33jdiw48
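In other words, the Disk service nests each file under two folders derived from the first four characters of its key, while the S3 service uses the bare key as the object name. Roughly (an illustrative sketch, not the actual Rails implementation):

def disk_path_for(root, key)
  # e.g. disk_path_for("./storage", "vaj4us85jdi33jdiw48")
  #   => "./storage/va/j4/vaj4us85jdi33jdiw48"
  File.join(root, key[0..1], key[2..3], key)
end

def s3_object_key(key)
  key  # stored at the bucket root, e.g. "vaj4us85jdi33jdiw48"
end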
So the trick was to get all of these 500,000+ attachments out of many thousands of subdirectories and into the root of the S3 bucket. I used s3cmd in conjunction with find to accomplish this using the one-liner below:
find /app_dir/storage -type f ! -path "*/variants/*" -exec bash -c 's3cmd sync --preserve "$1" "s3://bucket-name/$(basename "$1")"' _ {} \;
This finds all files, regardless of subdirectory, then runs s3cmd sync for each one individually and drops it into the root of the bucket. I'm choosing to exclude the ./storage/variants directory here; I'm fine with those variants being regenerated later.
There are probably more efficient ways of doing this, but I was able to copy 300 GB of attachments in about 60 hours using this method, which was fast enough for this project.
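For example, one more Rails-native alternative would be to copy each blob through Active Storage's own service API. A minimal sketch, assuming the old Disk service is still registered as :local in config/storage.yml and still points at the original files, and that the new service is :digitalocean_s3:

require "stringio"

# Copy every blob's file from the old Disk service to the new S3 service.
# Note: download loads each file fully into memory, so very large files
# would need a streaming approach instead.
local = ActiveStorage::Blob.services.fetch(:local)
s3    = ActiveStorage::Blob.services.fetch(:digitalocean_s3)

ActiveStorage::Blob.find_each do |blob|
  next if s3.exist?(blob.key)                 # skip anything already copied
  data = local.download(blob.key)             # read from the old ./storage tree
  s3.upload(blob.key, StringIO.new(data), checksum: blob.checksum)
end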