I'm trying to set up a workflow in AWS.
An S3 bucket contains the following folders:
mybucket/todo
mybucket/wip <- work in progress
mybucket/done
Another task dumps files into the 'todo' folder for processing.
An Ubuntu EC2 instance has the bucket mounted via s3fs-fuse, and inotifywait is being used to watch the 'todo' folder for new files.
If I run touch /mybucket/todo from within the EC2 instance, the inotifywait job is triggered. However, if a file is uploaded to the S3 folder from another source, the job does not get triggered.
Does this seem like a sensible design? If so, can you see where I went wrong? Or should I just use cron?
The short answer: probably not.
Even though you can use s3fs to mount a bucket on your filesystem, it ends up faking some of the features of a traditional block volume, because S3 is an object storage system and not a block device. Files have to be uploaded as complete objects; individual block changes are not committed separately.
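s3fs does not keep an up-to-date list of all objects in the bucket. It doesn't even know that new files exist in the bucket until you request a listing, and to get one it has to send a REST API request. That is why inotifywait stays silent: inotify only reports changes made through the local mount, so an upload from another source never produces an event on your instance.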
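If you go the cron-style route, the idea is to request the listing yourself on a schedule and compare it with what you have already seen. Here is a minimal sketch in Python, assuming boto3 is installed and has credentials, and that the bucket and prefix are named as in the question. The function names (list_todo_keys, new_keys, poll) are made up for illustration, and the 60-second interval is arbitrary.

```python
import time
from typing import Callable, Iterable, Optional, Set


def list_todo_keys(bucket: str = "mybucket", prefix: str = "todo/") -> Set[str]:
    """Ask S3 for the current keys under the prefix (one REST call per page)."""
    import boto3  # assumption: boto3 is installed and credentials are configured

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    keys: Set[str] = set()
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            keys.add(obj["Key"])
    return keys


def new_keys(current: Iterable[str], seen: Set[str]) -> Set[str]:
    """Return keys that were not seen before, and remember them in `seen`."""
    fresh = set(current) - seen
    seen |= fresh
    return fresh


def poll(list_keys: Callable[[], Iterable[str]],
         handle: Callable[[str], None],
         interval: float = 60.0,
         rounds: Optional[int] = None) -> None:
    """Repeatedly list the keys and call `handle` once for each new one."""
    seen: Set[str] = set()
    done = 0
    while rounds is None or done < rounds:
        for key in sorted(new_keys(list_keys(), seen)):
            handle(key)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Calling poll(list_todo_keys, print) would print each new key once as it shows up. The handler is where you would move the object to 'wip' and start processing. Note that the seen set lives only in memory, so after a restart every key still in 'todo' counts as new again.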