amazon-web-servicesamazon-s3aws-lambdaaws-dmsaws-data-pipeline

Data migration from S3 to RDS


I am working on a requirement, where i am doing multipart upload of the csv file from on prem server to S3 Bucket.

To achieve this using AWS Lambda I create a presigned url and use this url i am uploading the csv file. Now, once i have the file in AWS S3, i want it to be moved to AWS RDS Oracle DB. Initially i was planning to use AWS Lambda for this.

So once i have the file in S3, it triggers lambda(s3 event) and lambda will push this file to RDS. But with this the issue is with the file Size(600 MB).

I am looking for some other way, where whenever there is a file uploaded to S3, it should trigger any AWS service and that service will push this csv file to RDS. I have gone through AWS DMS/Data Pipeline, but not able to find any way to automate this migration

I need to automate this migration on every s3 upload, that is also cost effective.


Solution

  • Setup S3 Integration and build SPROCS to help automate load. Details found here.

    UPDATE:

    Looks like you don't even need to create a SPROC. You can just use the RDS procedure as outlined here. You would then just create an event-driven lambda function that is triggered on a given S3 event--e.g. on object PUT(), POST(), COPY, etc..--which passes the S3 metadata requisite to access the event object. Here is a simple Python example of what that Lambda and config might look like. You would then use the metadata passed on the trigger event--as outlined in the Python example--to dynamically create your procedure call then execute that procedure. You can also add the ensuing workflow logic that meets your requirements--i.e. TASK_ID fetch & operational handling, monitoring, etc...--to the same lambda function or separate those concerns by adding additional lambdas. Hope this helps!