We have multiple source systems sending data. Ideally we should capture the raw data coming from the sources and keep it in the data lake, then process the raw data into a structured format. Users can then update this data via a front-end application.
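For context, the raw-to-structured step could look roughly like the PySpark sketch below. This is only illustrative; the paths, file format, and column names are placeholders, not our actual schema.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("raw-to-structured").getOrCreate()

# Land the raw source files as-is in the lake (path is hypothetical).
raw = spark.read.json("/lake/raw/source_a/2024-01-01/")

# Process the raw feed into a typed, structured layout for downstream use.
structured = (
    raw
    .select(
        F.col("id").cast("long").alias("record_id"),
        F.col("name"),
        F.to_timestamp("updated_at").alias("updated_at"),
    )
    .dropDuplicates(["record_id"])
)

# Write the structured copy back to a processed zone in the lake.
structured.write.mode("overwrite").parquet("/lake/processed/source_a/")
```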
I am thinking of putting an RDBMS on top of the processed data, then pulling the audit trails from the RDBMS back into the data lake and merging them with the processed data to create the final view for reporting (see the sketch below). Alternatively, the RDBMS could also be used for analytics.
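A minimal sketch of that merge step, assuming the audit trail has been exported from the RDBMS to the lake as one row per changed column (record_id, column_name, new_value, changed_at). The paths, the audit schema, and the single "name" column being overridden are all assumptions for illustration.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("reporting-view").getOrCreate()

# Processed data already in the lake, plus audit trails exported from the RDBMS.
processed = spark.read.parquet("/lake/processed/source_a/")
audit = spark.read.parquet("/lake/audit/source_a/")

# Keep only the latest audit entry per record and column.
latest = Window.partitionBy("record_id", "column_name").orderBy(F.col("changed_at").desc())
latest_audit = (
    audit
    .withColumn("rn", F.row_number().over(latest))
    .filter(F.col("rn") == 1)
    .drop("rn")
)

# Take the user edits for one column and apply them on top of the processed data.
name_edits = (
    latest_audit
    .filter(F.col("column_name") == "name")
    .select("record_id", F.col("new_value").alias("edited_name"))
)

reporting_view = (
    processed
    .join(name_edits, on="record_id", how="left")
    .withColumn("name", F.coalesce("edited_name", "name"))
    .drop("edited_name")
)

reporting_view.write.mode("overwrite").parquet("/lake/reporting/source_a/")
```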
Or we could land all the data in the RDBMS first, apply the changes there, and then pull the data from the RDBMS into the data lake. But in that case bringing in a data lake doesn't make much sense.
Kindly suggest.
Thanks,
ADLA is NOT consumer-oriented, meaning you would not connect a front-end system to it. If the question is "what should we do?", I'm not sure anyone can answer that for you, but it sounds like you are on the right track.
What I can do is tell you what we do:
Pulling the data into Data Lake first and RDBMS second makes sense to me.
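As a rough sketch of what lake-first, RDBMS-second could look like: curate the data in the lake, then push the curated set into the database that backs the front end. The JDBC URL, table name, and credentials below are placeholders, not a recommendation for any particular database.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lake-to-rdbms").getOrCreate()

# Curated data lives in the lake first...
curated = spark.read.parquet("/lake/processed/source_a/")

# ...and is pushed to the RDBMS second, where the front-end application
# and its audit trail live (connection details are placeholders).
(
    curated.write
    .format("jdbc")
    .option("url", "jdbc:postgresql://dbhost:5432/appdb")
    .option("dbtable", "app.source_a")
    .option("user", "app_user")
    .option("password", "***")
    .mode("overwrite")
    .save()
)
```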