We are creating a Federated Datawarehouse using snowflake i,e i will have dedicated DWH on each specific region say 3 regions. I will have one global DWH in a separate region for which we need to take data from tables from the other region DWH for reporting. What would be the best approach to accomplish the same?
I read and understood that, you can unload data from DWH in a region into AWS S3 or AZURE Blob on the same region. This i have to do it for all 3 regions. Then we have an option in AWS S3 cross-region replication which i can enable and then load it into Global DWH.
This was my approach and seems bit long and might cost extra for cross region data transfer which anyway is required. Mainly i will not be able to create a flow end to end. Since all are in different region, i need to run a separate job to unload it to s3 in that region, need to validate and need to start loading once all 3 unload completes. Workflow or orchestration also a problem. I considered AWS Batch and Step function but both regional services.
Appreciate if someone can through some light and options? Thank you!
I wouldn't advise doing it with S3 and loading into each database.
You have two options with Snowflake which are much more suited to your use case. One which Rich has already mentioned (Database Replication). You also have the option to use Data Sharing via the Private Data Exchange (not available in all regions yet) or by using Data Sharing