apache-spark hadoop airflow data-lake

Tool for storing information about tables, their sources, and ETL for a DWH


I'm searching for a tool for storing documentation about tables, data sources, ETL processes, etc. for my DWH. I've seen some presentations on YouTube, but I've found that most companies use their own custom system or something like a wiki with plain-text descriptions. I don't think that is very useful for analysts, managers, and other users who need to find the data they need and work out how to use it to calculate the statistics they want. Can you please suggest what I could use for this case? What should I read?


Solution

  • While Airflow was baked with some support for Apache-Atlas, in my opinion