ssis, metadata, etl, informatica, ab-initio

What does "metadata-driven" mean? I keep hearing this phrase in an ETL context but have never been able to figure it out


Apologies if this is an inappropriate question, but I have been hearing the phrase "metadata-driven" for years and have never really understood it.

Metadata, as I understand it, is data (information) about data. That much I more or less understand.

But when I hear "metadata-driven" (especially in the ETL world), I cannot figure out exactly what it means.

I have good experience with one ETL tool, SSIS, so an example in its context would be easy for me to understand.


Solution

  • Assume you are moving 5 rows from table A to table B and you would like to make sure that only the rows matching a particular criterion are affected. In this case your process depends on the data itself and is therefore an example of a data-driven design.

    Now imagine you have several source and/or target table schemas that you want to process in the same way but that differ in their exact implementation (table name, column names, column data types, or even the storage type: Oracle, MS SQL, Sybase, a flat file, or XML). What you would like is to "plug in" the sources, targets, DB connections, etc. for a particular ETL at the time it actually runs.

    What you need is a clear separation of the "logical" ETL process from its "physical" implementation. In other words, you would like the ETL to be described in generic logical units/terms that are replaced by the actual physical ones when it runs.

    What you get then is a description of an ETL process that is generic enough to cover all of these similar cases and is customized for specific source/target systems based on the metadata of those sources and targets. That is a metadata-driven design: a generic "logical" representation of your ETL process that becomes a "physical instantiation" at run time.
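To make the logical/physical split concrete, here is a minimal sketch in Python using an in-memory SQLite database. It is not SSIS and not how any particular ETL tool does it. Every table name, column name, and the `ETL_METADATA` structure itself are hypothetical and made up for illustration. The point is only that `run_etl` contains no table or column names: everything physical comes from the metadata entries.

```python
import sqlite3

# Hypothetical metadata: one entry per "physical" source/target pair.
# All table and column names below are invented for this example.
ETL_METADATA = [
    {
        "source": "crm_customers",
        "target": "dw_customer",
        "columns": {"cust_id": "customer_id", "cust_name": "name"},
        "filter": "active = 1",
    },
    {
        "source": "erp_clients",
        "target": "dw_customer",
        "columns": {"client_no": "customer_id", "full_name": "name"},
        "filter": "status = 'A'",
    },
]


def build_statement(job):
    """Turn one metadata entry into a physical INSERT ... SELECT statement.

    Assumes the metadata is trusted: identifiers are interpolated directly
    into the SQL text, so this must not be fed user-supplied values.
    """
    src_cols = ", ".join(job["columns"].keys())
    tgt_cols = ", ".join(job["columns"].values())
    return (
        f"INSERT INTO {job['target']} ({tgt_cols}) "
        f"SELECT {src_cols} FROM {job['source']} WHERE {job['filter']}"
    )


def run_etl(conn, metadata):
    """The generic 'logical' process: the same step for every metadata entry."""
    for job in metadata:
        conn.execute(build_statement(job))
    conn.commit()


def demo():
    """Create two differently shaped sources, load both into one target."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE crm_customers (cust_id INTEGER, cust_name TEXT, active INTEGER);
        CREATE TABLE erp_clients (client_no INTEGER, full_name TEXT, status TEXT);
        CREATE TABLE dw_customer (customer_id INTEGER, name TEXT);
        INSERT INTO crm_customers VALUES (1, 'Ann', 1), (2, 'Bob', 0);
        INSERT INTO erp_clients VALUES (10, 'Cid', 'A'), (11, 'Dee', 'X');
    """)
    run_etl(conn, ETL_METADATA)
    return conn.execute(
        "SELECT customer_id, name FROM dw_customer ORDER BY customer_id"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

Adding a third source here means adding a third metadata entry, not writing a new load routine. The `filter` in each entry is the data-driven part (which rows), while the table and column mapping is the metadata-driven part (which structures).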