structureetldata-modelingdbtfishtown-analytics

DBT - best practices to create staging views for the final business model


I am working on a business model/view, whose raw SQL definition contains a very complex query on the source table.

My question here is - that I need to create multiple staging models on the source table as there are different specific select queries on the same source(which cannot be taken care of in a single staging model on the source). What should be the best practice to deal such scenario.

One way is to create all the different staging models directly on source and use them in my final business model.

Second way is to create one staging model on the source that will have all fields from the source (and no transformation required in my particular case), and then use this staging model to create all the other intermediate staging models with the specific sql query.

Let me know if there is any other better way for this.


Solution

  • I appreciate the thought you’re putting into it! My answer is really more opinion than anything.

    I think you should follow the practice of one staging model per source. Staging models are just for translating and light cleaning. They most come in handy when the source data changes. In that scenario, you’ll only need to change the staging model and not have to touch any downstream models.

    So one staging model for your source in question, and (obviously) one final table for the end-users. As for what’s in b/w, if whatever you like! You can choose to have:

    There’s also many methods to hide your intermediary models as well from your end user.

    You just make whatever makes sense for you and your users across the following dimensions: