I have a model in dbt which uses different sources and ref tables. So my question is... Is it possible to create a macro that will extract all the ref and source tables from the model? Let me give you and example
I have this model:
select
*
from {{ ref('x') }} a
left join {{ source('landing', 'y') }} ma on a.... = ma....
left join {{ source('landing', 'z') }} b on a.... = b....
So, once the macro is run for that model, I would like to get the names or the ref and source tables. Is that possible? I would like to do this in dbt, but let me know if it would be better to use python instead.
Thanks!
-- my_macro.sql
{% macro get_dependencies(model_name) %}
{% set models = graph.nodes.values() %}
{% set model = (models | selectattr('name', 'equalto', model_name) | list).pop() %}
{% do log("sources: ", info=true) %}
{% set sources = model['sources'] %}
{% for source in sources %}
{% do log(source[0] ~ '.' ~ source[1], info=true) %}
{% endfor %}
{% do log("refs: ", info=true) %}
{% set refs = model['refs'] %}
{% for ref in refs %}
{% do log(ref[0], info=true) %}
{% endfor %}
{% endmacro %}
Once you add this macro, give it a try with this…
dbt run-operation get_dependencies --args '{model_name: my_model}'
and it will spit out…
sources:
landing.y
landing.z
refs:
x
The way I currently have it written, it will write out any repeating sources or models. e.g. if you have {{ ref('x') }}
in multiple locations, then it will write it out as many times.
refs:
x
x
x
I used the below documentaiton about the graph variable to piece this together.
https://docs.getdbt.com/reference/dbt-jinja-functions/graph
EDIT:
Here's a way to get a unique list of sources and refs. It's not the prettiest but it works. (There has to be better way to get a unique list!)
{% macro get_dependencies(model_name) %}
{% set models = graph.nodes.values() %}
{% set model = (models | selectattr('name', 'equalto', model_name) | list).pop() %}
{% set sources = model['sources'] %}
{% set refs = model['refs'] %}
{% do log("sources: ", info=true) %}
{% set unique_sources = [] %}
{% for source in sources if source not in unique_sources %}
{% do unique_sources.append(source) %}
{% do log(source[0] ~ '.' ~ source[1], info=true) %}
{% endfor %}
{% do log("refs: ", info=true) %}
{% set unique_refs = [] %}
{% for ref in refs if reference not in unique_refs %}
{% do unique_refs.append(ref) %}
{% do log(ref[0], info=true) %}
{% endfor %}
{% endmacro %}