I'm creating a DLT pipeline using the medallion architecture. In Silver, I use CDC (SCD type 1) to keep the latest record per id by date, which works fine, but I have a question about the @dlt.view decorator.
My current pipeline looks like this:
BRONZE
@dlt.create_table(xxx)
def bronze_table():
    return spark.readStream.transform(transformation_function)
SILVER
Here, as per the CDC documentation, I need to create a view because CDC is not supported for streaming tables: https://docs.databricks.com/en/delta-live-tables/cdc.html
@dlt.view
def view():
    return dlt.readStream("bronze_table")
dlt.create_streaming_table("target")
dlt.apply_changes(
    xyz
)
My question is: is the view I'm creating a static view or a materialized view? The DLT pipeline UI shows it as just a view. However, I want it to be a materialized view, because I want latency to be as low as possible and want to use Delta Live Tables optimizations wherever I can.
If I am creating just a static view, what syntax do I need to create a materialized view instead? I tried dlt.table instead, but that just creates a streaming table. Many thanks.
In Python, Delta Live Tables determines whether to update a dataset as a materialized view or streaming table based on the defining query.
The @dlt.table decorator is used to define both materialized views and streaming tables.
To define a materialized view in Python, apply @dlt.table to a query that performs a static read against a data source.
To define a streaming table, apply @dlt.table to a query that performs a streaming read against a data source.
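As a rough sketch of the difference (not taken from your pipeline): the dataset names silver_mv and silver_stream below are made up for illustration, and bronze_table is assumed to be the Bronze dataset from the question. This only runs inside a Delta Live Tables pipeline, where the dlt module is available.

```python
import dlt

# Assumed upstream dataset name, taken from the question.
SOURCE = "bronze_table"


# Hypothetical name. The function does a static (batch) read with dlt.read,
# so DLT maintains the result as a materialized view.
@dlt.table(name="silver_mv")
def silver_mv():
    return dlt.read(SOURCE)


# Hypothetical name. The function does a streaming read with dlt.read_stream,
# so DLT maintains the result as a streaming table.
@dlt.table(name="silver_stream")
def silver_stream():
    return dlt.read_stream(SOURCE)
```

The decorator is the same in both cases; only the kind of read inside the function changes. That would explain why swapping @dlt.view for @dlt.table while keeping the streaming read gave you a streaming table.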
Read more here
https://docs.databricks.com/en/delta-live-tables/python-ref.html#import-the-dlt-python-module