I am working on migrating the SQL Server Databases to Azure Data Warehouse. I had dimension tables in earlier DB which had dim ids (basically integer values) and those were referred in fact tables for easy data fetch. These dim ids were generated using identity columns.
Now, in Azure Data Warehouse, the identity column values are randomly generated and I am not sure whether to use these dim ids here. If these ids are not used then text columns can be used, but that will increase the seek time in turn hitting the performance.
So, can anyone suggest how I should handle these dim ids in Azure Data Warehouse?
Regards,
Pratik
You're fine using identity columns in Azure SQL Data Warehouse. Those values are not "random", they're just not sequenced the same way that you'd see on an SMP version of SQL Server.
https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity
If you'd prefer an alternative, you can use a calculation like this to assign a surrogate key during row insertion:
cast((select max(dim_sk) from dw.dim_xxx) + row_number() over (order by getdate()) as bigint) as dim_sk