I have two tables in a data warehouse: balances
and dates
. Balances
has the following structure:
Surrogate Key Date | Date | Account | Balance
1 | 2017-02-01 | 100 | 1234
1 | 2017-02-01 | 200 | 5151
2 | 2017-02-02 | 100 | 5123
2 | 2017-02-02 | 200 | 8234
And dates
has the following structure:
Surrogate Key Date | Date | Weekday | Week in Year | ... other columns
1 | 2017-02-01 | Wed | 5 |
2 | 2017-02-02 | Thu | 5 |
The Surrogate Key Date
column is type INT, and the Date
column is type DATE in both tables.
The surrogate key in the balances
table is used in OLAP queries, and the date is used for regular reports.
Now, I need to develop a program that uses the database intensively (it's a batch process) and it needs to access repeatedly the balance table through the date column. Should I use in this process the Surrogate Key Date column or the Date column? I need to filter by date. Is the INT access in the where clause more efficient than the DATE access? Should I ignore the Surrogate Key Date
column when not using OLAP?
From a performance point of view, it likely doesn't matter at all on SAP HANA. Generally speaking, HANA doesn't work on the actual data types when calculating joins and result sets but with internal reference IDs.
Actually, I don't see a reason at all for having a surrogate key in the first place when it maps 1:1 to the Date
column. Also, as it seems to allow duplicates it's not a key at all. Not sure, how you are supposed to address specific records from the same date in this model.