Looking for inputs on a data warehouse schema design. Here is the scenario:
I have an Action Table and User Table that is currently joined based on UserId to get the details about the user who did an action.
Action Table:
UserId Action
123 Test001
User Table:
UserId UserName
123 Adam
Now, we have to migrate the users to a new user management system (UMS) and here is how it works:
Now, when we do reporting, we have to expose both historic and new action data. Wondering what should be the ideal schema design so we can report both historic & new actions and map them to the right user.
Platform: SQL Server 2016, Analysis Services
Please let me know if you need more details.
Agreed with the previous answer. When your upstream team performs the migration to UMS, they should preserve in some way a mapping between Legacy User IDs and Modern User IDs. Downstream in the warehouse, I suggest that you keep both IDs in your User dimension table but generate a surrogate key in this table which will serve as a primary key (it can be just an incremental integer). This way you can use the surrogate key as a foreign key in your Action fact table whether the User is a modern or legacy one.
Here is my proposal of data model design for your tables:
DIM_USER
- USER_KEY (pk)
- USER_ID
- USER_ID_LEGACY
- USERNAME
- ....
DIM_ACTION
- ACTION_KEY (pk)
- ACTION
- ....
FACT_ACTION
- USER_KEY
- ACTION_KEY
- ....