I need to create a data warehouse for analyzing data in an order process but I am unsure of how to proceed since my research led to conflicting information. For simplicity I will only use the most important tables to describe my issue.
TblOrder (Fact):
Once a customer makes an order it creates an entry in TblOrder.
Every entry is unique an has an OrderNoand it's own timestamp of creation
TblOrderSubProcess (Fact):
An order consists of at least 1 sub process.
Each sub process of an order is a unique entry in TblOrderSubProcess and has it's own timestamp of creation.
Each sub process contains has a column with the OrderNo
An entry will only be made if enough material is in the warehouse
Moving material between warehouses (for storage management) is also a sub process but has OrderNo = null
TblOrderInformation (Dimension):
Star schema being the best practise in Power BI, I am not supposed to create a relationship between TblOrder and TblOrderSubProcess but there will be visuals which need the relation between both tables.
My solution to this was to create a flat table where I TblOrder LEFT JOIN TblOrderSubProcess. This solution works but has it's flaws:
If I create a date slicer it will always be using the date of Order or OrderSubProcess. This would mean that my Power BI Reports can only contain information of either Order or OrderSubProcess
Joining the tables creates redundancies
The created flat table would be very large with initially 30 columns. The columns will grow over time.
My goal is a datamodel which solves the mentioned issues and follows best practises
This sounds very similar to the order header - order line data modeling problem. The kimball group wrote a blog post about this below.
In summary, TblOrderSubProcess will be your fact table. TblOrder would be split into logical dimensions.