Suppose I have tables as indicated below. I'm trying to create correct fact tables.
If I combine both tables into one fact table, I would be repeating the sales measure since each sale contains at least two items (admin fee, cleaning and so on - number of items can vary) that contribute to the overall sale. For car 1, the overall sale would therefore be 5000 + 50 + 100. I cannot simply aggregate either because sales needs to be itemized.
Do I need to create two fact tables, or do I need to approach this in an altogether different manner? I'm tempted to link Car Sale Table to Charge Table in a 1:n fashion.
Do you have any suggestions?
Car Sale Table
car_id | sale | currency |
---|---|---|
1 | 5000 | USD |
2 | 7000 | USD |
Charge Table
charge_id | type | amount | currency | car_id |
---|---|---|---|---|
14 | admin fee | 50 | USD | 1 |
15 | cleaning | 100 | USD | 1 |
22 | parking | 10 | USD | 2 |
25 | cleaning | 70 | USD | 2 |
To denormalize the measures (here sale
) is a very bad idea (as you noted) this will cause the sum calculation to fail.
But you do not need to combine you two fact tables in this way.
If your main goal is to calculate summaries per car, simple add a new charge_id
(say 100 - with type = sale
) and add the data to the second table.
So the fact table will contain three rows for the car_id = 1
charge_id type amount
14 admin fee 50
15 cleaning 100
100 sales 5000
The first table will be not needed.
The total cost calculation will be a plain aggregation of the amount
per car_id
.
You'll want to add some other attributes such as booking and validity timestamps.