data-warehousedimensional-modelingfact-table

Fact Table and Grain - Repeating Measures?


Suppose I have tables as indicated below. I'm trying to create correct fact tables.

If I combine both tables into one fact table, I would be repeating the sales measure since each sale contains at least two items (admin fee, cleaning and so on - number of items can vary) that contribute to the overall sale. For car 1, the overall sale would therefore be 5000 + 50 + 100. I cannot simply aggregate either because sales needs to be itemized.

Do I need to create two fact tables, or do I need to approach this in an altogether different manner? I'm tempted to link Car Sale Table to Charge Table in a 1:n fashion.

Do you have any suggestions?

Car Sale Table

car_id sale currency
1 5000 USD
2 7000 USD

Charge Table

charge_id type amount currency car_id
14 admin fee 50 USD 1
15 cleaning 100 USD 1
22 parking 10 USD 2
25 cleaning 70 USD 2

Solution

  • To denormalize the measures (here sale) is a very bad idea (as you noted) this will cause the sum calculation to fail.

    But you do not need to combine you two fact tables in this way.

    If your main goal is to calculate summaries per car, simple add a new charge_id (say 100 - with type = sale) and add the data to the second table.

    So the fact table will contain three rows for the car_id = 1

    charge_id   type        amount
    14          admin fee   50
    15          cleaning    100
    100         sales       5000
    

    The first table will be not needed.

    The total cost calculation will be a plain aggregation of the amount per car_id.

    You'll want to add some other attributes such as booking and validity timestamps.