javaolapmondrian

How to store data for mondrian queries?


I'm trying to understand how I should store data in order to query it with MDX queries processed by mondrian. I am having trouble storing data as i don't know if I should include redundancy in the data.

For example, I have facts for YEAR_2017 and for each of its quarter ( Q1_2017, Q2_2017, Q3_2017 and Q4_2017).

The measure value for YEAR_2017 facts are indeed the sum of values for each quarter. To the same extent, I store TOP_HIERARCHY values and its CHILD_HIERACHY values, generating redundancy once more.

The result is, when I query TOP_HIERARCHY for YEAR_2018, I end up with a result that is the sum of every child value of my TOP_HIERARCHY and for each of them, the sum of every quarter and the value of the year.

In the end I end up with completely wrong results, so my question is, should I simply but values of childless hierarchies in the fact table ?


Solution

  • You shouldn't keep both the granular and aggregated data in the same table.

    If your data aggregates correctly, then you should just keep the most atomic data values in the fact table. Mondrian takes care of aggregating the months into quarters, quarters into years, etc.

    You would normally create a star schema: one fact table and several dimension tables.

    Each dimension will have a column for each level of the hierarchy. So, your date dimension would have a primary key, date_id, then year, quarter, month_number, month_name, day_of_month and any other field you may want to add to allow drilling down.

    Your fact table will have 1 column for each key to each dimension, then 1 column for each measure. An example would be date_id, customer_id, product_id, quantity_sold, total_amount.

    The Mondrian schema will specify how to aggregate the measures (normally, sum), and issue the necessary SQL queries with the correct joins, group by and aggregations.