clickhouse

AggregatingMergeTree in ClickHouse. Can I transfer state of function to another table?


I wish to calculate some methrics by time granulations. So, I have a table of facts (user actions), then I calculate into AggregatingMergeTree for each day. What I want is to take state of uniqState and use it for weekly calculations to write to another AggregatingMergeTree table.
Is it possible? I can get result by using uniqMerge, but it's useless for storing.

Here is how I calculate for daily granulation:

CREATE TABLE IF NOT EXISTS analytics.test_AggregatingMergeTreeEngine_day
(
    event_day_datetime DATETIME, 
    ad_request SimpleAggregateFunction(sum, UInt64),
    audience_reach AggregateFunction(uniq, String, String),
    cur_sub_location int,
) ENGINE = AggregatingMergeTree()

ORDER BY (event_day_datetime, cur_sub_location)
;
CREATE MATERIALIZED VIEW IF NOT EXISTS analytics.test_AggregatingMergeTreeEngine_mv_day
            TO analytics.test_AggregatingMergeTreeEngine_day
            (
                `event_day_datetime` DATETIME,
                `ad_request` SimpleAggregateFunction(sum, UInt64),
                `audience_reach` AggregateFunction(uniq, String, String),
                `cur_sub_location` int,
            )
AS
SELECT toStartOfDay(event_datetime),
       countIf(event == 49),
        uniqState(device_id, san),
        cur_sub_location
    FROM analytics.UserEvents

GROUP BY toStartOfDay(event_datetime), cur_sub_location
ORDER BY toStartOfDay(event_datetime), cur_sub_location

What I want is something like this: (it's just a draft and it doesn't work)

CREATE TABLE IF NOT EXISTS analytics.test_AggregatingMergeTreeEngine_week
(
    event_week_datetime DATETIME, 
    ad_request SimpleAggregateFunction(sum, UInt64),
    audience_reach AggregateFunction(uniq, String, String),
    cur_sub_location int,
) ENGINE = AggregatingMergeTree()

ORDER BY (event_week_datetime, cur_sub_location)
;
CREATE MATERIALIZED VIEW IF NOT EXISTS analytics.test_AggregatingMergeTreeEngine_mv_week
            TO analytics.test_AggregatingMergeTreeEngine_week
            (
                `event_week_datetime` DATETIME,
                `ad_request` SimpleAggregateFunction(sum, UInt64),
                `audience_reach` AggregateFunction(uniq, String, String),
                `cur_sub_location` int,
            )
AS
SELECT toStartOfWeek(event_day_datetime),
       sum(ad_request),
        uniqState(audience_reach),
        cur_sub_location
    FROM analytics.UserEvents

GROUP BY toStartOfWeek(event_day_datetime), cur_sub_location
ORDER BY toStartOfWeek(event_day_datetime), cur_sub_location

Yet for some reason I cannot even do this:

select
    audience_reach
from analytics.test_AggregatingMergeTreeEngine_day
--final -- error changes if I apply final
limit 1

Can I somehow get unfinished state of uniqState() function?


Solution

  • Yes, you can transfer uniqState from day table to week table

    use uniqStateMerge and something like that

    CREATE TABLE IF NOT EXISTS analytics.test_AggregatingMergeTreeEngine_week
    (
        event_week_datetime DateTime,
        ad_request SimpleAggregateFunction(sum, UInt64),
        audience_reach AggregateFunction(uniq, String, String),
        cur_sub_location Int32
    )
    ENGINE = AggregatingMergeTree
    ORDER BY (event_week_datetime, cur_sub_location);
    

    and

    CREATE MATERIALIZED VIEW IF NOT EXISTS analytics.test_AggregatingMergeTreeEngine_mv_week
    TO analytics.test_AggregatingMergeTreeEngine_week
    AS
    SELECT
        toStartOfWeek(event_day_datetime) AS event_week_datetime,
        sum(ad_request) AS ad_request,
        uniqMergeState(audience_reach) AS audience_reach,
        cur_sub_location
    FROM analytics.test_AggregatingMergeTreeEngine_day
    GROUP BY
        event_week_datetime,
        cur_sub_location;