azurestreamazure-eventhubazure-stream-analytics

Azure Stream Analytics- Window Function to aggregate current event record referencing data from 12 AM UTC


I have a scenario where I need to aggregate only the latest events (keys) from the stream job with data from 12 a.m. UTC every day. It looks like I can't go with only hopping and sliding windows as they aggregate all of the records and key combinations from 12 AM every time until now. But I just wanted to aggregate current key referencing same older key data from 12 a.m. UTC.

Example:

Data in the event hub (until 10:59 AM):
1, 100, 5AM
2, 50, 8AM
3, 60, 10AM

Current Record at 11AM
2, 50, 11AM

Expected output
1, 100, 5AM
2, 150, 11AM
3, 60, 10AM

I don't want the stream job to re-execute aggregation for older keys 1, 3

There is something called Windows (https://learn.microsoft.com/en-us/stream-analytics-query/windows-azure-stream-analytics) where we can use different Window Functions like Hopping, Sliding, Tumbling in the same query. Can a combination of sliding and tumbling windows solve the problem? As the tumbling window will always aggregate the latest data with the sliding window, I have reference data from 12 a.m. UTC. Any help is really appreciated


Solution

  • I was able to resolve the issue with the help of the last 24-hour Sliding Window and filtered data before 12 AM in the where clause.

    with cte A as (
    select count(1),dt from table group by slidingwindow(hour,24), dt)
    select 
    * from cte where dt=System.TimeStamp()