I have a scenario where I need to aggregate only the latest events (keys) from the stream job with data from 12 a.m. UTC every day. It looks like I can't go with only hopping and sliding windows as they aggregate all of the records and key combinations from 12 AM every time until now. But I just wanted to aggregate current key referencing same older key data from 12 a.m. UTC.
Example:
Data in the event hub (until 10:59 AM):
1, 100, 5AM
2, 50, 8AM
3, 60, 10AM
Current Record at 11AM
2, 50, 11AM
Expected output
1, 100, 5AM
2, 150, 11AM
3, 60, 10AM
I don't want the stream job to re-execute aggregation for older keys 1, 3
There is something called Windows (https://learn.microsoft.com/en-us/stream-analytics-query/windows-azure-stream-analytics) where we can use different Window Functions like Hopping, Sliding, Tumbling in the same query. Can a combination of sliding and tumbling windows solve the problem? As the tumbling window will always aggregate the latest data with the sliding window, I have reference data from 12 a.m. UTC. Any help is really appreciated
I was able to resolve the issue with the help of the last 24-hour Sliding Window and filtered data before 12 AM in the where clause.
with cte A as (
select count(1),dt from table group by slidingwindow(hour,24), dt)
select
* from cte where dt=System.TimeStamp()