apache-flinkwatermarkstream-processing

Why does my flink window trigger when I have set watermark to be a high number?


I would expect windows to trigger only after we wait until the maximum possible time as defined by the max lateness for watermark.

.assignTimestampsAndWatermarks( WatermarkStrategy.forBoundedOutOfOrderness(Duration.ofMillis(10000000)) .withTimestampAssigner((order, timestamp) -> order.getQuoteDatetime().getTime())) .keyBy(order-> GroupingsKey.builder().symbol(order.getSymbol()).expiration(order.getExpiration()) .build()) .window(EventTimeSessionWindows.withGap(Time.milliseconds(100000000)))

In this example, why would the window ever trigger in any meaningful amount of time? The window is a very large window and we wait a very long time for records. When I run my example, the window still gets triggered in under a minute. why is that?


Solution

  • Turns out the watermark was being generated after the source was exhausted(in this case it was from reading a file). So the max watermark was emitted(9223372036854775807). A trigger happens when: window.maxTimestamp() <= ctx.getCurrentWatermark()

    See https://stackoverflow.com/a/51554273/1099123