I have multiple streams that I want to join (A to B, B to C , C to D...) to create one Z
when using the table api and joining 3 tables
select * from A inner join B on a.pk_id = b.fk_id inner join C on b.pk_id = c.fk_id
what is/are the underlying state/s looks like?
the keys are different from each source, if it is running in parallel. does Flink reshuffle the data?
You can figure this out by looking at the job graph in the web UI. There's a shuffle being done everywhere you see a HASH connection.
This information is also included in the output of EXPLAIN <query>
, but that's harder to grok (look for Exchange
).