
How to visualize operator execution in a Flink SQL job graph and tell whether chaining is in effect


I have a Flink SQL application that performs a simple insert into an enrich table by joining multiple tables.

    create table T1 (...) WITH ( 'connector' = 'upsert-kafka','topic' = 'T1', ...)
    create table T2 (...) WITH ( 'connector' = 'upsert-kafka','topic' = 'T2', ...)
    create table enrich (...) WITH ( 'connector' = 'upsert-kafka','topic' = 'enrich', ...)


    CREATE TEMPORARY VIEW distinct_t1 AS
    SELECT *
    FROM (SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY change_date DESC) AS rownum
          FROM T1)
    WHERE rownum = 1;

    CREATE TEMPORARY VIEW distinct_t2 AS
    SELECT *
    FROM (SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY change_date DESC) AS rownum
          FROM T2)
    WHERE rownum = 1;

    insert into enrich 
    select ... from distinct_t1 t1 inner join distinct_t2 t2 ... t2.t1_id = t1.id and t2.client_id=t1.client_id

Notes:

  1. Each raw topic has 8 Kafka partitions, and the job parallelism is set to 8.
  2. Max parallelism is left at the default (I believe the lower bound is 128).
  3. Each TaskManager has 1 task slot, and 8 TaskManagers are running in total.

Question 1: Based on the above configuration, I picture my job graph (source, rank, join and sink operators all running on TM1 and reading from partition 0, and so on for the other TaskManagers) as something like the following. Am I right?

  1. TM1

    • T1-partition[0] - source_operator[0] -> rank_operator[0] -> Join
    • T2-partition[0] - source_operator[0] -> rank_operator[0] -> Join -> Sink Operator
  2. TM2

    • T1-partition[1] - source_operator[1] -> rank_operator[1] -> Join
    • T2-partition[1] - source_operator[1] -> rank_operator[1] -> Join -> Sink Operator

  ...

  3. TM8

    • T1-partition[7] - source_operator[7] -> rank_operator[7] -> Join
    • T2-partition[7] - source_operator[7] -> rank_operator[7] -> Join -> Sink Operator

Question 2: How can I verify whether my operators are chained? Is there any configuration that I have to set in the environment?

Question 3: If these operators are not chained the way I picture them, how do I verify which operators are running where?

Edit: I am using Flink 1.17.1, and here is the job graph. It shows only the operators, not other information such as whether they are chained.


Solution

  • If you have access to the Flink Web UI, this information is presented there in an easy-to-understand way. By default the web UI is served by the JobManager on port 8081.

    Each blue box in the diagram is a task, or in other words, an operator chain. The operators shown together within the same blue box are chained. Nothing else is.
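On the configuration part of Question 2: operator chaining is enabled by default, so nothing has to be set for it to take effect. As a rough sketch, assuming the job is submitted through the SQL client on Flink 1.17 (where the option key is `pipeline.operator-chaining`), chaining could be switched off temporarily so that every operator appears as its own box in the web UI, which makes it easier to compare against the default chained graph:

```sql
-- Assumption: run in the Flink SQL client before submitting the INSERT.
-- 'pipeline.operator-chaining' defaults to 'true'; setting it to 'false'
-- is meant for inspection only, since it adds serialization overhead
-- between operators that would otherwise share a task.
SET 'pipeline.operator-chaining' = 'false';

-- Restore the default afterwards.
SET 'pipeline.operator-chaining' = 'true';
```

With chaining disabled, operators that previously shared a blue box are drawn as separate tasks, so the difference between the two graphs shows which operators were chained.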