apache-flink flink-streaming flink-sql

Flink SQL left join emits null values for the right table even after showing matching records


I am doing a left join on streaming data in Apache Flink SQL, i.e. I convert two DataStream API streams to Flink SQL tables and join them. However, the join emits null for the right-hand column. For example:

Table 1

id    Dept
id 1  Dept 1
id 2  Dept 2
id 3  Dept 3
id 4  Dept 4

Table 2

id    Employee
id 1  Employee 1
id 2  Employee 2
id 1  Employee 3
id 3  Employee 4

I am doing a left join on id, with table1 as the left table and table2 as the right table.

It gives the following output:

id    Dept    Employee
id 1  Dept 1  Employee 1
id 2  Dept 2  Employee 2
id 1  Dept 1  Employee 3
id 3  Dept 3  Employee 4
id 1  Dept 1  null

The data continues like this for n values.

I do not understand the behaviour of this Flink SQL join: it emits null for id 1 after that id has already matched rows in the right table.
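For context, a regular (non-windowed) outer join on unbounded streams in Flink produces an updating result: a left row with no match yet is emitted padded with null, and that row is retracted once a match arrives. The following is a minimal plain-Java sketch of that idea, not Flink's actual join operator. The class name `LeftJoinChangelogSketch`, the `+I`/`-D` string format, and the arrival order used in `main` are all assumptions made for illustration. It does not by itself explain a null that shows up after the matches, but it shows why a null-padded row for a matching id can legitimately appear in the output when the result is read as a plain sequence of rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LeftJoinChangelogSketch {

    // Simplified model of an updating left join; NOT Flink's real operator.
    // Both sides are kept in state, keyed by the join key.
    private final Map<String, List<String>> leftState = new HashMap<>();
    private final Map<String, List<String>> rightState = new HashMap<>();
    private final List<String> changelog = new ArrayList<>();

    // A left row arrives: join it with the right rows seen so far,
    // or emit it padded with null if there are none yet.
    public void onLeft(String id, String dept) {
        leftState.computeIfAbsent(id, k -> new ArrayList<>()).add(dept);
        List<String> matches = rightState.getOrDefault(id, List.of());
        if (matches.isEmpty()) {
            changelog.add("+I[" + id + ", " + dept + ", null]");
        } else {
            for (String employee : matches) {
                changelog.add("+I[" + id + ", " + dept + ", " + employee + "]");
            }
        }
    }

    // A right row arrives: if it is the first match for this key,
    // retract the null-padded rows before emitting the joined rows.
    public void onRight(String id, String employee) {
        List<String> seen = rightState.computeIfAbsent(id, k -> new ArrayList<>());
        boolean firstMatch = seen.isEmpty();
        seen.add(employee);
        for (String dept : leftState.getOrDefault(id, List.of())) {
            if (firstMatch) {
                changelog.add("-D[" + id + ", " + dept + ", null]");
            }
            changelog.add("+I[" + id + ", " + dept + ", " + employee + "]");
        }
    }

    public List<String> changelog() {
        return changelog;
    }

    public static void main(String[] args) {
        LeftJoinChangelogSketch join = new LeftJoinChangelogSketch();
        // Assumed arrival order: the left row for "id 1" arrives first.
        join.onLeft("id 1", "Dept 1");
        join.onRight("id 1", "Employee 1");
        join.onRight("id 1", "Employee 3");
        join.changelog().forEach(System.out::println);
    }
}
```

With this assumed arrival order, the model first emits the null-padded row for id 1, then a `-D` entry that retracts it, then the two joined rows. If the real job's output is written or printed without the change flag, a null-padded row and its retraction look identical, so it may be worth checking the row kind of each emitted record before concluding the join is wrong.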

Table joinResult = streamTableEnv.sqlQuery(
    "SELECT join_source_1.id, join_source_1.dept, join_source_2.employeeName "
        + "FROM join_source_1 "
        + "LEFT JOIN join_source_2 ON join_source_1.id = join_source_2.id");

I expect a non-null value in the right table's column once a record has been matched.
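To make the expectation concrete, here is a small plain-Java sketch of what a left join over the complete sample tables above would return if both inputs were fully available at once. It does not use Flink; the class name `BoundedLeftJoinSketch` and the `leftJoin` helper are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BoundedLeftJoinSketch {

    // Left join over complete inputs: each left row is paired with every
    // right row that has the same id, or with null if there is none.
    public static List<String> leftJoin(String[][] left, String[][] right) {
        List<String> out = new ArrayList<>();
        for (String[] l : left) {
            boolean matched = false;
            for (String[] r : right) {
                if (l[0].equals(r[0])) {
                    out.add(l[0] + ", " + l[1] + ", " + r[1]);
                    matched = true;
                }
            }
            if (!matched) {
                out.add(l[0] + ", " + l[1] + ", null");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample rows from Table 1 and Table 2 in the question.
        String[][] table1 = {
            {"id 1", "Dept 1"}, {"id 2", "Dept 2"},
            {"id 3", "Dept 3"}, {"id 4", "Dept 4"}
        };
        String[][] table2 = {
            {"id 1", "Employee 1"}, {"id 2", "Employee 2"},
            {"id 1", "Employee 3"}, {"id 3", "Employee 4"}
        };
        leftJoin(table1, table2).forEach(System.out::println);
    }
}
```

On the sample data the only null-padded row is the one for id 4, which has no employee. There is no null row for id 1.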

The data is consumed from a Kafka streaming source.

Thank you in advance for the help.


Solution

  • I've only seen unexpected results like this from Kafka in situations where a too-short retention policy kicks in and removes the data that was previously available.