I'm new to Apache Ignite and I can't understand the behavior of the join clause in my case.
There is a cluster of 12 Apache Ignite nodes and I connect to one of them with sqlline.sh. There are two tables PUBLIC.TABLE1 and PUBLIC.TABLE2, both of them have records with string keys SMID = '1b024b28' and SMID = '36bd1084'.
select t.SMID
from PUBLIC.TABLE1 t
where t.SMID in ('1b024b28', '36bd1084');
| t.SMID |
|---|
| 1b024b28 |
| 36bd1084 |
select t.SMID
from PUBLIC.TABLE2 t
where t.SMID in ('1b024b28', '36bd1084');
| t.SMID |
|---|
| 1b024b28 |
| 36bd1084 |
but when I try to join these tables only one record matches even if I switch tables:
select t1.SMID, t2.SMID
from PUBLIC.TABLE1 t1 left join PUBLIC.TABLE2 t2 on t2.SMID = t1.SMID
where t1.SMID in ('1b024b28', '36bd1084');
| t1.SMID | t2.SMID |
|---|---|
| 1b024b28 | 1b024b28 |
| 36bd1084 |
select t2.SMID, t1.SMID
from PUBLIC.TABLE2 t2 left join PUBLIC.TABLE1 t1 on t2.SMID = t1.SMID
where t2.SMID in ('1b024b28', '36bd1084');
| t2.SMID | t1.SMID |
|---|---|
| 1b024b28 | 1b024b28 |
| 36bd1084 |
Seems like the second record is visible for direct select but invisible for join. What's the problem with it and how to fix it?
I was expecting to get this:
select t1.SMID, t2.SMID
from PUBLIC.TABLE1 t1 left join PUBLIC.TABLE2 t2 on t2.SMID = t1.SMID
where t1.SMID in ('1b024b28', '36bd1084');
| t1.SMID | t2.SMID |
|---|---|
| 1b024b28 | 1b024b28 |
| 36bd1084 | 36bd1084 |
The issue might be caused by the fact that SMID is not a part of the affinity key for the tables. In that case, the join is considered non-colocated, while, by default, Ignite treats each join as colocated. Due to that, the produced result set may be incorrect.
There are two options to resolve the issue:
You can find more details regarding that in the documentation here: data colocation and distributed joins.