I have two tables which have a common key between them, and quite a lot of other important infos ; for the sake of simplicity i will be using Combination A and Combination B. When a combination is met, whichever table has the maximum number of records should be the source where i collect the information ; in this case say IDs. The priority when counts are same is Table1.
COMMONKEY column is the combination/join condition in my tables.
(Table 1)
SELECT '123' table1_id,'Comb A' commonkey from dual UNION
SELECT '124' table1_id,'Comb A' commonkey from dual UNION
SELECT '125' table1_id,'Comb A' commonkey from dual UNION
SELECT '126' table1_id,'Comb A' commonkey from dual UNION
SELECT '215' table1_id,'Comb B' commonkey from dual UNION
SELECT '216' table1_id,'Comb B' commonkey from dual UNION
SELECT '559' table1_id,'Random Combination 1' commonkey from dual UNION
SELECT '560' table1_id,'Random Combination 2' commonkey from dual ;
( Table 2 )
SELECT 'abc1' table2_id,'Comb A' commonkey from dual UNION
SELECT 'abc2' table2_id,'Comb A' commonkey from dual UNION
SELECT 'abc3' table2_id,'Comb A' commonkey from dual UNION
SELECT 'abc4' table2_id,'Comb A' commonkey from dual UNION
SELECT 'xyz1' table2_id,'Comb B' commonkey from dual UNION
SELECT 'xyz2' table2_id,'Comb B' commonkey from dual UNION
SELECT 'xyz3' table2_id,'Comb B' commonkey from dual UNION
SELECT 'xyz2' table2_id,'Comb B' commonkey from dual UNION
SELECT '416abc1' table2_id,'Random Combination 91' commonkey from dual UNION
SELECT '416abc2' table2_id,'Random Combination 92' commonkey from dual;
Result Set Expected :
ID COMMONKEY
123 Comb A
124 Comb A
125 Comb A
126 Comb A
xyz1 Comb B
xyz2 Comb B
xyz3 Comb B
559 Random Combination 1
560 Random Combination 1
416abc1 Random Combination 91
416abc2 Random Combination 92
( the image shows a screenshot of the trail data in an excel; The Requirement and Strategy are color mapped to make it quickly understandable )
I need to generate the result set using SQL as follows :
When table1.commonkey = table2.commonkey hits, I need to-
Edit : I've initially gone on routes with
a left join b where b.key IS null ;
a full outer join b where b.key IS NULL or a.key is NULL ;
to achieve workarounds with A-B or B-A result sets but both these approaches were quite wrong. Gathering Delta sets or Exclusion sets didnt go well.
Here's one option; see comments within code
SQL> with
2 -- sample data
3 a (id, ckey) as
4 (select '123', 'ca' from dual union all
5 select '124', 'ca' from dual union all
6 select '125', 'ca' from dual union all
7 select '126', 'ca' from dual union all
8 select '215', 'cb' from dual union all
9 select '216', 'cb' from dual union all
10 select '551', 'r1' from dual union all
11 select '552', 'r2' from dual
12 ),
13 b (id, ckey) as
14 (select 'abc1', 'ca' from dual union all
15 select 'abc2', 'ca' from dual union all
16 select 'abc3', 'ca' from dual union all
17 select 'abc4', 'ca' from dual union all
18 select 'xyz1', 'cb' from dual union all
19 select 'xyz2', 'cb' from dual union all
20 select 'xyz3', 'cb' from dual union all
21 select '9991', 'r3' from dual union all
22 select '9992', 'r4' from dual
23 ),
24 -- count rows per each CKEY (common key)
25 tempa as
26 (select id, ckey, count(*) over (partition by ckey) cnt
27 from a
28 ),
29 tempb as
30 (select id, ckey, count(*) over (partition by ckey) cnt
31 from b
32 )
33 -- final query
34 select distinct
35 case when a.cnt >= b.cnt then a.id
36 else b.id
37 end id,
38 a.ckey
39 from tempa a join tempb b on b.ckey = a.ckey
40 union all
41 select ckey, id from a
42 where not exists (select null from b where a.ckey = b.ckey)
43 union all
44 select ckey, id from b
45 where not exists (select null from a where a.ckey = b.ckey)
46 order by 1, 2;
which results in
ID CKEY
---- -----
r1 551
r2 552
r3 9991
r4 9992
xyz1 cb
xyz2 cb
xyz3 cb
123 ca
124 ca
125 ca
126 ca
11 rows selected.
SQL>