sqlpostgresqljsonb

Compare entire rows of data based on a single matching column value


I have the following Postgres SQL table called sval: db<>fiddle

create table sval(id,closing_price,opening_price,ISIN,closing_fx,market_cap)
as values(104, 55.3, 44, 'KKJJ102',0,0)
        ,(432, 99  , 77, 'JJII333',0,0)
        ,(444, 44  , 33, 'KKJJ102',0,0)
        ,(888, 33  , 41, 'JJEOD23',0,0)
        ,(422, 99  , 77, 'JJII333',0,0)
        ,(222, 33  , 41, 'JJEOD23',0,0);

What I am trying to do is query for matching ISINs, and then comparing the entire rows for the rows that have a matching ISIN.

So in the example above it would be comparing ids:

And I want to return id's of the rows where the ISIN is matching, but the other column values are different. In the example above rows 104 and 444 would be returned, since their ISIN's match but the other values do not match.

However, rows 432 matches with 422 but their rows values are exactly the same. Same with 888 and 222.

It is guaranteed that there is ALWAYS exactly two ISINs that match.


Solution

    1. Self-join the table using(ISIN).
    2. to_jsonb(sval) to convert the whole row.
    3. -'id' to get rid of the identifier.
    4. <> to keep rows that differ on any other column than the id you just got rid of.
    5. Take only distinct ones, using least() and greatest() to get ordered pairs.

    Nice thing about this is that you don't need to know or list the column names, which sort of emulates select * exclude(col) type of syntax.
    demo at db<>fiddle

    select distinct 
           least(a.id, b.id)    as id1
         , greatest(a.id, b.id) as id2
    from sval as a 
    join sval as b using(isin)
    where (to_jsonb(a)-'id')<>(to_jsonb(b)-'id');
    
    id1 id2
    104 444

    You can also trade distinct for a different join condition that spits out sorted pairs:

    select a.id, b.id
    from sval as a 
    join sval as b 
      on a.isin=b.isin
     and a.id<b.id --this makes sure lower one's on the left
     and (to_jsonb(a)-'id')<>(to_jsonb(b)-'id');
    

    If you prefer to get them vertically instead of paired up:

    select a.id, a.isin
    from sval as a 
    join sval as b using(isin)
    where(to_jsonb(a)-'id')<>(to_jsonb(b)-'id');
    

    Here are alternatives that also give you a diff:

    select a.id, b.id, ldiff, rdiff
    from sval as a 
    join sval as b 
      on a.isin=b.isin
     and a.id<b.id
     and (to_jsonb(a)-'id')<>(to_jsonb(b)-'id')
    cross join lateral
      (select jsonb_object_agg(k,v1) as ldiff
            , jsonb_object_agg(k,v2) as rdiff
        from jsonb_each(to_jsonb(a))a(k,v1)
        join jsonb_each(to_jsonb(b))b(k,v2)using(k)
        where v1<>v2
          and k<>'id') as d;
    
    id id ldiff rdiff
    104 444 {"closing_price": 55.3, "opening_price": 44} {"closing_price": 44, "opening_price": 33}
    777 999 {"closing_price": 77, "opening_price": 77} {"closing_price": 99, "opening_price": 99}

    Vertical:

    select a.id, a.isin, diff
    from sval as a 
    join sval as b using(isin)
    cross join lateral
      (select jsonb_object_agg(k,v1) as diff
        from jsonb_each(to_jsonb(a))a(k,v1)
        join jsonb_each(to_jsonb(b))b(k,v2)using(k)
        where v1<>v2
          and k<>'id') as d
    where(to_jsonb(a)-'id')<>(to_jsonb(b)-'id');
    
    id isin diff
    104 KKJJ102 {"closing_price": 55.3, "opening_price": 44}
    444 KKJJ102 {"closing_price": 44, "opening_price": 33}
    777 XXXXXXX {"closing_price": 77, "opening_price": 77}
    999 XXXXXXX {"closing_price": 99, "opening_price": 99}