sqlduplicatessnowflake-cloud-data-platformdistinct

Snowflake to return all records from duplicate values


Experts, I have the query below, which returns all duplicate values from Object 1 on Table 1 and adds all other relevant records that come with it.

Select A1.Obj1, AM.Obj2, AM.Obj3, AM.Obj4, A1.Count1
From (Select distinct A.Obj1, Count(A.Obj1) as Count1
  From Table1 AS A 
  Left Join Table2 AS B ON A.Obj2= B.Obj2
  Where A.Obj3 <> '1'
    And A.Obj4 <> '1'
    And A.Obj5 = '1'
    And Left (A.Obj2, 1) <> 'A'
    And (Not A.Obj1 Is NULL And B.Obj1 Is NULL)
Group By 1
Having Count1>1
Order By 2 desc) A1
Left Join Table1 as AM on A1.Obj1=AM.Obj1
  Where A.Obj3 <> '1'
    And A.Obj4 <> '1'
    And A.Obj5 = '1'
    And Left (A.Obj2, 1) <> 'A'

However I want to add another criteria, I want to return only the values where AM.Obj2 is different among each Obj1 as in the example below

Obj1 Obj2 Obj3 Obj4
A X 1 1
A Y 1 1

If in the table above, in both records Obj2 were to be either XX or YY it should be filtered out, I tried creating another subquery and a separate count but I cant get the logic right, how could I achieve this? Thank you!


Solution

  • You could use the following aggregation approach:

    SELECT t1.Obj1, t1.Obj2, t1.Obj3, t1.Obj4
    FROM Table1 t1
    INNER JOIN (
        SELECT Obj1, Obj3, Obj4
        FROM Table1
        GROUP BY Obj1, Obj3, Obj4
        HAVING MIN(Obj2) <> MAX(Obj2)
    ) t2
        ON t2.Obj1 = t1.Obj1 AND
           t2.Obj3 = t1.Obj3 AND
           t2.Obj4 = t1.Obj4;
    

    The subquery aliased as t2 above finds all (Obj1, Obj3, Obj4) tuples having the property that the two Obj2 values are not the same. We then join back to the original Table1 table to retrieve the entire rows.