sqlrtransitive-closure-table

Data frame based on transitivity property of


I have a data frame as

A:    V1 V2  
      1   3    
      1   4    
      3   4    
      1   6
      6   5

I want output which satisfies transitive property on V1 and V2

B:    V1 V2 V3
       1  3  4   

Solution

  • The idea is you select one source and try to find the transitivity with two targets. if those are the same then you have the right combination.

    I add additional columns for debug purpose, but the query can be simplify a little bit more.

    SQL DEMO

    SELECT *
    FROM (
            SELECT source.[V1], source.[V2],
                   target1.[V1] as t1_v1,
                   target1.[V2] as t1_v2,
                   target2.[V1] as t2_v1,
                   target2.[V2] as t2_v2,
                   CASE WHEN source.[V1] = target1.[V1] 
                        THEN target1.[V2]
                        ELSE target1.[V1]
                   END as transitive1,
                   CASE WHEN source.[V2] = target2.[V2] 
                        THEN target2.[V1]
                        ELSE target2.[V2]
                   END as transitive2     
            FROM A as source
            JOIN A as target1
              ON      (source.[V1] = target1.[V1] OR source.[V1] = target1.[V2])
              AND NOT (source.[V1] = target1.[V1] AND source.[V2] = target1.[V2])
            JOIN A as target2    
              ON      (source.[V2] = target2.[V1] OR source.[V2] = target2.[V2])
              AND NOT (source.[V1] = target2.[V1] AND source.[V2] = target2.[V2])
         ) T
    WHERE T.transitive1 = T.transitive2
    

    OUTPUT

    enter image description here

    To get the result you want select the right columns and add aditional filter

    SELECT T.[V1] as [V1], 
           T.[V2] as [V2], 
           T.[transitive1] as [V3]
    
    ....
    
    WHERE T.[V1] > T.[V2]
      AND T.[V2] > T.[transitive1]
      AND T.transitive1 = T.transitive2