arraysclickhousecolumn-oriented

Comparing two arrays in ClickHouse rows


Any options to compare two arrays in ClickHouse?

There are two columns colA and colB, each contains an array.

If there any algorithm that compares arrays in colA and colB for each row in a ClickHouse table and sets colC value to 1 if arrays are equal, 0 if arrays are not equal?

For example:

colA                             |  colB                            | colC
---------------------------------|----------------------------------|-----
{555,571,701,707,741,1470,4965}  |  {555,571,701,707,741,1470,4965} |1
{555,571,701,707,741,1470,4965}  |  {555,571,701,707,741,1470,4964} |0

Solution

  • I asked the same question at ClickHouse Google Group and got this answer from Denis Zhuravlev:

    In the latest version of CH 18.1.0, 2018-07-23 (#2026):

    select [111,222] A,  [111,222] B, [111,333] C, A=B ab, A=C ac
    

    results in

    ┌─A─────────┬─B─────────┬─C─────────┬─ab─┬─ac─┐
    │ [111,222] │ [111,222] │ [111,333] │  1 │  0 │
    └───────────┴───────────┴───────────┴────┴────┘
    

    Before 18.1.0 you can use lambdas or something:

    SELECT 
       NOT has(groupArray(A = B), 0) ab
      ,NOT has(groupArray(A = C), 0) ac
      FROM
        (
          SELECT
             [111,222] A
            ,[111,222] B
            ,[111,333] C
        )
        ARRAY JOIN
             A
            ,B
            ,C 
    
    
    ┌─ab─┬─ac─┐
    │  1 │  0 │
    └────┴────┘