postgresqldistinct-values

PostgreSQL: count unique values from multiple columns


Lets have a table with two columns [col1, col2] with some values. values in col1 and col2 can be repeated. I would like to get number of unique values from both columns.

select
   count(distinct col1) as col1_unique,
   count(distinct col2) as col2_unique,
   count(distinct (col1, col2)) as total_unique
from myTable

that returns total_unique as combination of col1, col2 which is always bigger then a sum of col1_unique and col2_unique

as example: table with rows:

1 1
1 2
1 3
2 1
2 2
2 2

should return col1_unique as 2, col2_unique as 3 and total_unique 3

I can add a select for col1 and col2 and then distinct values from the select, but is there better (nicer) way to solve the task?


Solution

  • A tagged union of the column values and then conditional aggregation would look and read very well. Not necessarily very efficient though.

    select count(distinct col) filter (where tag = 1),
           count(distinct col) filter (where tag = 2),
           count(distinct col)
    from (
      select col1 as col, 1 as tag from the_table
      union all
      select col2, 2 from the_table
    ) t;
    

    Demo