sqlpostgresqlselectgroup-byaggregate-filter

Combine two queries to count distinct strings with different filters


I am trying to write a single Postgres query that will result in a table which looks like this:

source |yes_gap|no_gap|
-------|-------|------|
allivet|     29|    25|
amazon |    692|   255|

I've been able to write two separate queries, but haven't been able to figure out how to combine them into one.

Here's my query for product_gap='yes':

select 
source,
count(distinct(sku)) as yes_gap
from product_gaps where 
product_gap='yes' and
ingestion_date <= '2021-05-25' 
/* aggregate by source */
group by source

Result:

source |yes_gap|
-------|-------|
allivet|     29|
amazon |    692|

And here's my query for product_gap='no':

select 
source,
count(distinct(sku)) as no_gap
from product_gaps where 
product_gap='no' and
ingestion_date <= '2021-05-25' 
/* aggregate by source */
group by source

Result:

source |no_gap|
-------|------|
allivet|    25|
amazon |   255|

Can I get both counts in a single query?


Solution

  • Much faster and simpler with conditional aggregates using the aggregate FILTER clause:

    SELECT source
         , count(DISTINCT sku) FILTER (WHERE product_gap = 'yes') AS yes_gap
         , count(DISTINCT sku) FILTER (WHERE product_gap = 'no')  AS no_gap
    FROM   product_gaps
    WHERE  ingestion_date <= '2021-05-25'
    GROUP  BY source;
    

    See:

    Aside 1: DISTINCT is a key word, not a function. Don't add parentheses for the single column. distinct(sku) is short notation for DISTINCT ROW(sku). It happens to work because Postgres strips the ROW wrapper for a single column, but it's just noise.

    Aside 2: product_gap should probably be boolean.