sqlpostgresqlcountpostgresql-9.6aggregate-filter

Get conditional count and conditional DISTINCT count in a single SELECT


I have 3 tables:

TABLE: session_log
 user_id | device | logged_on
---------|--------|---------------------
 1       | web    | 2022-01-01 12:43:25
 1       | web    | 2022-01-01 13:33:32
 2       | mobile | 2022-01-01 18:20:18
 1       | mobile | 2022-01-01 08:22:41
 2       | web    | 2022-01-01 09:10:16
 3       | web    | 2022-01-01 07:52:21
 1       | web    | 2022-01-02 10:42:14

TABLE: standard_users
 user_id | username
---------|-----------
 1       | adam
 2       | jennifer

TABLE: admin_users
 user_id | username
---------|-----------
 3       | george

I would like to get counts of number of unique non-admin(standard) users for each day and each device type (mobile or web). Another note to mention is if a user logged into both web and mobile on the same day, I would still like to include them in the counts for both device types but no more than once for each device for that day.

From the above table records, The resulting counts I would like to get should look like the following:

 login_date | mobile_count | web_count
------------|--------------|-----------
 2022-01-01 | 2            | 2
 2022-01-02 | 0            | 1

Currently I have a query made that doesn't take the unique users condition into account, and I'm having trouble figuring out how to modify it to also factor in the unique user ids per count.

SELECT
    s.logged_on::date AS login_date,
    sum(CASE WHEN s.device = 'mobile' THEN 1 ELSE 0 END) AS mobile_count,
    sum(CASE WHEN s.device = 'web' THEN 1 ELSE 0 END) AS web_count,
FROM session_log s
LEFT JOIN standard_users su ON su.user_id = s.user_id 
WHERE su.user_id IS NOT NULL
GROUP BY login_date;

The above query that I have so far currently comes up with 3 for web_count when run on the sample data above because it isn't counting the uniqueness of the user_id so it's counting the record with user_id of 1 twice for January 1st.

Is there a way to modify my query to also factor in the uniqueness of the user_id when performing each of the sums?


Solution

  • Use the aggregate FILTER clause. Then you can combine your count with DISTINCT:

    SELECT s.logged_on::date AS login_date
         , count(*)                FILTER (WHERE s.device = 'mobile') AS mobile_count
         , count(DISTINCT user_id) FILTER (WHERE s.device = 'web') AS web_count
    FROM   session_log s
    JOIN   standard_users su USING (user_id)
    GROUP  BY login_date;
    

    See:

    I also simplified your twisted formulation with LEFT JOIN and then IS NOT NULL. Boils down to a plain JOIN.

    If referential integrity between session_log.user_id and standard_users.user_id is enforced with a FK constraint, and standard_users.user_id is defined UNIQUE or PK - as seems reasonable - you can drop the JOIN completely:

    SELECT logged_on::date AS login_date
         , count(*)                FILTER (WHERE device = 'mobile') AS mobile_count
         , count(DISTINCT user_id) FILTER (WHERE device = 'web') AS web_count
    FROM   session_log
    GROUP  BY 1;