sqlpostgresqlaggregate-filter

Conditional count of rows where at least one peer qualifies


Background

I'm a novice SQL user. Using PostgreSQL 13 on Windows 10 locally, I have a table t:

+--+---------+-------+
|id|treatment|outcome|
+--+---------+-------+
|a |1        |0      |
|a |1        |1      |
|b |0        |1      |
|c |1        |0      |
|c |0        |1      |
|c |1        |1      |
+--+---------+-------+

The Problem

I didn't explain myself well initially, so I've rewritten the goal.

Desired result:

+-----------------------+-----+
|ever treated           |count|
+-----------------------+-----+
|0                      |1    |
|1                      |3    |
+-----------------------+-----+

First, identify id that have ever been treated. Being "ever treated" means having any row with treatment = 1.

Second, count rows with outcome = 1 for each of those two groups. From my original table, the ids who are "ever treated" have a total of 3 outcome = 1, and the "never treated", so to speak, have 1 `outcome = 1.

What I've tried

I can get much of the way there, I think, with something like this:

select treatment, count(outcome)
from t
group by treatment;

But that only gets me this result:

+---------+-----+
|treatment|count|
+---------+-----+
|0        |2    |
|1        |4    |
+---------+-----+

Solution

  • For the updated question:

    SELECT ever_treated, sum(outcome_ct) AS count
    FROM  (
       SELECT id
            , max(treatment) AS ever_treated
            , count(*) FILTER (WHERE outcome = 1) AS outcome_ct
       FROM   t
       GROUP  BY 1
       ) sub
    GROUP  BY 1;
    
     ever_treated | count 
    --------------+-------
                0 |     1
                1 |     3
    

    db<>fiddle here

    Read:

    Would be simpler and faster with proper boolean values instead of integer.