sqlpostgresqlgreatest-n-per-group

Postgres query table with max value in row with group by


Suppose I have the following data, and I want to know for each person what was their highest performing period, breaking ties in favor of more recent periods.

I would like to see ONLY the following rows:

5 (Alice's best row) 8 (Bob's best row) 9 (Charlie's best row) 11 (Danny's best, and only, row)

The stuff I have seen with window functions will bring back way more rows than I want.

Starting with something like

select employee_id, employee_name, quarter, year, 
max(sales_amount) over (partition by employee_id) 
from employeeperformance

This brings back all of the rows with the correct sales amount, but there is no practical way (yes could do join on the employee ID and the sales amount back to the original table and then order them by date and then cut out any tied rows, but there has to be a better way) to get back which period that is associated with, let alone with was the most recent period with that value.

Is there any way to get the 4 rows I am looking for?

INSERT INTO employeeperformance VALUES
(101,'Alice','Q2',2023,20000),
(101,'Alice','Q3',2023,20000),
(101,'Alice','Q4',2023,20000),
(101,'Alice','Q1',2024,20000),
(101,'Alice','Q2',2024,25000),
(102,'Bob','Q4',2023,15000),
(102,'Bob','Q1',2024,15000),
(102,'Bob','Q2',2024,30000),
(103,'Charlie','Q1',2024,28000),
(103,'Charlie','Q2',2024,22000),
(104,'Danny','Q2',2024,22000)

picture of data


Solution

  • DISTINCT ON makes this simple and fast:

    SELECT DISTINCT ON (employee_id)
           employee_id, employee_name, quarter, year, sales_amount
    FROM   employeeperformance
    ORDER  BY employee_id, sales_amount DESC, year DESC, quarter DESC;
    

    See:

    Aside: Your table seems in need of normalization.