sqlpostgresqlgreatest-n-per-groupsql-limit

Find first 3 orders for each customer


I am trying to find the first 3 timestamps for each customer.

Sample data for table customer_orders:

customer_id timestamp
6778 '2022-01-01'
6778 '2022-02-05'
5544 '2022-04-01'
6778 '2022-02-04'
5544 '2022-04-03'
5544 '2022-04-02'
5544 '2022-01-01'
6778 '2021-01-01'

Desired outcome:

customer_id timestamp
5544 '2022-01-01'
5544 '2022-04-01'
5544 '2022-04-02'
6778 '2021-01-01'
6778 '2022-01-01'
6778 '2022-04-02'

My query so far:

SELECT
    customer_id, 
    timestamp
FROM customer_orders
GROUP BY customer_id, timestamp
ORDER BY timestamp ASC
LIMIT 3

LIMIT 3 limits to 3 rows overall. But I want 3 rows per customer.


Solution

  • You can use the window function ROW_NUMBER() to numerate the columns of a given PARTITION (customer_id in your case) within a CTE and in the outer query just filter the n records from this generated column:

    WITH j AS (
      SELECT customer_id, timestamp,
        ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY timestamp
                           RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS n
      FROM customer_orders
    ) 
    SELECT customer_id, timestamp FROM j
    WHERE n <= 3 
    ORDER BY customer_id, timestamp
    

    Demo: db<>fiddle