sqlpostgresql

Only display the latest available value per customer based on a certain date


DB-Fiddle

CREATE TABLE vouchers (
    id SERIAL PRIMARY KEY,
    customer VARCHAR,
    collected_date DATE,
    balance INT
);

INSERT INTO vouchers
(customer, collected_date, balance
)
VALUES 
('customer_a', '2024-03-09', '1600'),
('customer_a', '2024-04-20', '2700'),
('customer_a', '2024-05-24', '3100'),
('customer_a', '2025-03-09', '1500'),
('customer_a', '2025-04-20', '400'),
('customer_a', '2025-05-24', '0'),
('customer_b', '2024-10-17', '200'),
('customer_b', '2025-02-15', '1000'),
('customer_b', '2025-10-17', '800'),
('customer_b', '2025-11-18', '950'),
('customer_b', '2026-02-15', '1150'),  
('customer_b', '2026-11-18', '0');

Expected Result:

customer selected_date balance
customer_a 2025-03-31 1500
customer_b 2025-03-31 1000

In the expected result I want to display only the latest available balance per customer based on a certain date.
In this example 2025-03-31.

So far I have been able to develop this query:

select
    customer as customer,
    collected_date as collected_date,
    sum(balance) as balance,
    max(collected_date) over (partition by customer) AS max_date
from vouchers
where collected_date < '2025-03-31' 
group by 1, 2
order by 1, 2 desc;

It almost brings me to the correct result. However, I have not been able to make it work fully.
What do I need to change to get the expected result?


Solution

  • You can use DISTINCT ON and ORDER BY collected_date DESC

    Thus, you will get the latest row per customer only that satisfies your WHERE clause.

    SELECT
      DISTINCT ON (customer)
      customer, 
      '2025-03-31' AS selected_date, 
      balance
    FROM
      vouchers
    WHERE collected_date < '2025-03-31'
    ORDER BY 
      customer, 
      collected_date DESC;
    

    See this db<>fiddle with your sample data.