I’m fairly proficient at SQL, but this question has had me stumped for quite a while now. In the most basic sense, there are simply two tables:
Items
+----+--------+
| id | title  |
+----+--------+
|  1 | socks  |
|  2 | banana |
|  3 | watch  |
|  4 | box    |
|  5 | shoe   |
+----+--------+
...and the prices table:
Prices
+---------+-----------+-------+------------+
| item_id | vendor_id | price | created_at |
+---------+-----------+-------+------------+
|       1 |         1 |  5.99 | Today      |
|       1 |         2 |  4.99 | Today      |
|       2 |         1 |  6.99 | Today      |
|       2 |         2 |  6.99 | Today      |
|       1 |         1 |  3.99 | Yesterday  |
|       1 |         1 |  4.99 | Yesterday  |
|       2 |         1 |  6.99 | Yesterday  |
|       2 |         2 |  6.99 | Yesterday  |
+---------+-----------+-------+------------+
(Please note: created_at is actually a timestamp; the words “Today” and “Yesterday” are used merely to convey the concept quickly.)
My goal is to get a simple result back containing the inventory item associated with the most recent, lowest price, including the reference to the vendor_id who is providing said price.
However, the stumbling block appears to be the sheer number of requirements the statement (or statements) must handle.
It seems simple, but I’ve found this problem to be uncannily difficult.
As a note, I’m using Postgres, so all the fanciness it provides is available for use (e.g., window functions).
Much simpler with DISTINCT ON in Postgres.
The current (most recent) price per item for each vendor:
SELECT DISTINCT ON (p.item_id, p.vendor_id)
       i.title, p.price, p.vendor_id
FROM   prices p
JOIN   items  i ON i.id = p.item_id
ORDER  BY p.item_id, p.vendor_id, p.created_at DESC;
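The lowest current price for each item, together with the vendor offering it:
SELECT DISTINCT ON (item_id)
       i.title, p.price, p.vendor_id  -- add more columns as you need
FROM  (
   -- inner query: most recent price row per (item, vendor)
   SELECT DISTINCT ON (item_id, vendor_id)
          item_id, price, vendor_id   -- add more columns as you need
   FROM   prices p
   ORDER  BY item_id, vendor_id, created_at DESC
   ) p
JOIN   items i ON i.id = p.item_id
ORDER  BY item_id, price;             -- outer query: cheapest of those per item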
Detailed explanation:
Select first row in each GROUP BY group?
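Since the question mentions window functions, the same two-step logic can also be sketched with row_number(). This is only an illustrative alternative to the DISTINCT ON queries above, not a replacement for them. The CTE names latest and cheapest are made up for this example, and created_at is assumed to be a real timestamp column:

```sql
-- Sketch only: window-function equivalent of the nested DISTINCT ON query.
-- "latest" and "cheapest" are hypothetical names chosen for this example.
WITH latest AS (
   -- Step 1: number the price rows per (item, vendor), newest first
   SELECT item_id, vendor_id, price
        , row_number() OVER (PARTITION BY item_id, vendor_id
                             ORDER BY created_at DESC) AS rn
   FROM   prices
   )
, cheapest AS (
   -- Step 2: among the newest rows only, number them per item, cheapest first
   SELECT item_id, vendor_id, price
        , row_number() OVER (PARTITION BY item_id
                             ORDER BY price) AS rn
   FROM   latest
   WHERE  rn = 1
   )
SELECT i.title, c.price, c.vendor_id
FROM   cheapest c
JOIN   items i ON i.id = c.item_id
WHERE  c.rn = 1
ORDER  BY c.item_id;
```

With the sample data, this should return socks at 4.99 from vendor 2. For banana, both vendors are tied at 6.99, so which vendor is reported is arbitrary unless a tie-breaker such as vendor_id is added to the second ORDER BY; the same applies to the DISTINCT ON version. Items without any price row (watch, box, shoe) are dropped by the inner join in both variants.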