I'm trying to create a summary report of our orders but having trouble extracting all my required data in a single query.
The data I'd like to extract:
Orders table (simplified for this example)
| orderId | deliveryTotal | total |
|---------|---------------|-------|
| 1 | 5 | 15 |
| 2 | 5 | 15 |
| 3 | 7.50 | 27.50 |
Order items table
| orderItemId | orderId | productId | salePrice | quantity |
|-------------|---------|-----------|-----------|----------|
| 1 | 1 | 1 | 10 | 1 |
| 2 | 2 | 1 | 10 | 1 |
| 3 | 3 | 1 | 10 | 1 |
| 4 | 3 | 2 | 10 | 1 |
My current query for extracting this data is
SELECT
SUM(i.salePrice * i.quantity) as subtotal,
SUM(DISTINCT o.deliveryTotal) as deliveryTotal,
COUNT(DISTINCT o.orderId) as orders,
SUM(i.quantity) as quantity
FROM orderItems i
INNER JOIN orders o ON o.orderId = i.orderId
Which results in a correct subtotal, order count and quantity sum. But delivery total is returned as 12.50 when I'm after 17.50. If I do SUM(o.deliveryTotal)
it will return 25.
EDIT: Desired results
| subtotal | deliveryTotal | orders | quantity |
|----------|---------------|--------|----------|
| 40.00 | 17.50 | 3 | 4 |
Because of the join, the SUM(DISTINCT deliveryTotal)
aggregate is being applied to a rowset including the values 5, 5, 7.5, 7.5
(distinct 5 + 7.5 = 12.5
).
The rows your SUM()
acted on become more apparent if you simply do
SELECT o.*
FROM orderItems i
INNER JOIN orders o ON o.orderId = i.orderId
Instead you are asking for the SUM()
of all the values in deliveryTotal
, irrespective of their position in the join with orderItems
. That means you need to apply the aggregate at a different level.
Since you are not intending to add a GROUP BY
later, the easiest way to do that is to use a subselect whose purpose is only to get the SUM()
across the whole table.
SELECT
SUM(i.salePrice * i.quantity) as subtotal,
-- deliveryTotal sum as a subselect
(SELECT SUM(deliveryTotal) FROM orders) as deliveryTotal,
COUNT(DISTINCT o.orderId) as orders,
SUM(i.quantity) as quantity
FROM orderItems i
INNER JOIN orders o ON o.orderId = i.orderId
Subselects are usually discouraged but there won't be a significant performance penalty for the subselect, none different from the alternative methods of using a join for it. The calculation has to be done on a separate aggregate from the existing join no matter what. Other methods would place a subquery CROSS JOIN
in the FROM
clause, which performs the same thing we placed here in the subselect. Performance would be the same.