An exercise I am working on presents this query written in BigQuery to teach how to use subqueries, CASE, and HAVING statements. I've learned about aliasing tables to simplify the query, but I don't understand why it is done so many times in this query?
SELECT
warehouse.warehouse_id,
CONCAT(warehouse.state, ': ', warehouse.warehouse_alias) AS warehouse_name,
COUNT(orders.order_id) AS number_of_orders,
(
SELECT
COUNT(*)
FROM
`coursera-461722.warehouse_orders.orders` AS orders
) AS total_orders,
CASE
WHEN COUNT(orders.order_id)/(SELECT COUNT(*) FROM `coursera-461722.warehouse_orders.orders` AS orders) <= 0.20
THEN 'Fulfilled 0-20% of Orders'
WHEN COUNT(orders.order_id)/(SELECT COUNT(*) FROM `coursera-461722.warehouse_orders.orders` AS orders) > 0.20
AND COUNT(orders.order_id)/(SELECT COUNT(*) FROM `coursera-461722.warehouse_orders.orders` AS orders) <= 0.60
THEN 'Fulfilled 21-60% of Orders'
ELSE 'Fulfilled more than 60% of Orders'
END AS fulfillment_summary
FROM
`coursera-461722.warehouse_orders.warehouse` AS warehouse
LEFT JOIN
`coursera-461722.warehouse_orders.orders` AS orders
ON orders.warehouse_id = warehouse.warehouse_id
GROUP BY
warehouse.warehouse_id,
warehouse_name
HAVING
COUNT(orders.order_id) > 0
The best I can guess is that the full table name and AS statement need to be written after every FROM statement, but elsewhere in the query such as in the COUNT and HAVING statements, the alias can simply be used. Thanks for any insight you can give!
Two things:
An alias only only applies to a single instance of the object it follows. So if the same object (e.g. table Orders
) is referenced multiple times in the same query, each one can be aliased. (And must be if the same object is referenced more than once in the same scope).
An alias only exists within the scope its created. So in your example below, the alias orders
only exists between the starting and ending brackets of the sub-query. In fact this alias is not used at all, so could be removed. To provide my point, change one of them to orders
and note that nothing changes.
(SELECT COUNT(*) FROM `coursera-461722.warehouse_orders.orders` AS orders)
As a best practice I suggest, short (1-4 chars), meaningful aliases, so I would use o
instead of orders
.
Note: There are better ways to obtain and use a count of the Orders
table multiple times, but thats out of scope for this question.