I made this exercise on SQL server: write a query that lists for each cluster the quantity of products that fall within it. The company wants to obtain an analysis of sales with respect to the average quantity of each product present in each order, classifying them into six clusters: Q1 (<15), Q2 (15-20), Q3 (21-25), Q4 (26-30), Q5 (31-35), Q6(>35). Write a query that lists, for each product, the product name and the cluster to which it belong. The database is northwind
select count(ProductName) as prod_num ,cluster
from (
select ProductName,
case
when avg(Quantity) < 15 then 'Q1'
when avg(Quantity) <= 20 then 'Q2'
when avg(Quantity) between 21 and 25 then 'Q3'
when avg(Quantity) between 26 and 30 then 'Q4'
when avg(Quantity) between 31 and 35 then 'Q5'
else 'Q6'
end
as cluster
from [Order Details] od join Products pr on od.ProductID=pr.ProductID
group by ProductName
) as clusters
group by cluster
order by cluster
OUTPUT
22 Q2
35 Q3
18 Q4
2 Q6
I also need to display values for Q1 and Q5.
You can always seed your initial counts, for example:
declare @clusters table (prod_num int, cluster nchar(2));
insert into @clusters values
(0, 'Q1'),(0, 'Q2'),(0, 'Q3'),(0, 'Q4'),(0, 'Q5'),(0, 'Q6');
select
t1.cluster,
t1.prod_num + isnull(t2.prod_num, 0) as prod_num
from
@clusters t1
left join
(
select count(ProductName) as prod_num ,cluster
from (
select ProductName,
case
when avg(Quantity) < 15 then 'Q1'
when avg(Quantity) between 15 and 20 then 'Q2'
when avg(Quantity) between 21 and 25 then 'Q3'
when avg(Quantity) between 26 and 30 then 'Q4'
when avg(Quantity) between 31 and 35 then 'Q5'
else 'Q6'
end
as cluster
from [Order Details] od join Products pr on od.ProductID=pr.ProductID
group by ProductName
) as clusters
group by cluster
) t2
on t1.cluster = t2.cluster
order by t1.cluster;
Now we have an initial count of zero for all groups and add to that the counts we found in our query.
Untested so please let me know if you find errors...