I have a some data in the following format:
Bus | Route |
---|---|
Slowcoach | SC123 |
Slowcoach | SC123 |
Slowcoach | SC123 |
Slowcoach | SC555 |
Slowcoach | SC555 |
Slowcoach | SC555 |
Slowcoach | SC555 |
Slowcoach | SC111 |
SpeedyTram | ST111 |
SpeedyTram | ST111 |
SpeedyTram | ST222 |
SpeedyTram | ST222 |
SpeedyTram | ST222 |
SpeedyTram | ST222 |
SpeedyTram | ST333 |
SpeedyTram | ST444 |
I want to count the Routes and then show only the most popular 2, grouped by Bus :
Bus | Route | Count |
---|---|---|
Slowcoach | SC555 | 4 |
Slowcoach | SC123 | 3 |
SpeedyTram | ST222 | 4 |
SpeedyTram | ST111 | 2 |
I have the following so far:
SELECT Bus, Route, COUNT(Route)
FROM my_table
GROUP BY Bus, Route
ORDER BY Bus, COUNT DESC
I have looked at Rank / Partition /Limit but I can't get the COUNT field to work, nor can I work out the correct syntax.
Group and count (ti
) then order/assign position by bus (tx
) and select these with position <= 2.
select bus, route, cnt
from
(
select *, row_number() over (partition by bus order by cnt desc) r
from
(
select bus, route, count(*) cnt
from the_table
group by bus, route
) ti
) tx
where r <= 2;
The same using ti
and tx
as CTEs, maybe cleaner and more readable:
with ti as
(
select bus, route, count(*) cnt from the_table group by bus, route
),
tx as
(
select *, row_number() over (partition by bus order by cnt desc) r from ti
)
select bus, route, cnt from tx
where r <= 2;
Edit
If you need the result with ties then use dense_rank()
instead of row_number()
.