sqlpostgresqlgroup-by

Get the top N rows by row count in GROUP BY


I'm querying a records table to find which users are my top record creators for certain record types. The basic starting point of my query looks something like this:

SELECT recordtype, createdby, COUNT(*)
FROM recordtable
WHERE recordtype in (...)
GROUP BY recordtype, createdby
ORDER BY recordtype, createdby DESC

But there are many users who have created records - I want to narrow this down further.

I added HAVING COUNT(*) > ..., but some record types only have a few records, while others have hundreds. If I do HAVING COUNT(*) > 10, I won't see that all 9 records of type "XYZ" were created by the same person, but I will have to scroll through every person that's created only 15, 30, 50, etc. of the 3,500 records of type "ABC."

I only want the top 5, 10, or so creators for each record type.

I've found a few questions that address the "select top N in group" part of the question, but I can't figure out how to apply them to what I need. The answers I could find are in cases where the "rank by" column is a value stored in the table, not an aggregate.

(Example: "what are the top cities in each country by population?", with data that looks like this:)

Country       City     Population
United States New York 123456789
United States Chicago  123456789
France        Paris    123456789

I don't know how to apply the methods I've seen used to answer that (row_number(), mostly) to get the top N by COUNT(*).


Solution

  • here is one way , to get top 10 rows in each group:

    select * from (
        select *, row_number() over (partition by recordtype order by cnt desc) rn
        from (
           SELECT recordtype, createdby, COUNT(*) cnt
           FROM recordtable
           WHERE recordtype in (...)
           GROUP BY recordtype, createdby
        ) t 
    ) t where rn <= 10