sqlgroup-bydb2highest

db2 query top group by


I've been trying for hours but can't ge the query, i want to do the following using DB2. From table Company and Users I have the following tickets quantity info per company/user

QUERY USING:
SELECT T.USER, COUNT(T.USER) AS QUANITTY, T.COMPANY FROM TICKET T
INNER JOIN COMPANY P ON P.COMPANY = T.COMPANY
GROUP BY (T.USER, T.COMPANY) ORDER BY QUANTITY DESC 

Outcome is:

user         company      quantity
----------------------------------
mark         nissn        300
tom          toyt         50
steve        kryr         80
mark         frd          20    
tom          toyt         120    
jose         toyt         230    
tom          nissn        145    
steve        toyt         10    
jose         kryr         35    
steve        frd          100

THIS SHOULD BE THE RESULT(Top user per company)

user         company      quantity
----------------------------------
mark         nissn        300    
jose         toyt         230    
steve        frd          100    
steve        kryr         80

as you can see there are many users in a company and each have different quantities per company, the result should get the user with the highest quantity per company. i.e. : Company nissn it has 2 users and each has (mark with 300) and (tom with 145)

so it should give me the highest user which would be mark with 300. The same would be for toyt, frd, kryr. I need all of them in a query.
I wonder if that's possible in a query or I will need to create a stored procedure.


Solution

  • You can do this with analytic queries. But be careful. The pattern usually works out to involve nested subqueries. (One to produce a dataset, the next to add it to the pattern, the third to select out the rows you want.)

    In this case it should look something like this.

    Original query.

    SELECT T.USER, COUNT(T.USER) AS QUANTITY, T.COMPANY
    FROM TICKET T
      JOIN COMPANY P
        ON P.COMPANY = T.COMPANY
    GROUP BY (T.USER, T.COMPANY)
    

    Analytic query. (Note that the s is to name the subquery. I have not used DB2, but the standard strictly doesn't prevent them to be dropped, and I know at least one database that requires them.)

    SELECT user, quantity, company
      , RANK () OVER (PARTITION BY company ORDER BY quantity DESC) as r
    FROM ( ... previous query ... ) s
    

    Final result.

    SELECT user, quantity, company
    FROM ( ... previous query ... ) t
    WHERE r = 1
    

    The combined query is:

    SELECT user, quantity, company
    FROM (
        SELECT user, quantity, company
          , RANK () OVER (PARTITION BY company ORDER BY quantity DESC) as r
        FROM (
            SELECT T.USER, COUNT(T.USER) AS QUANTITY, T.COMPANY
                FROM TICKET T
                  JOIN COMPANY P
                    ON P.COMPANY = T.COMPANY
                GROUP BY (T.USER, T.COMPANY)
          ) s
      ) t
    WHERE r = 1
    

    As I say I have not used DB2. But according to the SQL standard, that query should work.