sqlsql-server

SQL to Group By top n occurring SKUs


I have a table like so called billing

sku_code | invoice_id | sku_amount | sku_sale
integer  | integer    | float      | float

What I want do is first find the top 350 sku_code by sku_amount

SELECT TOP 350 Sum(sku_amount) AS amt, 
       sku_code
FROM   billing 
GROUP  BY sku_code
ORDER  BY amt DESC 

I then want to slice the entire table by the table by the 350 sku_code that appear above

I assume this would be some sort of inner-join via a sub-query but I can't figure out the syntax.

The query I want to finally run is but with it only returning the 350 sku_codes from the above query.

SELECT sum(sku_amount) as amt,sku_code, invoice_id
from billing 

group by sku_code, invoice_id

order by amt DESC

This is giving me a table that will look like which should have around 20-30 million rows.

amt | sku_code | invoice_id

Solution

  • If I understand correctly, you want the original rows for the top 350 skus. A JOIN should suffice:

    SELECT b.*
    FROM billing b JOIN
         (SELECT TOP 350 Sum(sku_amount) AS amt, 
                 sku_code
          FROM billing 
          GROUP BY sku_code
          ORDER BY amt DESC 
         ) s
         ON s.sku_code = b.sku_code