sqlsql-serverdataexplorer

SEDE query for finding the top user in all tags


I've currently created the following query on SEDE to find the top user in every tag by combined answer and score count. It can be found here: Top user in all tags by score and answer count. However, at the moment it is bringing back multiple top users per tag, which is understandable because I haven't put a restriction on that yet.

Here is the query:

SELECT TOP 50
       t.TagName,
       a.OwnerUserId AS [User Link],
       SUM(a.Score) / 10 AS Score,
       COUNT(a.Score) AS [Count],
       ((SUM(a.Score) / 10) + COUNT(a.Score)) / 2 AS Total
FROM Posts a, 
     Posts q
     INNER JOIN PostTags qt ON q.Id = qt.PostId
     INNER JOIN Tags t ON t.Id = qt.TagId
WHERE a.ParentId = q.Id
      AND a.PostTypeId = 2
      AND a.CommunityOwnedDate IS NULL
      AND a.OwnerUserId IS NOT NULL
GROUP BY a.OwnerUserId, t.TagName
ORDER BY ((SUM(a.Score) / 10) + COUNT(a.Score)) / 2 DESC

How can I make it so it only returns the top user per tag?


Solution

  • I would wrap your query in a CTE (Common Table Expression), then compute the max score per tag on a second CTE, and finally just join both CTEs to get the top users per tag. The query should look like:

    with user_tag as ( -- score per user, per tag
      SELECT t.TagName,
           a.OwnerUserId AS [User Link],
           SUM(a.Score) / 10 AS Score,
           COUNT(a.Score) AS [Count],
           ((SUM(a.Score) / 10) + COUNT(a.Score)) / 2 AS Total
      FROM Posts a
      JOIN Posts q on a.ParentId = q.Id
      JOIN PostTags qt ON q.Id = qt.PostId
      JOIN Tags t ON t.Id = qt.TagId
      WHERE a.PostTypeId = 2
        AND a.CommunityOwnedDate IS NULL
        AND a.OwnerUserId IS NOT NULL
      GROUP BY a.OwnerUserId, t.TagName
    ),
    max_score as ( -- max score per tag
      select TagName, max(Total) as max_score
      from user_tag
      group by TagName
    )
    select 
        u.*
      from user_tag u
      join max_score m on m.TagName = u.TagName
                      and m.max_score = u.Total
    

    I did not include any ordering since I'm not sure how do you want the rows. Please consider this query will show multiple rows for the same tag, if there are multiple users tied in the first place with the same score.