How would you put together a SQL query for data.stackexchange that will display the most active users (in terms of answers given) for a tag in a given location?
EG. something similar to the top 30 listed here https://stackoverflow.com/tags/ruby-on-rails-3/topusers but location specific.
So top Ruby Answerers in the last 30 days in Berlin etc
Thanks!
So, after looking at the database schema this is the query I came up with.
-- Top 10 Ruby Answerers in the last 30 days in Berlin based on score
select top 10
u.displayname,
number_of_answers = count(*),
total_score = sum(p.score)
from
users u
join
posts p on p.owneruserid = u.id -- joined to get answer posts
join
posts pp on p.parentid = pp.id -- post parent is the question
join
posttags pt on pt.postid = pp.id -- tags for post parent
join
tags t on t.id = pt.tagid -- tags for tag name
where
t.tagname like '%ruby%' -- tags to filter for
and -- includes everything ruby inc. rails
p.creationdate > (getdate()-30) -- past 30 days
and
u.location like '%Berlin%' -- locations differ in format
group by
u.displayname
order by
3 desc; -- order by total score for "best" answerers
-- order by 2 (count) to get most active
I'm not an expert in the data explorer schema so it's possible that the query isn't quite correct and there are some caveats: the date filter applies to the question and not the answer so it's possible that there are users with more answers in the last 30 days overall if they have answered older questions, and also, the location is a really unreliable field as many users haven't specified location at all. It's probably as close as it's possible to get though.
The data explorer isn't that hard to use - experiment a bit with it and you'll realize how the tables are connected. It's a good exercise :)