I have a sample data like below:
Username | Activity | Start Time | End Time |
---|---|---|---|
Ace | Dancing | 13:00 | 14:00 |
Ace | Singing | 15:00 | 16:30 |
Ace | Yoga | 19:00 | 20:00 |
Alice | Piano | 10:00 | 11:00 |
Alice | Hiking | 14:00 | 15:00 |
Alice | Reading | 16:00 | 16:30 |
Alice | Swimming | 19:00 | 20:00 |
Alice | Writing | 21:00 | 21:30 |
Lion | Fishing | 13:00 | 17:00 |
I want to find the penultimate activity for each user, if only records one activity, then output that activity, that is, I would like to get the following output:
Username | Penultimate_Act |
---|---|
Ace | Singing |
Alice | Swimming |
Lion | Fishing |
I tried but it cannot get Lion
:
SELECT username, activity AS penultimate_act
FROM (
SELECT
username,
activity,
DENSE_RANK() OVER (PARTITION BY username ORDER BY starttime DESC) AS seq
FROM activities
) t
WHERE seq = 2;
Then I tried again:
SELECT username, activity AS penultimate_act
FROM (
SELECT
username,
activity,
DENSE_RANK() OVER (PARTITION BY username ORDER BY starttime DESC) AS seq
FROM activities
) t
WHERE seq = 2 or seq = 1;
But it still doesn't quite meet my needs, could anyone provide some help?(If you haven’t heard of my database, you can use PostgreSQL to demo instead. Thank You!)
I would keep the query you have and then add a second query to fetch those usernames with their activity which appear once only. You can combine the two queries with UNION ALL
.
Use a GROUP BY
clause with HAVING
to select only those usernames occuring only once:
-- your existing query...
UNION ALL
SELECT username, MAX(activity)
FROM activities
GROUP BY username
HAVING COUNT(*) = 1;
Note: It might be faster to use ROW_NUMBER
rather than DENSE_RANK
in your query.
Both will produce the expected result for your sample data:
Username | Penultimate_Act |
---|---|
Ace | Singing |
Alice | Swimming |
Lion | Fishing |
See the demo.