I have this simple table:
mysql> select deviceId, eventName, loggedAt from foo;
+----------+------------+---------------------+
| deviceId | eventName  | loggedAt            |
+----------+------------+---------------------+
|        1 | foo        | 2020-09-18 21:27:21 |
|        1 | bar        | 2020-09-18 21:27:26 |
|        1 | last event | 2020-09-18 21:27:43 | <--
|        2 | xyz        | 2020-09-18 21:27:37 |
|        2 | last event | 2020-09-18 21:27:55 | <--
|        3 | last one   | 2020-09-18 21:28:04 | <--
+----------+------------+---------------------+
and I want to select one row per deviceId with the most recent loggedAt. I've marked those rows with an arrow in the table above for clarity.
If I append group by deviceId to the above query, I get the notorious error:
Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'foo.eventName' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
and I don't want to change the sql_mode.
I've come pretty close to what I want using:
select deviceId, any_value(eventName), max(loggedAt) from foo group by deviceId;
but obviously any_value returns an arbitrary result.
How can I solve this?
ONLY_FULL_GROUP_BY is a good thing: it enforces fundamental SQL standard rules, about which MySQL has been lax for a long time. Even if you disabled it, you would get the same result as you are getting with any_value().
You have a top-1-per-group problem, where you want the entire row that has the most recent date for each device. Aggregation is not the right tool for that; what you need is to filter the dataset.
One option uses a correlated subquery:
select f.*
from foo f
where f.loggedat = (
select max(f1.loggedat) from foo f1 where f1.deviceid = f.deviceid
)
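To check the correlated-subquery idea against the sample data from the question, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for MySQL, which is an assumption made only so the example is self-contained; the timestamps are stored as ISO-formatted text, which sorts the same way the dates do.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL here so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("create table foo (deviceId integer, eventName text, loggedAt text)")
conn.executemany(
    "insert into foo values (?, ?, ?)",
    [
        (1, "foo", "2020-09-18 21:27:21"),
        (1, "bar", "2020-09-18 21:27:26"),
        (1, "last event", "2020-09-18 21:27:43"),
        (2, "xyz", "2020-09-18 21:27:37"),
        (2, "last event", "2020-09-18 21:27:55"),
        (3, "last one", "2020-09-18 21:28:04"),
    ],
)

# Keep only the rows whose loggedAt equals the latest loggedAt of their device.
latest = conn.execute(
    """
    select f.deviceId, f.eventName, f.loggedAt
    from foo f
    where f.loggedAt = (
        select max(f1.loggedAt) from foo f1 where f1.deviceId = f.deviceId
    )
    order by f.deviceId
    """
).fetchall()

for row in latest:
    print(row)
```

This should print the three rows marked with an arrow in the question. Note that if two rows of the same device share the same latest loggedAt, this filter returns both of them.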
In MySQL 8.0, you can also use row_number():
select *
from (
select f.*, row_number() over(partition by deviceid order by loggedat desc) rn
from foo f
) f
where rn = 1