sqlazure-cosmosdbazure-cosmosdb-sqlapi

How do I get the latest record for each item in CosmosDB using SQL


I have a schema which is similar to

"id": "uuid",
"deviceId": "uuid",
"message": {
    "content": "string",
    "ts": 1
},
"data": {
    "temperature": 21
}

I'd like to get the latest "data" (using message.ts as the timestamp) for each "deviceId".

So far, I've managed to get the data back, in order of timestamp using the query SELECT c.deviceId, c.message.ts, c.data FROM c ORDER BY c.message.ts DESC but I can't figure out how to remove the duplicate device records.

Is this possible to do within the CosmosDB SQL Engine?


Solution

  • Thanks to Mark Brown's comment, I found the following which seems to be the correct solution to this problem. Not as elegant as just using some SQL for a one-off but is really what was needed.

    https://learn.microsoft.com/en-us/samples/azure-samples/cosmosdb-materialized-views/real-time-view-cosomos-azure-functions/

    In essence, you create a Serverless Function which is triggered by the Cosmos change feed and updates a materialized view, which is essentially just a document with (in this case) the most up to date data per deviceId.

    Specifically for this case, it'll most likely update the corresponding device document with it's most recent data.