node.jsdatabaseamazon-web-servicesamazon-qldbpartiql

Amazon qldb SELECT query to search specific rows with very less READIo's


I have a table with the following columns

UserID (index), SID (index), Cause (index), TimeValue (index), Amount

I want to query the cause of a specific user id, Meaning I want some cause which is equal to let's say water but I want this for only one user whose id is 'someId'

The query that I have right now is doing a full table scan

SELECT Cause, Cause_Amount, UserID FROM Contribution WHERE UserID = 'u5JvslEo9DbQ7hcq4vkM74dWlxr2' AND TimeValue > 1620414948000 AND ( Cause = 'cleanAir' OR Cause = 'safeWater')

So the approach should be that it should target userId with the given id then check the TimeValue then the cause and return the result.

I hope that makes sense


Solution

  • Generally speaking, you should put an index on your highest-cardinality document property. That means indexing on the property that is the most unique among your data. So if you have 10 users in your system but 1,000,000 documents in your Contribution table, that will need to scan 100,000 documents.

    If you are seeing high read-IOs it is like because you have a small number of users but a large number of Contributions. It seems unlikely that Cause would be high-cardinality and QLDB does not yet (9/8/2021) support range queries and so the range query on TimeValue probably won't help either.

    Are there other properties on your document that you could create an index on with higher cardinality?