I have a table with the following columns
UserID (index), SID (index), Cause (index), TimeValue (index), Amount
I want to query the cause of a specific user id, Meaning I want some cause
which is equal to let's say water
but I want this for only one user whose id is 'someId'
The query that I have right now is doing a full table scan
SELECT Cause, Cause_Amount, UserID FROM Contribution WHERE UserID = 'u5JvslEo9DbQ7hcq4vkM74dWlxr2' AND TimeValue > 1620414948000 AND ( Cause = 'cleanAir' OR Cause = 'safeWater')
So the approach should be that it should target userId with the given id then check the TimeValue then the cause and return the result.
I hope that makes sense
Generally speaking, you should put an index on your highest-cardinality document property. That means indexing on the property that is the most unique among your data. So if you have 10 users in your system but 1,000,000 documents in your Contribution
table, that will need to scan 100,000 documents.
If you are seeing high read-IOs it is like because you have a small number of users but a large number of Contributions. It seems unlikely that Cause
would be high-cardinality and QLDB does not yet (9/8/2021) support range queries and so the range query on TimeValue
probably won't help either.
Are there other properties on your document that you could create an index on with higher cardinality?