I am using Single Table in DynamoDB and Group
and Thread
are stored in one table. One-to-many-relation. All threads are sorted by lastPostedAt
. lastPostedAt
is updated if a new comment or a new suggestion is posted in Thread
. PK=Group#{id} SK=Thread#{id}. it is no problem to get all threads in a group. GIS PK=GroupID SK=Thread#lastPostedAt
Access Pattern:
Get all threads from more than one group. all threads are sorted by lastPostedAt
.
How should I model for that access pattern with GIS?
Is it ok with large partitation key and Filterexpression? GIS PK=Thread (without id) SK=lastPostedAt Filterexpression=GroupID In ('Group1','Group2','Group3')
With filter expression, there can be difficulties in pagination if 'Group4' has a large number of elements compared to 'Group1', 'Group2' and 'Group3'.
If a group has a much larger number of elements than the other groups, this can lead to imbalances in pagination. If, for example, 'Group4' contains 1000 elements and 'Group1', 'Group2' and 'Group3' contain only 100 elements, it can be difficult to achieve a even distribution of the pagination across all groups.
schema.graphql
type Group
{
id: ID!
users: [User]
admins: [User]
title: String
}
type Thread
{
id: ID!
groupId: ID!
term: String
owner: user
lastPostedAt: AWSDateTime
}
I think the right answer here is that DynamoDB is not a good choice for this use case. Consider using a different DB technology or a secondary storage that builds your sorted view.
That being said, using a static hash key “Thread” is a possible option but would introduce scaling limits, as a single partition in DybamoDB can generally support only up to 3000 RSU and 1000 WSU per second. If you are sure that you’ll never hit that limit then go ahead.
Keep in mind that filtering on groups will not reduce amount of records that will be processed by DynamoDB during query. It only impacts returned result, but you will still be charged for all records (and performance will also suffer)