amazon-web-servicesdatabase-designamazon-dynamodbnosqldynamodb-queries

DynamoDB Index Design for Finding Items Where an Attribute Contains a Specific Value


Problem

I am currently in the process of designing the database for a software application which will be used by multiple organizations to each manage multiple projects. I am planning on using DynamoDB for the database. The question I have centres around one query I need to perform on the "Projects" table. The queries I will need to perform are as follows:

  1. Find Projects by organization_id.
  2. Find Project by id.
  3. Find Projects by status.
  4. Find Projects by members.

Note: Each Project entity will have an organization_id, id, status, and array of members (user id's).

The problem I am trying to solve is with query #4. Each project can have multiple members, and when a user logs into their dashboard, I want to fetch and display only the projects which that user is a part of. The issue is that the "members" field in a project entity is an array with user id's, which means the "members" of one project could be the same as another project, meaning it won't guarantee uniqueness.

One of the indexes I will use for this table will use the organization_id as the Partition Key and the id as the Sort Key. I would like to do something similar for the index I will use for query #4, but since "members" is not unique within its partition, I cannot use it as the Sort Key.

Possible Solutions

1 - User to Project Table

This solution would require me to create an extra table. This table would relate each user id to an array of project id's. When a user loads their dashboard, I could fetch their user entry for this table and do a Batch Item request and get all the projects by their id's in the array.

Pros:

Cons:

2 - Query with Filter Expression

For this solution, I would do a Query with organization_id as the Partition Key with a Filter set to find items where "members" CONTAINS the user id I am looking for.

Pros:

Cons:

3 - Other

I am hoping there exists another solution where I can perform a single request and get all the data I need. I see a Z-index as a possible solution but haven't fully figured out the best way to make use of it.

Any Help Appreciated

Does anyone have experience with a problem like this? I know this is a common data model so I imagine there must be a good way to do this, but I have not been unable to find a way that seems optimal. All comments are welcome. Thanks in advance!


Solution

  • Typically you'd map that like the following:

    PK SK Data OrgId
    project123 project123 all project data org0001
    project123 user001 any specific user data for this project
    project123 user002 any specific user data for this project
    project123 user099 any specific user data for this project
    project999 project999 all project data org0346

    Then you can create a GSI with SK as the partition key and that would give you all projects related to a user. Then you can do a BatchGetItem to obtain all the project metadata from the base table.

    An alternative approach is to overload your GSI so that you only have to make one index to supply you both with all projects for a given user or org.

    PK SK Data GSI1PK
    project123 project123 all project data org0001
    project123 user001 any specific user data for this project user001
    project123 user002 any specific user data for this project user002
    project123 user099 any specific user data for this project user099
    project999 project999 all project data org0346

    With this data model you can use GSI1PK as your index partition key. And you can supply either a userId or OrgId as partition key value depending on your lookup.