Tags: amazon-web-services, amazon-dynamodb, boto3, dynamo-local

Create a GSI on an attribute whose value is a Set


I want to create a simple DynamoDB table called reminders, which at the moment has 3 columns:

  1. reminder_id: This is the hash key.
  2. reminder_tag: I want a global secondary index on this field, but I also want this attribute to have the Set data type, because a reminder can have multiple tags.
  3. reminder_title: I also want a global secondary index on this field. This will be a string field.

I checked the documentation at https://boto3.amazonaws.com/v1/documentation/api/latest/reference/customizations/dynamodb.html#valid-dynamodb-types to see which data types are available in Boto3.

So I have come up with this script:

import boto3


def create_reminders_table():
    """Just create the reminders table."""
    session = boto3.session.Session(profile_name='dynamo_local')
    dynamodb = session.resource('dynamodb', endpoint_url="http://localhost:8000")
    table = dynamodb.create_table(
        TableName='Reminders',
        KeySchema=[
            {
                'AttributeName': 'reminder_id',
                'KeyType': 'HASH'
            }
        ],
        AttributeDefinitions=[
            {
                'AttributeName': 'reminder_id',
                'AttributeType': 'S'
            },
            {
                'AttributeName': 'reminder_tag',
                'AttributeType': 'SS'
            },
            {
                'AttributeName': 'reminder_title',
                'AttributeType': 'S'
            }
        ],
        GlobalSecondaryIndexes=[
            {
                'IndexName': 'ReminderTagGsi',
                'KeySchema': [
                    {
                        'AttributeName': 'reminder_tag',
                        'KeyType': 'HASH'
                    }
                ],
                'Projection': {
                    'ProjectionType': 'INCLUDE',
                    'NonKeyAttributes': [
                        'reminder_title'
                    ]
                }
            },
            {
                'IndexName': 'ReminderTitleGsi',
                'KeySchema': [
                    {
                        'AttributeName': 'reminder_title',
                        'KeyType': 'HASH'
                    }
                ],
                'Projection': {
                    'ProjectionType': 'KEYS_ONLY'
                }
            }
        ],
        BillingMode='PAY_PER_REQUEST'
    )
    return table


if __name__ == '__main__':
    reminders_table = create_reminders_table()
    print("Table status:", reminders_table.table_status)

But when I run this, I get the error below:

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateTable operation: Member must satisfy enum value set: [B, N, S]

I searched and came across this question from someone with the same issue: https://forums.aws.amazon.com/thread.jspa?messageID=613970

Can someone please help me with this? The solution suggested there, not providing a data type, does not work either.

Also, is it possible to have an index on an attribute whose value is a Set? I want to let the user search for reminders by tag, and for that I need a set.

I would appreciate any help with this.


Solution

  • Is it possible to have an index on an attribute which is of value Set?

    No. As the CreateTable docs say, "the attributes in KeySchema must also be defined in the AttributeDefinitions", with a data type of (S)tring, (N)umber or (B)inary.
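    As a rough illustration of that rule, the sketch below builds create_table arguments that declare only scalar key attributes and drops the GSI on reminder_tag. The helper names (build_reminders_table_params, invalid_key_attributes) are made up for this example; reminder_tag can still be stored as a set on each item, since non-key attributes need no declaration.

```python
# Attribute types accepted for key attributes, per the ValidationException above.
KEY_ATTRIBUTE_TYPES = {"S", "N", "B"}


def build_reminders_table_params():
    """Return create_table keyword arguments that use only scalar key attributes."""
    return {
        "TableName": "Reminders",
        "KeySchema": [{"AttributeName": "reminder_id", "KeyType": "HASH"}],
        # Only attributes that appear in a key schema are declared here.
        # reminder_tag is left out: it stays a plain, unindexed set attribute.
        "AttributeDefinitions": [
            {"AttributeName": "reminder_id", "AttributeType": "S"},
            {"AttributeName": "reminder_title", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "ReminderTitleGsi",
                "KeySchema": [
                    {"AttributeName": "reminder_title", "KeyType": "HASH"}
                ],
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }


def invalid_key_attributes(params):
    """Return the names of declared attributes whose type is not S, N or B."""
    return [
        definition["AttributeName"]
        for definition in params["AttributeDefinitions"]
        if definition["AttributeType"] not in KEY_ATTRIBUTE_TYPES
    ]


params = build_reminders_table_params()
# An empty list means every declared attribute has an acceptable key type.
print(invalid_key_attributes(params))
# With DynamoDB Local running, the table could then be created with:
#     dynamodb.create_table(**params)
```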

    "enable the user to search for reminders with a tag, and for doing that I need to have a set."

    A DynamoDB workaround for one-to-many relations is a composite sort key, as in urgent#work. That would only be sensible for a small, fixed number of tags, though.
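    A minimal sketch of such a composite key, assuming the tags are simply sorted and joined with "#" (the function name make_tag_key is hypothetical). It also hints at why this only suits a small, fixed tag set: a prefix match such as begins_with can only find the tag that sorts first.

```python
def make_tag_key(tags):
    """Join a reminder's tags into one string, sorted so the order is stable."""
    return "#".join(sorted(tags))


tag_key = make_tag_key({"work", "urgent"})
print(tag_key)  # urgent#work

# A prefix match finds the leading tag but not the later ones.
print(tag_key.startswith("urgent"))  # True
print(tag_key.startswith("work"))    # False
```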

    Your least-bad option is to query by user (perhaps narrowing further with some sort key), then filter the results by tag membership outside DynamoDB. (N.B. The IN operator in a Query's FilterExpression compares an attribute against a list of candidate values rather than testing whether a set contains a value, so it's of no use to you here.)
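    The query-then-filter approach might look roughly like this. It assumes a table whose partition key is a user_id attribute (not part of the original schema) and that items come back from a boto3 Table resource, which returns string sets as Python sets. The function names are illustrative only.

```python
def query_all_for_user(table, user_id):
    """Collect every item for one user, following pagination.

    `table` is expected to be a boto3 DynamoDB Table resource whose
    partition key is `user_id` (an assumption for this sketch).
    """
    kwargs = {
        "KeyConditionExpression": "user_id = :u",
        "ExpressionAttributeValues": {":u": user_id},
    }
    items = []
    while True:
        page = table.query(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        # More results remain: continue from where the last page stopped.
        kwargs["ExclusiveStartKey"] = last_key


def with_tag(items, tag):
    """Keep only the items whose reminder_tag set contains `tag`."""
    return [item for item in items if tag in item.get("reminder_tag", set())]


# In-memory items shaped like the ones a Table resource would return.
sample_items = [
    {"reminder_title": "Send report", "reminder_tag": {"work", "urgent"}},
    {"reminder_title": "Buy milk", "reminder_tag": {"home"}},
    {"reminder_title": "Untagged"},
]
print([item["reminder_title"] for item in with_tag(sample_items, "work")])
```

    Against a real table this would be called as with_tag(query_all_for_user(table, "Zaphod"), "work").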

    "I want to have a global secondary index on reminder_title"

    reminder_title is a poor candidate for a primary key. A table's primary key must ensure per-record uniqueness (the key of a global secondary index need not), and a title would likely not. You probably need a combination of 3 elements, user_id, request_id and title, to ensure key uniqueness across records.

    Consider a composite primary key with, say, user_id for the Partition Key (= HASH) and a compound sort key in a new column (SK) that concatenates title#request_id. You would then search by-user-by-title with:

    user_id="Zaphod" AND begins_with(SK, "exercise")
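    A small sketch of that key design, assuming the sort-key column is literally named SK and the table's partition key is user_id. The helper names make_sort_key and title_query_kwargs are invented for this example; the resulting keyword arguments are meant to be passed to a boto3 Table resource's query method.

```python
def make_sort_key(title, request_id):
    """Build the compound sort key stored in the SK column: title#request_id."""
    return f"{title}#{request_id}"


def title_query_kwargs(user_id, title_prefix):
    """Build query arguments for one user's reminders whose title starts with a prefix."""
    return {
        "KeyConditionExpression": "user_id = :u AND begins_with(SK, :p)",
        "ExpressionAttributeValues": {":u": user_id, ":p": title_prefix},
    }


# The item that would be written for a reminder titled "exercise".
item = {
    "user_id": "Zaphod",
    "SK": make_sort_key("exercise", "42"),
    "reminder_tag": {"health"},
}
print(item["SK"])  # exercise#42

# With a real table: table.query(**title_query_kwargs("Zaphod", "exercise"))
print(title_query_kwargs("Zaphod", "exercise"))
```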