I am implementing a single-table design in DynamoDB and overloading keys. My current design allows an email to be subscribed to a thread.
This is a NoSQL Workbench screenshot:
I am using `EM#<email>` as the partition key and `SB#<thread id>` as the sort key. I am constructing a `PutItemCommand` from a Node.js Lambda environment, and the command works as expected.
Here is the payload:
    new PutItemCommand({
      TableName: 'sometable',
      Item: {
        pk: {
          S: asEmail(email), //Resolves to `EM${email}`
        },
        sk: {
          S: asSubscription(chain), //Resolves to `SB${chain}`
        },
      },
      ConditionExpression: 'attribute_not_exists(pk)',
    }),
Now I am just confused about why this works. I am trying to ensure that the primary key (pk, sk) is unique so that an email cannot be subscribed to a thread twice. But I am confused why `ConditionExpression: 'attribute_not_exists(pk)'` correctly accomplishes this. Reading this condition expression makes me believe that it is checking that no partition key matches. Is 'pk' an alias, or does this have something to do with how DynamoDB retrieves data? I just need someone to spell this out for me.
From AWS's documentation on condition expressions:
The following example uses attribute_not_exists() to check whether the primary key exists in the table before attempting the write operation.
Note
If your primary key consists of both a partition key (pk) and a sort key (sk), the parameter will check whether `attribute_not_exists(pk)` AND `attribute_not_exists(sk)` evaluate to true or false before attempting the write operation....
--condition-expression "attribute_not_exists(Id)"
Note that if your table has a partition key and a sort key, then both are required for each item, and together they uniquely identify each item. That means that you can't have a duplicate (pk, sk) by definition. If you try to put a new item with the same (pk, sk) as an existing one (without the condition expression), you'll just overwrite the old one.
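The overwrite behavior can be sketched with a toy in-memory model. This is not the DynamoDB API: `table`, `keyOf`, and `putItem` are hypothetical names, and a `Map` keyed by (pk, sk) merely stands in for the table.

```javascript
// Toy in-memory model of a table with a composite primary key.
// This is NOT the DynamoDB API; every name here is hypothetical.
const table = new Map();

// Build a single lookup key from the partition key and the sort key.
const keyOf = (item) => JSON.stringify([item.pk, item.sk]);

// Unconditional put: silently replaces whatever is stored at (pk, sk).
function putItem(item) {
  table.set(keyOf(item), item);
}

putItem({ pk: 'EM#test2@testing.com', sk: 'SUB#900', note: 'first' });
putItem({ pk: 'EM#test2@testing.com', sk: 'SUB#900', note: 'second' });

console.log(table.size); // 1
console.log(table.get(keyOf({ pk: 'EM#test2@testing.com', sk: 'SUB#900' })).note); // second
```

The second put does not create a duplicate; it replaces the first item, which is why a condition is needed to reject it instead.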
This counterintuitive behavior of `attribute_not_exists(pk)` comes from the fact that (pk, sk) is the item's identifier. Imagine you tried to add an item at (`EM#test2@testing.com`, `SUB#900`), which doesn't exist yet. DynamoDB will look up that item and ask, "does it have a pk attribute?" It doesn't (since the item doesn't exist), so the put will succeed. If you try to put it a second time, the pk attribute will exist, and so the second put will fail.
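As a rough illustration of that lookup-then-check sequence, here is a minimal in-memory sketch. It only mimics the semantics described above and is not how DynamoDB is implemented; `items`, `itemKey`, and `putIfAttributeNotExists` are hypothetical names.

```javascript
// Toy in-memory model; not the DynamoDB API. All names are hypothetical.
const items = new Map();
const itemKey = (pk, sk) => JSON.stringify([pk, sk]);

// Mimics a put with ConditionExpression: 'attribute_not_exists(<attr>)'.
function putIfAttributeNotExists(item, attr) {
  // Step 1: look up the existing item by its full (pk, sk) key.
  const existing = items.get(itemKey(item.pk, item.sk));
  // Step 2: evaluate the condition against that one item only.
  // A missing item has no attributes at all, so the condition holds.
  const conditionHolds = existing === undefined || !(attr in existing);
  if (!conditionHolds) {
    throw new Error('ConditionalCheckFailed');
  }
  items.set(itemKey(item.pk, item.sk), item);
}

const sub = { pk: 'EM#test2@testing.com', sk: 'SUB#900' };
putIfAttributeNotExists(sub, 'pk'); // succeeds: nothing stored at this key yet

let secondPutFailed = false;
try {
  putIfAttributeNotExists(sub, 'pk'); // same (pk, sk): the item now has a pk
} catch (err) {
  secondPutFailed = true;
}

// Same email, different thread: a different item, so the put succeeds.
putIfAttributeNotExists({ pk: sub.pk, sk: 'SUB#901' }, 'pk');

console.log(secondPutFailed, items.size); // true 2
```

The condition never scans other items that share the partition key, which is why the same email can still subscribe to a different thread.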
Another way of looking at this is that since each item must have both a pk and an sk, `attribute_not_exists(pk) == attribute_not_exists(sk)` (either both exist or neither does), so checking for one is equivalent to checking for both.
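For the Lambda in the question, a rejected duplicate surfaces as an error whose `name` is `ConditionalCheckFailedException` in the AWS SDK for JavaScript v3. The sketch below shows one way to turn that into a boolean result. It rests on a few assumptions: `buildSubscribeInput` and `subscribe` are hypothetical helpers, `send` is a hypothetical callback such as `(input) => client.send(new PutItemCommand(input))`, and the `EM#`/`SB#` prefixes follow the key format described in the question's text.

```javascript
// Builds the same PutItem input as in the question.
// Assumption: keys use the "EM#" / "SB#" prefixes described in the text.
function buildSubscribeInput(email, chain) {
  return {
    TableName: 'sometable',
    Item: {
      pk: { S: `EM#${email}` },
      sk: { S: `SB#${chain}` },
    },
    ConditionExpression: 'attribute_not_exists(pk)',
  };
}

// `send` is a hypothetical callback that performs the write, e.g.
//   (input) => client.send(new PutItemCommand(input))
// Resolves to true if newly subscribed, false if already subscribed.
async function subscribe(send, email, chain) {
  try {
    await send(buildSubscribeInput(email, chain));
    return true;
  } catch (err) {
    if (err.name === 'ConditionalCheckFailedException') {
      return false; // an item already exists at this (pk, sk)
    }
    throw err; // anything else is a genuine failure
  }
}
```

Injecting `send` keeps the sketch independent of any client setup; in a real handler you would pass a function that wraps your own `DynamoDBClient`.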