I have a DDB table with the following structure:
CustmerID (PK) | emailId (SortKey) | orderId (LSI) |
---|
I want to get batch items using AWS DynamoDB Java SDK for the following 2 patterns:
The first access pattern I was able to implement using
ReadBatch.Builder<Customer> customerBuilder = ReadBatch.builder(Customer.class)
.mappedTableResource(customerTable);
emails.forEach(email -> {
Customer key = new Customer();
key.setCustomerId(custId);
key.setEmail(email);
customerBuilder.addGetItem(key);
});
BatchGetItemEnhancedRequest batchGetItemEnhancedRequest =
BatchGetItemEnhancedRequest.builder()
.addReadBatch(customerBuilder.build())
.build();
When I am trying to perform the same operation using the LSI sort key i.e. orderId, the request fails with a 400 error.
slight change that I made for orderId, I couldn't find any API to set the LSI index name though.
orderIds.forEach(orderId -> {
Customer key = new Customer();
key.setCustomerId(custId);
key.setOrderId(orderId);
customerBuilder.addGetItem(key);
});
BatchGetItemEnhancedRequest batchGetItemEnhancedRequest =
BatchGetItemEnhancedRequest.builder()
.addReadBatch(customerBuilder.build())
.build();
Customer table bean
@DynamoDbBean
@Getter
@Setter
public class Customer {
private String customerId;
private String emailId;
private String orderId;
@DynamoDbPartitionKey
@DynamoDbAttribute("customerId")
public String getCustomerId() {
return customerId;
}
@DynamoDbSortKey
@DynamoDbAttribute("emailId")
public String getEmailId() {
return emailId;
}
@DynamoDbSecondarySortKey(indexNames = {"orderId-index"})
public String getOrderId() {
return orderId;
}
}
There are 2 approaches I could think of:
use PartiQL
SELECT * FROM "tableName"."lsiIndexName" WHERE "customerId" = ? and "orderId" in (?, ?, ?); // batch of 50
use filter expression by performing Query on primary index instead of LSI
String keyConditionExpression = "#customerId" + " = :partitionKeyVal";
String filterExpression = "#orderId" + " IN (:value1, :value2, :value, :value)";
// can't query against LSI unfortunately since the SDK doesn't allow filter expressions with multiple sort keys
QueryRequest queryRequest = QueryRequest.builder()
.tableName("customer")
.keyConditionExpression(keyConditionExpression)
.filterExpression(filterExpression)
.expressionAttributeNames(expressionAttributeNames)
.expressionAttributeValues(expressionAttributeValues)
.build();
Cost and performance wise what's the appropriate way to perform batch retrieval by LSI SKs considering that I could have more than 20k items per partition?
Firstly, note that BatchGetItem
just counts as multiple GetItem
, so that there is no cost savings here. The only savings is that you do it all in one HTTP call, so latency will be better than firing individual requests.
So, cost-wise, rather than the option you proposed where you get the entire partition and filter, it is actually more cost-efficient to loop through all the PK (Partition key and LSI combination) and fire multiple GetItem
. Unfortunately, there is no BatchGetItem
for LSI.
In any case, the inability to perform BatchGetItem
using LSI makes it clear that LSI is meant for single-item retrieval. If you really need batch retrievals, such as when firing individual requests results in too much HTTP overhead, separate into a second table instead. That is, create another table having the orders-related information with the PK being CustomerId and OrderId. Surely, if the original table PK is CustomerId and Email, OrderId cannot be an LSI? I mean just how many "emails" can an "order" have!