I have a database table with the attributes segment_id, beat_id, and patient_id.
In DynamoDB version 2, when I do a scan with the following command, I can only get values for one specific patient. When I input other segment and patient values, I get a ProvisionedThroughputExceededException.
table.scan(segment_id__eq='xCrKYvnfZlm6VCQ',beat_id__gt=1,patient_id__eq='3854520.edf')
The scan you are performing reads every item in the DynamoDB table and returns it only if it meets the specified conditions (segment_id__eq='xCrKYvnfZlm6VCQ', beat_id__gt=1, patient_id__eq='3854520.edf'). Each read consumes your provisioned read capacity, even if the item does not meet the conditions. If you are looking to retrieve a single record, it is most efficient to use the GetItem or BatchGetItem calls, because you only consume read capacity for the specified items. If you are looking to retrieve a specific range of records, it is more efficient to use a range key, or a global or local secondary index, so that you can Query the items. A Query only consumes read capacity for the items that meet the query criteria. Could you please provide more information about the table schema?
See this developer guide that describes the differences between scan and query in detail.
As an example of using a query, suppose segment_id were the hash key and beat_id the range key. You could then query all records with a specified segment_id and a specified beat_id range. This only consumes the read capacity required to retrieve those specific records, rather than reading the entire table. Additionally, you can apply a query filter to other attributes like patient_id so only the results you want are returned. Note that the filter is applied after the items are read, so items matching the key conditions still consume read capacity even if the filter discards them.
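To make the query case concrete, here is a minimal sketch in the same boto-style syntax as the scan in the question. It assumes a schema that has not been confirmed: segment_id as the hash key and beat_id as the range key. It also assumes your boto version provides Table.query_2 with the query_filter argument. The function name query_segment_beats and the table name in the comment are hypothetical.

```python
def query_segment_beats(table, segment_id, min_beat_id, patient_id):
    """Query one segment's beats above min_beat_id, keeping one patient.

    `table` is assumed to be a boto.dynamodb2.table.Table whose hash key
    is segment_id and whose range key is beat_id (an assumed schema).
    """
    results = table.query_2(
        # Key conditions: only items under this hash key and within this
        # range-key condition are read, so only they consume capacity.
        segment_id__eq=segment_id,
        beat_id__gt=min_beat_id,
        # Applied after the read: trims the results, not the capacity used.
        query_filter={'patient_id__eq': patient_id},
    )
    return list(results)

# Hypothetical usage (the table name 'beats' is made up):
#   from boto.dynamodb2.table import Table
#   rows = query_segment_beats(Table('beats'), 'xCrKYvnfZlm6VCQ', 1, '3854520.edf')
```

If patient_id is something you filter on often, a secondary index on it may be a better fit than a query filter, but that depends on the actual schema.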
More details on scan/query consumed capacity:
Query and scan both use eventually consistent reads by default, so one read capacity unit lets you read up to 8KB per second (two 4KB reads).
If you still experience throttling, here are some ways to mitigate the exception:
More details on scan pricing:
To figure out how much read capacity you need to use Scan or Query to read items in your table:
To figure out how much read capacity you need to use GetItem or BatchGetItem to read items in your table:
As an example, suppose I have 10 items in my table, they are all 1KB, and I am planning to retrieve them all with eventually consistent operations. If I retrieve them with GetItem, each item is rounded up to 4KB on its own and consumes 1/2 of a read capacity unit, so the total cost will be 1/2 * 10 = 5 read capacity units. If I retrieve them with scan, the sizes are added together before rounding: the total size of all items combined is 10KB, which will consume at most 2 read capacity units.
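The arithmetic in this example can be sketched as a small calculation. It assumes 4KB read units and that an eventually consistent read costs half a unit, which matches the 8KB per unit figure above. The function names are made up for illustration and are not part of any AWS library.

```python
import math

READ_UNIT_KB = 4  # assumed size covered by one strongly consistent read


def read_units(size_kb, eventually_consistent=True):
    """Units for one read of size_kb, rounded up to the next 4KB."""
    units = math.ceil(size_kb / READ_UNIT_KB)
    return units / 2 if eventually_consistent else units


def get_item_units(item_sizes_kb, eventually_consistent=True):
    """GetItem rounds each item up separately, then sums."""
    return sum(read_units(s, eventually_consistent) for s in item_sizes_kb)


def scan_units(item_sizes_kb, eventually_consistent=True):
    """Scan sums the item sizes first, then rounds up once."""
    return read_units(sum(item_sizes_kb), eventually_consistent)


sizes = [1] * 10                      # ten 1KB items
get_item_total = get_item_units(sizes)  # 5.0
scan_total = scan_units(sizes)          # 1.5
scan_provisioned = math.ceil(scan_total)  # 2 whole units to provision
```

Under these assumptions the scan works out to 1.5 units, which means provisioning 2 whole units to read the 10KB in one second, compared with 5 for the individual GetItem calls.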