
DynamoDB pagination using Boto3


We are using boto3 with DynamoDB and need to do a full scan of our tables. Based on other posts, that requires pagination, but we have been unable to find a working pagination sample. Here is what we did.

import boto3
client_setting = boto3.client('dynamodb', region_name='ap-southeast-2')
paginator = client_setting.get_paginator('scan')
esk = {}
data = []
unconverted_ga = ourQuery(params1, params2)
for page in unconverted_ga:
    data.append(page)
    # The last page has no LastEvaluatedKey, so use .get() to avoid a KeyError
    esk = page.get('LastEvaluatedKey')

We don't know exactly how to pass esk as the ExclusiveStartKey of our next query. What is the expected value of the ExclusiveStartKey parameter? We are still new to DynamoDB and there are many things we need to learn, including this. Thanks!


Solution

  • After an hour of searching, I finally found a better solution. Anyone new to DynamoDB shouldn't miss this: http://docs.aws.amazon.com/amazondynamodb/latest/gettingstartedguide/GettingStarted.Python.04.html

    from __future__ import print_function # Python 2/3 compatibility
    import boto3
    import json
    import decimal
    from boto3.dynamodb.conditions import Key, Attr
    
    # Helper class to convert a DynamoDB item to JSON.
    class DecimalEncoder(json.JSONEncoder):
        def default(self, o):
            if isinstance(o, decimal.Decimal):
                if o % 1 > 0:
                    return float(o)
                else:
                    return int(o)
            return super(DecimalEncoder, self).default(o)
    
    dynamodb = boto3.resource('dynamodb', region_name='us-west-2', endpoint_url="http://localhost:8000")
    
    table = dynamodb.Table('Movies')
    
    # Attr is the conventional condition builder for a scan FilterExpression
    fe = Attr('year').between(1950, 1959)
    pe = "#yr, title, info.rating"
    # Expression Attribute Names for Projection Expression only.
    ean = { "#yr": "year", }
    esk = None
    
    
    response = table.scan(
        FilterExpression=fe,
        ProjectionExpression=pe,
        ExpressionAttributeNames=ean
        )
    
    for i in response['Items']:
        print(json.dumps(i, cls=DecimalEncoder))
    
    # As long as LastEvaluatedKey is in the response, there are more items to read,
    # so scan again, passing that key as ExclusiveStartKey.
    while 'LastEvaluatedKey' in response:
        response = table.scan(
            ProjectionExpression=pe,
            FilterExpression=fe,
            ExpressionAttributeNames= ean,
            ExclusiveStartKey=response['LastEvaluatedKey']
            )
    
        for i in response['Items']:
            print(json.dumps(i, cls=DecimalEncoder))
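
To tie this back to the question: the value for ExclusiveStartKey is simply the LastEvaluatedKey dict from the previous response, passed back unchanged. Below is a minimal sketch of two ways to wrap that up. The helper names scan_all_items and scan_with_paginator are made up for illustration and are not part of boto3. The first works with any callable that behaves like table.scan, and the second assumes a low-level client such as the one in the question, whose 'scan' paginator should handle ExclusiveStartKey on its own.

```python
def scan_all_items(scan, **scan_kwargs):
    """Yield every item from a scan, following LastEvaluatedKey manually.

    `scan` is any callable with the same contract as boto3's Table.scan
    (for example `table.scan`). This helper is hypothetical, not a boto3 API.
    """
    kwargs = dict(scan_kwargs)  # copy so the caller's arguments are not mutated
    while True:
        response = scan(**kwargs)
        for item in response.get('Items', []):
            yield item
        last_key = response.get('LastEvaluatedKey')
        if not last_key:
            break  # no LastEvaluatedKey means this was the final page
        # Resume the next scan right after the last key DynamoDB evaluated.
        kwargs['ExclusiveStartKey'] = last_key


def scan_with_paginator(client, table_name):
    """Collect every item using the low-level client's 'scan' paginator.

    `client` is assumed to be a boto3 DynamoDB client; the paginator feeds
    LastEvaluatedKey back in as ExclusiveStartKey between pages.
    """
    paginator = client.get_paginator('scan')
    items = []
    for page in paginator.paginate(TableName=table_name):
        items.extend(page['Items'])
    return items
```

With the objects from the code above, this could be called as scan_all_items(table.scan, FilterExpression=fe, ProjectionExpression=pe, ExpressionAttributeNames=ean), or as scan_with_paginator(client_setting, 'Movies') with the client from the question. Note that the low-level client returns items in DynamoDB's typed format (for example {'S': '...'}), while the Table resource returns plain Python values.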