Tags: python, amazon-web-services, aws-lambda, boto3, amazon-bedrock

Processing PDF documents using Bedrock Runtime


I am trying to use the AWS Bedrock Runtime for document understanding with the model amazon.nova-premier-v1:0.

Here is my current code:

import boto3

# Model ID as stated above
MODEL_ID = "amazon.nova-premier-v1:0"

def process():
    with open("file.pdf", "rb") as f:
        document_bytes = f.read()
    bedrock_client = boto3.client("bedrock-runtime", "us-east-1")
    response = bedrock_client.converse(
        modelId=MODEL_ID,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "text": "explain this document",
                        "document": {
                            "format": "pdf",
                            "name": "file",
                            "source": {
                                "bytes": document_bytes,
                            },
                        },
                    },
                ],
            },
        ],
    )

    print(response)

    return response

Here is the error I am encountering:

botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid number of parameters set for tagged union structure messages[0].content[0]. Can only set one of the following keys: text, image, document, video, toolUse, toolResult, guardContent, cachePoint, reasoningContent.

The code works fine if I remove messages[0].content[0].document, but according to the error message, 'document' is one of the keys allowed in the tagged union.

I looked at the docs, and the example there shows 'document' being used this way, similar to what I wrote. Can someone explain why it still throws this error?


Solution

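  • The content parameter is an array of content blocks, and each block is a tagged union that can set only one key. Your code sets both text and document in the same block, which is what the validation error reports. Pass the document as one element of the array and the text as another, something like this: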

    messages=[
        {
            "role": "user",
            "content": [
                {
                    "document": {
                        "format": "pdf",
                        "name": "file",
                        "source": {
                            "bytes": document_bytes,
                        },
                    },
                },
                {
                    "text": "explain this document",
                },
            ],
        },
    ],
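
    For completeness, here is a minimal end-to-end sketch of the same idea. The helper names build_document_message and explain_pdf are hypothetical (they are not part of boto3), and the region and prompt are taken from the question. The point is only that every content block carries exactly one key.

```python
from pathlib import Path


def build_document_message(document_bytes, prompt, name="file", fmt="pdf"):
    """Build a user message with the document and the prompt as separate blocks.

    Hypothetical helper: each dict in "content" sets exactly one key,
    which is what the tagged-union validation requires.
    """
    return {
        "role": "user",
        "content": [
            # Block 1: only the "document" key.
            {
                "document": {
                    "format": fmt,
                    "name": name,
                    "source": {"bytes": document_bytes},
                },
            },
            # Block 2: only the "text" key.
            {"text": prompt},
        ],
    }


def explain_pdf(path, model_id, region="us-east-1"):
    """Send a PDF through the Converse API and return the raw response.

    Hypothetical helper; assumes AWS credentials and model access are set up.
    """
    # Imported here so the message builder can be used without boto3 installed.
    import boto3

    client = boto3.client("bedrock-runtime", region_name=region)
    message = build_document_message(
        Path(path).read_bytes(), "explain this document"
    )
    return client.converse(modelId=model_id, messages=[message])
```

    You would call it as explain_pdf("file.pdf", "amazon.nova-premier-v1:0"). In a Converse response the generated text is normally found under response["output"]["message"]["content"], but check the actual response for your model before relying on a specific index.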