I am trying to generate vector embeddings at scale for documents.
I use BedrockEmbeddings from langchain_community:

```python
from langchain_community.embeddings import BedrockEmbeddings

embeddings = BedrockEmbeddings(
    credentials_profile_name="default",
    region_name="us-east-1",
    model_id="amazon.titan-embed-text-v2:0",
)
```

and embed the documents in batches using `embeddings.embed_documents(batch)`. This works, but some calls fail with a timeout:

```
botocore.errorfactory.ModelTimeoutException: An error occurred (ModelTimeoutException) when calling the InvokeModel operation: Model has timed out in processing the request. Try your request again.

  File "...\site-packages\langchain_community\embeddings\bedrock.py", line 150, in _embedding_func
    raise ValueError(f"Error raised by inference endpoint: {e}")
ValueError: Error raised by inference endpoint: An error occurred (ModelTimeoutException) when calling the InvokeModel operation: Model has timed out in processing the request. Try your request again.
```
Does anyone have any pointers, or know whether BedrockEmbeddings provides any built-in way to retry after a timeout error?
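In the meantime I have been working around it with a plain retry loop around the call. A minimal sketch, assuming the wrapped ValueError message is the only reliable way to spot the timeout (the attempt count and backoff values are arbitrary choices of mine, not anything built into BedrockEmbeddings):

```python
import time

from botocore.exceptions import ClientError


def embed_with_retry(embeddings, batch, max_attempts=3, base_delay=2.0):
    """Retry embed_documents when Bedrock reports a model timeout."""
    for attempt in range(1, max_attempts + 1):
        try:
            return embeddings.embed_documents(batch)
        except (ClientError, ValueError) as exc:
            # langchain_community wraps the botocore ModelTimeoutException in a
            # ValueError (see the traceback above), so match on the message text.
            if "ModelTimeoutException" not in str(exc) or attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
```

This only retries the timed-out batch; any other error is re-raised immediately.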
Figured it out! Apparently, embedding models do not like long runs of spaces or newlines (\n) in the text. If they are present in your documents, they will negatively impact the embedding or simply stall the request. I removed the extra whitespace from the text, and it worked!
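For reference, something along these lines is enough to strip them out before batching; a minimal sketch that collapses newlines and runs of spaces with a regex (adjust the normalization to whatever your documents need):

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse newlines and repeated spaces into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


cleaned_batch = [normalize_whitespace(doc) for doc in batch]
vectors = embeddings.embed_documents(cleaned_batch)
```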
On a side note, it was VS Code that split the sentences across lines to make the text easier to read and work with when I prepped it, and that's how the \n characters got introduced into my texts.