I am retrieving documents with similar content so that I can modify them later, but when it came to updating, I realized I have no way to get the ID of the retrieved documents.
Here's the function I was using to retrieve documents:
def retrieve(text):
    """Retrieve information related to a query."""
    retrieved_docs = db.similarity_search(text, k=3)
    return retrieved_docs
This is how the document structure looks when printed:
Document(metadata={'projectName': ''}, page_content="")
And in Weaviate:
{
  "uuid": "",
  "metadata": {
    "creationTime": ""
  },
  "properties": {
    "projectName": "",
    "text": ""
  },
  "vectors": {
    "default": []
  }
}
And here’s the db configuration, in case it helps:
db = WeaviateVectorStore(
    client=client,
    index_name="Cases",
    embedding=embedding_model,
    text_key="text"
)
I haven't found any information about a similar case; everywhere I look, people are working with a uuid they already have. How do I get the uuid of a retrieved document? Or is there another way to achieve this? For example, I could save the information as a new object each time, but would that compromise accuracy over time?
I’d appreciate your response!
Duda Nogueira from Weaviate here!
Can you confirm you are using LangChain and not LlamaIndex?
For LangChain, you need to explicitly request the uuid with:
docs = db.similarity_search("traditional food", return_uuids=True)
print(docs[0].metadata.get("uuid"))
I just noticed this is an undocumented feature! I have added it here in our langchain recipes to make it visible:
https://github.com/weaviate/recipes/tree/main/integrations/llm-frameworks/langchain/loading-data
Let me know if this helps!
Edit: Langchain integration will only perform searches. If you want to fetch objects and sort them, this is how you can do it:
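from weaviate.classes.query import Sort, MetadataQuery

# Assumes `client` is the same Weaviate client passed to WeaviateVectorStore
# and that the collection is the "Cases" index from the question.
collection = client.collections.get("Cases")

# Fetch the 3 most recently updated objects, newest first.
query = collection.query.fetch_objects(
    sort=Sort.by_update_time(ascending=False),
    return_metadata=MetadataQuery(last_update_time=True),
    limit=3
)
for obj in query.objects:
    print("#")
    print(obj.uuid)
    print(obj.metadata.last_update_time)

# Update the last object, so it goes back to the top.
collection.data.update(uuid=query.objects[-1].uuid, properties={"text": "updated!"})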
This should be the output, and every time you run it, the last object will move to the top:
#
032502ec-2598-4941-aa58-9e576308cb9d
2025-03-06 14:31:42.654000+00:00
#
00351c67-a31f-4b90-8a76-198aadf2a1ca
2025-03-06 14:31:32.236000+00:00
#
014735a7-732f-41a5-8026-ab29395b88c3
2025-03-06 14:28:36.576000+00:00
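To tie this back to the original question, here is a minimal sketch of how the two pieces might be combined: search through LangChain with return_uuids=True, then update each hit through the Weaviate collection. The helper names collect_uuids and update_retrieved are hypothetical (they are not part of either library), and db and collection are assumed to be the vector store and the "Cases" collection from the snippets above.

```python
def collect_uuids(docs):
    """Return the uuid stored in each document's metadata.

    Documents without a "uuid" key (for example, results fetched
    without return_uuids=True) are skipped.
    """
    return [doc.metadata["uuid"] for doc in docs if "uuid" in doc.metadata]


def update_retrieved(collection, docs, new_properties):
    """Apply the same property update to every retrieved document.

    `collection` is assumed to expose data.update(uuid=..., properties=...),
    as the Weaviate collection in the example above does.
    Returns the list of uuids that were updated.
    """
    updated = []
    for uuid in collect_uuids(docs):
        collection.data.update(uuid=uuid, properties=new_properties)
        updated.append(uuid)
    return updated


# Hypothetical usage, assuming `db` and `collection` are set up as above:
#   docs = db.similarity_search("traditional food", k=3, return_uuids=True)
#   update_retrieved(collection, docs, {"projectName": "renamed"})
```

Updating in place this way keeps one object per document, so the "save as new each time" workaround from the question should not be needed.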
Thanks!