I'm using the LangChain library to store my company's information in a vector database. Queries return great results, but I need a way to recover where the information comes from, such as source: "www.site.com/about" or at least "document 156". Does anyone know how to do that?
EDIT: Currently, I'm using docsearch.similarity_search(query), which only returns the page_content; the metadata comes back empty.
I'm ingesting with this code, but I'm totally open to changing it:
    db = ElasticVectorSearch.from_documents(
        documents,
        embeddings,
        elasticsearch_url="http://localhost:9200",
        index_name="elastic-index",
    )
You can add metadata to each of those documents by setting document.metadata on each document to a dictionary. The dictionary could be something like {"source": "www.site.com/about"} or {"id": "456"}, to give some examples. Then, pass those documents to from_documents().
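As a minimal sketch of that idea, the snippet below wraps some text in Document objects and tags each one with a source and an id before ingestion. The `pages` list and the `build_documents` helper are hypothetical names made up for illustration, and the import path for Document is an assumption that may differ between LangChain versions, so a small stand-in class is used when the import fails.

```python
try:
    # Import path used by older LangChain releases; it may differ in yours.
    from langchain.schema import Document
except ImportError:
    # Stand-in with the same two fields so the sketch runs without LangChain.
    from dataclasses import dataclass, field

    @dataclass
    class Document:
        page_content: str
        metadata: dict = field(default_factory=dict)

# Hypothetical input: (source URL, text) pairs collected from the company site.
pages = [
    ("www.site.com/about", "Our company was founded in 2010."),
    ("www.site.com/contact", "You can reach us by email."),
]

def build_documents(pages):
    """Wrap each (source, text) pair in a Document tagged with its origin."""
    return [
        Document(page_content=text, metadata={"source": source, "id": str(i)})
        for i, (source, text) in enumerate(pages)
    ]

documents = build_documents(pages)
```

The resulting `documents` list can then be passed to from_documents() exactly as in the question, so the metadata is stored alongside each embedded text.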
Later, when you get a Document object back from one of the query methods, you can use document.metadata to get the metadata back.
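Reading the source back could look roughly like this. The `describe_sources` helper is a hypothetical name, and the `results` list here is built by hand as a stand-in for whatever similarity_search(query) returns, so the sketch runs without a vector store. As above, the Document import path is an assumption with a fallback class.

```python
try:
    # Import path used by older LangChain releases; it may differ in yours.
    from langchain.schema import Document
except ImportError:
    # Stand-in with the same two fields so the sketch runs without LangChain.
    from dataclasses import dataclass, field

    @dataclass
    class Document:
        page_content: str
        metadata: dict = field(default_factory=dict)

def describe_sources(results):
    """Return one 'source: snippet' line per Document in the results."""
    lines = []
    for doc in results:
        # Fall back to "unknown" when a document was ingested without metadata.
        source = doc.metadata.get("source", "unknown")
        lines.append(f"{source}: {doc.page_content[:60]}")
    return lines

# Hand-built stand-in for: results = db.similarity_search(query)
results = [
    Document(
        page_content="Our company was founded in 2010.",
        metadata={"source": "www.site.com/about"},
    ),
    Document(page_content="Ingested without metadata."),
]

summary = describe_sources(results)
```

If every result reports "unknown", the documents were most likely ingested without metadata, and re-ingesting them with the metadata set should fix it.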