Tags: python, langchain, large-language-model

How to return the context used to answer a question with LangChain in Python


I have built a RAG system like this:

from langchain.output_parsers import ResponseSchema, StructuredOutputParser
from langchain.retrievers import ContextualCompressionRetriever
from langchain_cohere import CohereRerank
from langchain_community.document_loaders import PyPDFLoader
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = PyPDFLoader(pdf_file_name)
raw_documents = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(raw_documents)
print(documents[-1])

Document(
   metadata={'source': '/Appraisal.pdf', 'page': 37},
   page_content='File No.\nProperty Address\nCity County State Zip Code\nClient10828\nBorrower or Owner John Smith & Kitty Smith\n29 Dream St\nDreamTown SC 99999\nSouthern First Bank\nBB Appraisals, LLC'
)

compressor = CohereRerank(
    top_n=top_n,
    model="rerank-english-v3.0",
    cohere_api_key=""
)

retriever = vectorstore.as_retriever(
    search_type="similarity", 
    search_kwargs={"k": top_n}
)

compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

response_schemas = [
    ResponseSchema(name="price", description="Price", type="float"),
    ResponseSchema(name="unit", description="Unit", type="int"),
]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

rag_prompt = PromptTemplate(
    input_variables=["context","question"],
    template=template,
    partial_variables={"format_instructions": output_parser.get_format_instructions()},
)

rag_chain = (
    {"context": compression_retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | output_parser
)

query = "What is the price? How many units?"

response = rag_chain.invoke(query, config={"configurable": {"session_id": "abc123"}})

But then my response is a JSON object with only price and unit as keys. I would also like a "context" variable that stores the paragraphs from my document that the model relied on to answer the questions.

Any idea how I could do that, please?


Solution

  • There are two ways to do this:

    1. For a pictorial representation of which documents the LLM used, look at LangSmith, LangChain's tracing platform. It is also worth understanding some RAG methodologies like RAG-Fusion, which retrieves with multiple generated queries and merges the results, making it easier to see which documents the LLM relied on (a minimal tracing setup is sketched after this list).

    2. I am not sure about this, but you can try the function below and pipe it into your compression_retriever chain. The idea is to pass it along with the LLM or the retriever, as the case may be, to make it easier to get back the docs that were used (see the chain sketch after this list).

    import json
    from typing import Any

    def unique_union_of_documents(docs: list) -> list[Any]:
        """
        Get the unique union of the documents.

        Args:
            docs: The documents to be processed.

        Returns:
            list: The unique union of the documents.
        """
        # Serialize each chunk's text, deduplicate via a set,
        # then deserialize back to plain strings.
        doc_news = [json.dumps(doc.page_content) for doc in docs]
        unique_union = list(set(doc_news))
        return [json.loads(doc) for doc in unique_union]
    

    You can now pipe this function after compression_retriever (as sketched below), or call it on the retrieved documents right after you compute response.
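
    For option 1, a minimal LangSmith setup is just a few environment variables; the key and project name below are placeholders, not real values. Once tracing is on, every run of rag_chain appears in the LangSmith UI, including the retriever step with the exact documents it returned:

    import os

    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_API_KEY"] = "<your-langsmith-api-key>"  # placeholder
    os.environ["LANGCHAIN_PROJECT"] = "rag-context-debugging"     # optional, made-up name

    # No code changes needed: invoking rag_chain now records the retriever step,
    # the rendered prompt, and the LLM call as an inspectable trace.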
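
    For option 2, here is a sketch of how the pieces from the question can be rewired so the chain returns the retrieved chunks next to the parsed answer. It follows LangChain's documented pattern for returning sources with RunnableParallel and .assign, and assumes the compression_retriever, rag_prompt, llm, output_parser, and query defined above:

    from langchain_core.runnables import RunnableParallel, RunnablePassthrough

    # Build the answer from chunks already present in the input dict.
    # unique_union_of_documents returns plain strings, so join them directly.
    rag_chain_from_docs = (
        RunnablePassthrough.assign(context=lambda x: "\n\n".join(x["context"]))
        | rag_prompt
        | llm
        | output_parser
    )

    # Run the retriever once, deduplicate its chunks, keep them under "context",
    # and attach the parsed answer under "answer".
    rag_chain_with_source = RunnableParallel(
        {
            "context": compression_retriever | unique_union_of_documents,
            "question": RunnablePassthrough(),
        }
    ).assign(answer=rag_chain_from_docs)

    response = rag_chain_with_source.invoke(query)
    response["answer"]   # {"price": ..., "unit": ...}
    response["context"]  # the deduplicated paragraphs the answer relied on

    The "context" key survives because RunnableParallel computes both branches from the same input and .assign only adds the "answer" key on top, so nothing the retriever returned is lost on the way to the output.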