I am using LangChain to call an LLM and I want to get the logprobs for each token. My current setup looks like this:
from langchain_openai import ChatOpenAI
from pydantic import BaseModel
class ResponseFormat(BaseModel):
    question_index: int
    answer: str
    short_answer: str
llm = ChatOpenAI(
    openai_api_base="...",
    openai_api_key="...",
    model="...")
structured_llm = llm.with_structured_output(ResponseFormat, method="json_schema")
msg = structured_llm.invoke([("human", "how are you today?")])
# msg is a parsed ResponseFormat instance, so there is no response_metadata
I tried adding .bind(logprobs=True) on both llm and structured_llm, but the result is the same.
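For reference, this is roughly what that attempt looked like (a sketch; the variable names are mine):

# binding logprobs on the base model and on the structured runnable
llm_with_lp = llm.bind(logprobs=True)
structured_with_lp = structured_llm.bind(logprobs=True)
msg = structured_with_lp.invoke([("human", "how are you today?")])
# still no logprobs in the output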
The issue is known and described here and here, but the suggested fix of adding include_raw still doesn't work:
structured_llm = llm.with_structured_output(ResponseFormat, method="json_schema", include_raw=True)
msg = structured_llm.invoke([("human", "how are you today?")])
# ... msg["raw"].response_metadata["logprobs"] is None
The only explanation I can think of is that I am contacting a LiteLLM proxy, which in turn calls Azure/OpenAI models and returns the response to me, but I am surprised this scenario isn't discussed anywhere.
Details:
It is sufficient to pass the logprobs-related parameters directly to the ChatOpenAI constructor:
llm = ChatOpenAI(
    openai_api_base="...",
    openai_api_key="...",
    model="...",
    logprobs=True,    # required
    top_logprobs=5)   # optional
structured_llm = llm.with_structured_output(ResponseFormat, method="json_schema", include_raw=True)
msg = structured_llm.invoke([("human", "how are you today?")])
then you will have
msg["raw"].response_metadata["logprobs"]
One important caveat is that (as of now) this parameter is not supported when calling models such as OpenAI o3 through LiteLLM.
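So it is worth guarding against the missing case, for example (a minimal sketch):

logprobs = msg["raw"].response_metadata.get("logprobs")
if logprobs is None:
    # e.g. unsupported model/proxy combination: handle the absence gracefully
    print("logprobs not returned by this model/proxy")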