langchain, function-call, ollama, langgraph, llama3

Llama 3.2 fails to respond to simple text inputs when bound with tool calling on LangGraph


I am following a LangChain tutorial for LangGraph. The tutorial uses OpenAI models, but I want to use my local Ollama models, so I am using Llama 3.2, which supports tool calling. However, when I bind tools to the chat object, llm, it no longer responds to normal text inputs and only returns a tool call. If it is not bound with a tool, it does respond to regular messages. I cannot figure out whether this is an issue with the LangChain class or with the Llama 3.2 model. How do I fix this?

Following is the code:

from langchain_ollama import ChatOllama
from langgraph.graph import MessagesState
from langgraph.graph import StateGraph, START, END
from langchain_core.messages import HumanMessage


def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b


def tool_calling_llm(state: MessagesState):
    # Single node: pass the conversation to the tool-bound model.
    return {"messages": [llm_with_tools.invoke(state["messages"])]}


# Bind the multiply tool to the local Llama 3.2 model.
llm = ChatOllama(model="llama3.2")
llm_with_tools = llm.bind_tools([multiply])

builder = StateGraph(MessagesState)
builder.add_node("tool_calling_llm", tool_calling_llm)
builder.add_edge(START, "tool_calling_llm")
builder.add_edge("tool_calling_llm", END)
graph = builder.compile()

messages = graph.invoke({"messages": HumanMessage(content="hello")})
print(messages)

This is the print result. Note that the AIMessage content is empty and the model emitted a multiply tool call, even though the input was just "hello":

{'messages': [HumanMessage(content='hello', additional_kwargs={}, response_metadata={}, id='b3e1122b-400b-4f6d-b323-0b67f2aa1441'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'llama3.2', 'created_at': '2024-10-21T11:07:58.132452Z', 'message': {'role': 'assistant', 'content': '', 'tool_calls': [{'function': {'name': 'multiply', 'arguments': {'a': '2', 'b': '3'}}}]}, 'done_reason': 'stop', 'done': True, 'total_duration': 1372686042, 'load_duration': 30459500, 'prompt_eval_count': 156, 'prompt_eval_duration': 752771000, 'eval_count': 22, 'eval_duration': 584293000}, id='run-28808b40-51d2-40ee-a9bf-e048a009651c-0', tool_calls=[{'name': 'multiply', 'args': {'a': '2', 'b': '3'}, 'id': 'b45ba4cc-7610-4d1a-93bc-de4deb6e6c1d', 'type': 'tool_call'}], usage_metadata={'input_tokens': 156, 'output_tokens': 22, 'total_tokens': 178})]}


Solution

  • As Laurie Young mentioned, you need a bigger Llama model. I thought it was LangGraph that was inconsistent, so I rewrote my application in plain Python without LangGraph. No use. The small Llama 3.2 (1B) model is the issue here: once tools are bound, it either chats or calls tools, but it cannot combine both. The fix is to use a better model (see the sketch below).

    It took a few days of breaking my head, testing with Python/Pydantic, extensive online searching, and back-and-forths with ChatGPT to realize that Llama is the issue. I wish Meta had better documentation on this; it's appalling that this limitation is not mentioned anywhere in their documentation. What a waste of my time! I have decided to give up on Llama and stick to ChatGPT, simply because of how unhelpful the documentation is. ChatGPT saves a lot of time, the community is bigger, and the models are just better. The only downside is the amount of space required, but nobody can put a price on the time wasted on a model that is so far behind ChatGPT.
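
    For reference, here is a minimal sketch of that fix. It reuses the multiply tool and the compiled graph from the question and only swaps the model tag; "llama3.1:8b" is my assumption for a larger tool-capable model, so pull whichever one you prefer first (e.g. ollama pull llama3.1:8b):

        from langchain_ollama import ChatOllama
        from langchain_core.messages import HumanMessage

        # Assumption: a larger tool-capable model has been pulled into Ollama,
        # e.g. `ollama pull llama3.1:8b`. Any bigger Llama should behave better.
        llm = ChatOllama(model="llama3.1:8b")
        llm_with_tools = llm.bind_tools([multiply])

        # tool_calling_llm reads the global llm_with_tools, so re-invoking the
        # same compiled graph now uses the bigger model. A greeting should come
        # back as plain text, while a math question should still trigger the tool.
        print(graph.invoke({"messages": [HumanMessage(content="hello")]}))
        print(graph.invoke({"messages": [HumanMessage(content="What is 2 times 3?")]}))

    With the small Llama 3.2 model, the first call returned an empty AIMessage with a spurious multiply tool call; with a larger model you should get a normal text reply, and tool calls only when they are actually needed.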