zod langchain-js

LangChainJS Structured Tool Call Generates Incompatible JSON Schema for Complex Zod Types


I'm seeking advice on the LangChain.js structured tool implementation. This is a fairly long read, so please bear with me.

Problem Definition

Assume you have the following tool:

export class CreateChildNotionPageTool extends StructuredTool {
  name = "createChildNotionPageTool";
  description = `
    Creates new notion page inside parent page from provided input.
    Accepts children as content of the page.
  `;
  schema = createChildNotionPageToolSchema;

  private api = new NotionAPIWrapper();

  protected async _call({
    page,
    children,
  }: CreatePagePayload): Promise<unknown> {
    return this.api.createPage({ ...page, children });
  }
}

The schema and argument type for this tool are defined as follows:

const createChildNotionPageToolSchema = z.object({
  page: CreateChildPageSchema,
  children: z.array(NotionBlockSchema).optional(),
});

type CreatePagePayload = z.infer<typeof createChildNotionPageToolSchema>;

The createChildNotionPageToolSchema describes the complete set of possible objects (page, Notion blocks, rich text definitions, etc.). The resulting JSON schema is a very large, deeply nested object that uses $ref values to cross-reference its own parts.

The schema is assembled from multiple files:

/schemas
  --- pageProperties.schema.ts
  --- baseBlock.schema.ts
  --- notionPage.schema.ts
  --- blocks
      ---- callout.schema.ts
      ---- paragraph.schema.ts
....
....

Whenever I try to use this tool with the ChatOpenAI model:

import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage } from "@langchain/core/messages";

// Bind an instance of the tool class defined above
const llm = new ChatOpenAI({ model: "gpt-3.5-turbo" }).bindTools([
  new CreateChildNotionPageTool(),
]);

const messages = [
  new HumanMessage(
    `Create new page "Brief Specification" with folder icon in parent page with id: 2655cfc1a73048cdb7e4dd4545d1d52d.
     The page content is split by three headings: Scope, Description, Outcomes
     Each heading section should contain autogenerated text.
    `
  ),
];

const aiMessage = await llm.invoke(messages);

The call fails with the following message:

Invalid schema for function 'createNotionPageTool'. Please ensure it is a valid JSON Schema.

Debugging Done

The resulting JSON schema contained multiple $ref values. I suspected that LLM APIs may not handle complex schemas well in function tool definitions. In particular, as far as I understand, the OpenAI chat API can't parse JSON schemas that contain cross-references ($ref).
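One way to confirm where the references sit is to walk the generated schema and list every $ref it contains. The sketch below is a hand-written helper, not part of LangChain or zod-to-json-schema; `findRefs` and the sample schema are hypothetical names used only for illustration.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Recursively collect "location -> target" for every $ref key found in a schema.
function findRefs(node: Json, path: string = "#"): string[] {
  const found: string[] = [];
  if (Array.isArray(node)) {
    node.forEach((item, index) => {
      findRefs(item, path + "/" + index).forEach((hit) => found.push(hit));
    });
  } else if (node !== null && typeof node === "object") {
    const obj = node;
    Object.keys(obj).forEach((key) => {
      const value = obj[key];
      if (key === "$ref" && typeof value === "string") {
        found.push(path + " -> " + value);
      } else {
        findRefs(value, path + "/" + key).forEach((hit) => found.push(hit));
      }
    });
  }
  return found;
}

// Hypothetical schema in which `children.items` points back at `page`.
const sampleSchema: Json = {
  type: "object",
  properties: {
    page: { type: "object", properties: { title: { type: "string" } } },
    children: { type: "array", items: { $ref: "#/properties/page" } },
  },
};

const refs = findRefs(sampleSchema);
console.log(refs);
```

An empty result means the schema is free of cross-references; any entry shows which part of the schema would need to be inlined.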

Luckily, there are libraries that help with debugging this: zod-to-json-schema and json-to-zod-schema. My steps were as follows:

  1. Convert my zod schema into a JSON schema using zod-to-json-schema with the $refStrategy: "none" option. This builds the schema from the zod definition without creating cross-references.
  2. Convert the output of the first step back into a zod object using the json-to-zod-schema package. This gave me a flattened zod structure, which LangChain's function tool handling then converts into a JSON schema without references.

These steps gave me a new schema that I could set on the structured tool. With it, the request to the LLM is built correctly and the code runs successfully.
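To make "a schema without cross-references" concrete, here is a minimal sketch that inlines local $ref pointers by hand. It is an illustration of the idea only, not how zod-to-json-schema works internally; `resolvePointer`, `inlineRefs`, and the sample schema are hypothetical, and the sketch simply throws on recursive references because they cannot be inlined.

```typescript
type JsonSchema = { [key: string]: unknown };

// Resolve a local JSON pointer such as "#/properties/page" against the root schema.
function resolvePointer(root: JsonSchema, pointer: string): unknown {
  if (pointer.charAt(0) !== "#") {
    throw new Error("Only local references are supported: " + pointer);
  }
  const segments = pointer.slice(1).split("/").filter((s) => s.length > 0);
  let current: unknown = root;
  for (let i = 0; i < segments.length; i++) {
    // Undo JSON Pointer escaping: "~1" is "/", "~0" is "~".
    const key = segments[i].replace(/~1/g, "/").replace(/~0/g, "~");
    if (current === null || typeof current !== "object") {
      throw new Error("Cannot resolve reference: " + pointer);
    }
    current = (current as JsonSchema)[key];
  }
  if (current === undefined) {
    throw new Error("Cannot resolve reference: " + pointer);
  }
  return current;
}

// Return a deep copy of `node` in which every local $ref is replaced by its target.
function inlineRefs(root: JsonSchema, node: unknown = root, seen: string[] = []): unknown {
  if (Array.isArray(node)) {
    return node.map((item) => inlineRefs(root, item, seen));
  }
  if (node === null || typeof node !== "object") {
    return node;
  }
  const obj = node as JsonSchema;
  const ref = obj["$ref"];
  if (typeof ref === "string") {
    if (seen.indexOf(ref) !== -1) {
      throw new Error("Recursive reference cannot be inlined: " + ref);
    }
    return inlineRefs(root, resolvePointer(root, ref), seen.concat(ref));
  }
  const copy: JsonSchema = {};
  Object.keys(obj).forEach((key) => {
    copy[key] = inlineRefs(root, obj[key], seen);
  });
  return copy;
}

// Hypothetical schema in which `children.items` reuses the `page` definition.
const withRefs: JsonSchema = {
  type: "object",
  properties: {
    page: { type: "object", properties: { title: { type: "string" } } },
    children: { type: "array", items: { $ref: "#/properties/page" } },
  },
};

const flattened = inlineRefs(withRefs) as JsonSchema;
console.log(JSON.stringify(flattened, null, 2));
```

The flattened copy repeats the `page` definition wherever it was referenced, which is why a schema with many shared parts grows so quickly once references are removed.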

However, this workaround creates another problem. The double conversion of the zod schema has to be done by hand, and the result is too large to maintain manually. The schema alone comes to more than 4,000 lines of code:

   line 1:    z.object({
                 page: z.object(/*multiple zod objects here*/),
                 children: z.array(/* and even more here*/)
   line 4363: })

After digging deeper, I found that LangChain's ChatOpenAI implementation uses zod-to-json-schema under the hood to convert the zod schema into the JSON schema sent to the LLM. However, there is no way to pass a zod-to-json-schema configuration object through the structured tool, so I can't tell it to build the JSON schema without $ref cross-references.

Finally, the Question:

Is there a way to control how the JSON schema is generated for a structured tool, in particular to make the zod-to-json-schema call inside the structured tool implementation use configuration options such as $refStrategy: "none"? Or is there another way to create a tool that stays reusable and accepts the function parameter schema directly?


Solution

  • Alright, figured it out. You can pass a model-specific tool definition to bindTools instead of a StructuredTool. For OpenAI it looks like this:

    import { ChatOpenAI } from "@langchain/openai";
    import { zodToJsonSchema } from "zod-to-json-schema";

    const llm = new ChatOpenAI({ model: "gpt-3.5-turbo" });
    const llmWithTools = llm.bindTools([
      {
        type: "function",
        function: {
          name: "createChildNotionPageTool",
          description: `
            Creates new notion page inside parent page from provided input.
            Accepts children as content of the page.
          `,
          // parametersSchema is the zod schema that describes the tool's arguments
          parameters: zodToJsonSchema(parametersSchema, { $refStrategy: "none" }),
        },
      },
    ]);

    This approach gives full control over the parameters field.
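A raw definition like this carries only the schema, so the code that actually performs the call has to live somewhere else. One possible way to keep the two together and stay reusable is sketched below. Everything in it (`RunnableTool`, `defineTool`, `dispatch`, the stand-in handler) is hypothetical and not a LangChain API; it assumes tool calls arrive as objects with a `name` and parsed `args`, and that `parameters` is a JSON schema produced elsewhere, for example by zodToJsonSchema with $refStrategy: "none".

```typescript
// Shape of an OpenAI-format function tool definition.
interface OpenAIFunctionTool {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>;
  };
}

// Hypothetical pairing of a tool definition with the code that executes it.
interface RunnableTool {
  definition: OpenAIFunctionTool;
  run: (args: Record<string, unknown>) => Promise<unknown>;
}

// Assumed minimal shape of a tool call: tool name plus parsed arguments.
interface ToolCallLike {
  name: string;
  args: Record<string, unknown>;
}

function defineTool(
  name: string,
  description: string,
  parameters: Record<string, unknown>,
  run: (args: Record<string, unknown>) => Promise<unknown>
): RunnableTool {
  return {
    definition: {
      type: "function",
      function: {
        name,
        // Collapse the indentation left over from multi-line template literals.
        description: description.replace(/\s+/g, " ").trim(),
        parameters,
      },
    },
    run,
  };
}

// Find the tool that matches a tool call by name and execute it with the call's arguments.
async function dispatch(tools: RunnableTool[], call: ToolCallLike): Promise<unknown> {
  const tool = tools.filter((t) => t.definition.function.name === call.name)[0];
  if (!tool) {
    throw new Error("Unknown tool: " + call.name);
  }
  return tool.run(call.args);
}

// Stand-in schema and handler; a real handler would call the Notion API wrapper.
const createPage = defineTool(
  "createChildNotionPageTool",
  `
    Creates new notion page inside parent page from provided input.
    Accepts children as content of the page.
  `,
  { type: "object", properties: {} },
  async (args) => ({ created: args })
);
```

With this in place, `createPage.definition` is what would be passed to bindTools, and each tool call returned by the model could be handed to `dispatch([createPage], call)`.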