[SOLVED] LangChainJS Structured Tool Call Generates Incompatible JSON Schema for Complex Zod Types

I'm looking for advice on implementing structured tools in LangChainJS. This is a fairly long and convoluted question, so please bear with me.

Problem Definition

Assume you have the following tool:

export class CreateChildNotionPageTool extends StructuredTool {
  name = "createChildNotionPageTool";
  description = `
    Creates new notion page inside parent page from provided input.
    Accepts children as content of the page.
  `;
  schema = createChildNotionPageToolSchema;

  private api = new NotionAPIWrapper();

  protected async _call({
    page,
    children,
  }: CreatePagePayload): Promise<unknown> {
    return this.api.createPage({ ...page, children });
  }
}

Schema and argument type for this tool are defined as follows:

const createChildNotionPageToolSchema = z.object({
  page: CreateChildPageSchema,
  children: z.array(NotionBlockSchema).optional(),
});

type CreatePagePayload = z.infer<typeof createChildNotionPageToolSchema>;

The createChildNotionPageToolSchema describes the complete set of possible objects (page, Notion blocks, rich text definitions, etc.). The generated JSON schema is a very large, complex object that uses $ref values to cross-reference its own sub-schemas.

The schema is assembled from multiple files:

/schemas
  --- pageProperties.schema.ts
  --- baseBlock.schema.ts
  --- notionPage.schema.ts
  --- blocks
      ---- callout.schema.ts
      ---- paragraph.schema.ts
....
....

Whenever I try to use this tool together with a ChatOpenAI model:

import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage } from "@langchain/core/messages";

// bindTools expects tool instances, so the class has to be instantiated.
const llm = new ChatOpenAI({ model: "gpt-3.5-turbo" }).bindTools([
  new CreateChildNotionPageTool(),
]);

const messages: any[] = [
  new HumanMessage(
    `Create new page "Brief Specification" with folder icon in parent page with id: 2655cfc1a73048cdb7e4dd4545d1d52d.
     The page content is split by three headings: Scope, Description, Outcomes
     Each heading section should contain autogenerated text.
    `
  ),
];

const aiMessage: any = await llm.invoke(messages);

The call fails with the following message:

Invalid schema for function 'createNotionPageTool'. Please ensure it is a valid JSON Schema.

Debugging Done

The resulting JSON schema contains multiple $ref values. My guess is that LLM providers do not handle complex schemas in function tool definitions well. In particular, as far as I understand, OpenAI chat models can't parse JSON schemas that contain cross-referenced ($ref) values.
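To check whether a generated schema really contains cross-references, and where, a small walker over the JSON is enough. The following is a minimal sketch; findRefs and the sample schema are hypothetical names made up for illustration and are not part of LangChain or zod-to-json-schema.

```typescript
// Hypothetical debugging helper: collects the location and target of every
// "$ref" keyword found in a JSON Schema object.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function findRefs(node: Json, path: string = "#"): string[] {
  const found: string[] = [];
  if (Array.isArray(node)) {
    node.forEach((item, i) => found.push(...findRefs(item, `${path}/${i}`)));
  } else if (node !== null && typeof node === "object") {
    for (const key of Object.keys(node)) {
      const value = node[key];
      if (key === "$ref" && typeof value === "string") {
        // Record where the reference sits and what it points to.
        found.push(`${path} -> ${value}`);
      } else {
        found.push(...findRefs(value, `${path}/${key}`));
      }
    }
  }
  return found;
}

// Illustrative schema: the second use of the same sub-schema became a "$ref".
const sample: Json = {
  type: "object",
  properties: {
    title: { type: "array", items: { type: "string" } },
    caption: { $ref: "#/properties/title" },
  },
};

console.log(findRefs(sample));
```

Running it on the JSON schema produced from the tool's zod schema lists every reference, which makes it easy to confirm that a given conversion option really removed them.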

Luckily, two libraries, zod-to-json-schema and json-to-zod-schema, help with debugging this. My steps were the following:

1. Convert my zod schema into a JSON schema with the zod-to-json-schema tool, using the $refStrategy: none configuration option. This builds the schema from the zod definition without creating cross-references in it.
2. Convert the output of the first step back into a zod object via the json-to-zod-schema package. This gave me a flattened zod structure, which LangChain's function tool code then converts into a JSON schema without references.
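To illustrate what "no cross-references" means for the resulting JSON, here is a minimal sketch that inlines local $ref pointers in an already generated schema. It is not how zod-to-json-schema works internally; inlineRefs and resolvePointer are hypothetical helpers. The sketch assumes that all references are local JSON Pointers (starting with "#"), that the schema is not recursive, and that keywords sitting next to a $ref can be dropped.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Resolve a local JSON Pointer such as "#/properties/title" against the root schema.
function resolvePointer(root: Json, ref: string): Json {
  if (ref !== "#" && ref.indexOf("#/") !== 0) {
    throw new Error(`Only local references are supported: ${ref}`);
  }
  const tokens = ref === "#" ? [] : ref.slice(2).split("/");
  let current: Json = root;
  for (const raw of tokens) {
    // Undo JSON Pointer escaping (~1 is "/", ~0 is "~").
    const token = raw.replace(/~1/g, "/").replace(/~0/g, "~");
    if (Array.isArray(current)) {
      current = current[Number(token)];
    } else if (current !== null && typeof current === "object") {
      current = current[token];
    } else {
      throw new Error(`Unresolvable reference: ${ref}`);
    }
    if (current === undefined) {
      throw new Error(`Unresolvable reference: ${ref}`);
    }
  }
  return current;
}

// Return a copy of the schema in which every "$ref" is replaced by its target.
// "seen" tracks the references being expanded on the current path, so a
// recursive schema raises an error instead of looping forever.
function inlineRefs(root: Json, node: Json = root, seen: string[] = []): Json {
  if (Array.isArray(node)) {
    return node.map((item) => inlineRefs(root, item, seen));
  }
  if (node === null || typeof node !== "object") {
    return node;
  }
  const ref = node["$ref"];
  if (typeof ref === "string") {
    if (seen.indexOf(ref) !== -1) {
      throw new Error(`Recursive reference cannot be inlined: ${ref}`);
    }
    return inlineRefs(root, resolvePointer(root, ref), seen.concat(ref));
  }
  const out: { [key: string]: Json } = {};
  for (const key of Object.keys(node)) {
    out[key] = inlineRefs(root, node[key], seen);
  }
  return out;
}

// Illustrative schema in which "caption" reuses the "title" sub-schema.
const nested: Json = {
  type: "object",
  properties: {
    title: { type: "array", items: { type: "string" } },
    caption: { $ref: "#/properties/title" },
  },
};

console.log(JSON.stringify(inlineRefs(nested), null, 2));
```

The flattened output repeats the shared sub-schema at every place it is used, which is why the flattened version of a large schema grows so much.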

These steps gave me a new schema that I could set on the structured tool. With it, the request to the LLM is built correctly and the code runs successfully.

But this poses another problem. The double conversion of the zod schema has to be done by hand, and the result is too large to maintain manually. The schema alone comes to about 4k lines of code:

   line 1:    z.object({
                 page: z.object(/*multiple zod objects here*/),
                 children: z.array(/* and even more here*/)
   line 4363: })

After digging deeper, I found out that LangChain's ChatOpenAI implementation uses zod-to-json-schema under the hood to convert the zod schema into output that can be sent to the LLM. But there is no way to pass a configuration object for zod-to-json-schema through the structured tool to tell it not to build the JSON schema with $ref cross-references from my original zod schema.

Finally, the Question:

Is there a way to control how the JSON schema is generated for a structured tool, in particular to pass configuration options such as $refStrategy: none to the zod-to-json-schema call inside the structured tool implementation? Or is there a different way to create a tool that is still reusable and accepts the function parameters schema directly?

Solution

Alright, figured it out. It appears you can use model-specific definitions of function tools. In the case of OpenAI it is resolved as follows:

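import { zodToJsonSchema } from "zod-to-json-schema";

const llm = new ChatOpenAI({ model: "gpt-3.5-turbo" });
const llmWithTools = llm.bindTools([
  {
    type: "function",
    function: {
      name: "createChildNotionPageTool",
      description: `
        Creates new notion page inside parent page from provided input.
        Accepts children as content of the page.
      `,
      // parametersSchema is the zod schema for the tool arguments.
      // $refStrategy: "none" builds the JSON schema without $ref cross-references.
      parameters: zodToJsonSchema(parametersSchema, { $refStrategy: "none" }),
    },
  },
]);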

The approach above gives full control over the parameters field.
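Since a raw definition no longer carries the _call logic of a StructuredTool, one way to keep the tool reusable is to bundle the definition with its handler. The following is a minimal sketch under stated assumptions: defineTool, runToolCall, and the stub handler are hypothetical, and the ToolCall shape (name plus args) is assumed to match one entry of aiMessage.tool_calls.

```typescript
// Hypothetical helper types and names; only the
// { type: "function", function: { ... } } shape comes from the solution above.
type Args = { [key: string]: unknown };

interface FunctionToolDefinition {
  type: "function";
  function: { name: string; description: string; parameters: Args };
}

interface ReusableTool {
  definition: FunctionToolDefinition;
  run: (args: Args) => Promise<unknown>;
}

// Bundle the definition sent to the model with the code that executes the call.
function defineTool(
  name: string,
  description: string,
  parameters: Args, // e.g. the output of zodToJsonSchema(schema, { $refStrategy: "none" })
  run: (args: Args) => Promise<unknown>
): ReusableTool {
  return {
    definition: {
      type: "function",
      function: { name, description: description.trim(), parameters },
    },
    run,
  };
}

// Assumed shape of one entry of aiMessage.tool_calls.
interface ToolCall {
  name: string;
  args: Args;
}

// Find the tool the model asked for and run it with the generated arguments.
function runToolCall(tools: ReusableTool[], call: ToolCall): Promise<unknown> {
  const tool = tools.filter((t) => t.definition.function.name === call.name)[0];
  if (!tool) {
    return Promise.reject(new Error(`Unknown tool: ${call.name}`));
  }
  return tool.run(call.args);
}

// Stub schema and handler for illustration; a real handler would call the Notion API.
const pageTool = defineTool(
  "createChildNotionPageTool",
  "Creates new notion page inside parent page from provided input.",
  { type: "object", properties: { page: { type: "object" } }, required: ["page"] },
  (args) => Promise.resolve({ created: args })
);

const tools = [pageTool];
// Pass these to llm.bindTools(...) in place of StructuredTool instances.
const toolDefinitions = tools.map((t) => t.definition);
```

After llm.invoke returns, each entry of aiMessage.tool_calls can be handed to runToolCall(tools, call), so the schema and the handler stay together even though the model only ever sees the plain definition.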