I'm seeking advice on a langchainjs structured tools implementation. I realize what follows is quite involved, so please bear with me.
Problem Definition
Assume you have the following tool:
import { StructuredTool } from "@langchain/core/tools";

export class CreateChildNotionPageTool extends StructuredTool {
  name = "createChildNotionPageTool";
  description = `
    Creates new notion page inside parent page from provided input.
    Accepts children as content of the page.
  `;
  schema = createChildNotionPageToolSchema;

  // NotionAPIWrapper wraps the Notion API client used by this tool.
  private api = new NotionAPIWrapper();

  protected async _call({
    page,
    children,
  }: CreatePagePayload): Promise<unknown> {
    return this.api.createPage({ ...page, children });
  }
}
The schema and the argument type for this tool are defined as follows:
import { z } from "zod";
// CreateChildPageSchema and NotionBlockSchema come from the schema files listed below.

const createChildNotionPageToolSchema = z.object({
  page: CreateChildPageSchema,
  children: z.array(NotionBlockSchema).optional(),
});

type CreatePagePayload = z.infer<typeof createChildNotionPageToolSchema>;
The createChildNotionPageToolSchema describes the complete set of possible objects (page, Notion blocks, rich text definitions, etc.). The generated JSON schema ends up as a huge, deeply nested object with $ref values that cross-reference objects within it.
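To illustrate the problem, here is a minimal, self-contained sketch. The Inner and Outer schemas are made-up stand-ins for my real Notion schemas; the point is that zod-to-json-schema emits $ref values by default whenever a Zod schema object is reused:

import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

// A schema reused in several places, standing in for something like NotionBlockSchema.
const Inner = z.object({ text: z.string() });

const Outer = z.object({
  page: z.object({ title: Inner }),
  children: z.array(Inner).optional(),
});

// With the library's default $refStrategy ("root"), a reused schema is emitted once
// and every later occurrence becomes something like
// { "$ref": "#/properties/page/properties/title" }.
console.log(JSON.stringify(zodToJsonSchema(Outer), null, 2));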
The schema is combined from multiple files:
/schemas
--- pageProperties.schema.ts
--- baseBlock.schema.ts
--- notionPage.schema.ts
--- blocks
---- callout.schema.ts
---- paragraph.schema.ts
....
....
Whenever I try to use this tool together with a ChatOpenAI model:

import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage } from "@langchain/core/messages";

const createChildNotionPageTool = new CreateChildNotionPageTool();

const llm = new ChatOpenAI({ model: "gpt-3.5-turbo" }).bindTools([
  createChildNotionPageTool,
]);

const messages: any[] = [
  new HumanMessage(
    `Create new page "Brief Specification" with folder icon in parent page with id: 2655cfc1a73048cdb7e4dd4545d1d52d.
The page content is split by three headings: Scope, Description, Outcomes
Each heading section should contain autogenerated text.
`
  ),
];

const aiMessage: any = await llm.invoke(messages);
The call fails with the following message:
Invalid schema for function 'createNotionPageTool'. Please ensure it is a valid JSON Schema.
Debugging Done
The resulting JSON schema ended up with multiple $ref values in it. I figured that LLM providers may not handle complex schemas well when they arrive as part of a function tool definition. In particular, as far as I understand, the OpenAI chat API can't parse JSON schemas that contain cross-referenced ($ref) values.
Luckily, there are two libraries, zod-to-json-schema and json-to-zod-schema, that can help you debug this. My steps were the following:

1. Run zod-to-json-schema with the $refStrategy: "none" configuration option. It creates a JSON schema from the Zod definition without adding any cross-references to it.
2. Feed the resulting JSON schema into the json-to-zod-schema package. It returned a flattened Zod structure that is now converted into a JSON schema without references inside LangChain's function tool execution.

These steps allowed me to come up with a new schema that I could set on the StructuredTool; the request to the LLM is now converted correctly and the code runs successfully.
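For reference, step 1 can be scripted. The snippet below is a rough sketch of how I dumped the flattened JSON schema (the output file name is arbitrary), so it can then be fed into json-to-zod-schema for step 2:

import { writeFileSync } from "node:fs";
import { zodToJsonSchema } from "zod-to-json-schema";

// Step 1: flatten the original Zod schema into a JSON schema without $ref entries.
const flatJsonSchema = zodToJsonSchema(createChildNotionPageToolSchema, {
  $refStrategy: "none",
});

// Dump it to disk so it can be converted back to Zod (step 2) afterwards.
writeFileSync("flat-schema.json", JSON.stringify(flatJsonSchema, null, 2));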
But this poses another problem: the double conversion of the Zod schema has to be done manually, and the result is far too large to keep maintaining by hand. It comes out to roughly 4k lines of code for the schema alone:
line 1: z.object({
page: z.object(/*multiple zod objects here*/),
children: z.array(/* and even more here*/)
line 4363: })
After digging deeper, I found out that LangChain's ChatOpenAI implementation uses zod-to-json-schema under the hood to convert the Zod schema into the format that is sent to the LLM. But there is no way to pass a configuration object for zod-to-json-schema through the structured tool, so I cannot tell it not to build the JSON schema with $ref cross-references from my original Zod schema.
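To see what actually gets sent, you can inspect LangChain's own conversion of the tool. The snippet below is a sketch under the assumption that the helper is convertToOpenAITool exported from @langchain/core/utils/function_calling; the exact export name and path may differ between versions:

// Assumed helper and import path; older LangChain versions may expose this elsewhere.
import { convertToOpenAITool } from "@langchain/core/utils/function_calling";

const openAITool = convertToOpenAITool(new CreateChildNotionPageTool());

// The "parameters" field is produced by zodToJsonSchema with its default options,
// so it still contains the problematic $ref entries.
console.log(JSON.stringify(openAITool.function.parameters, null, 2));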
Finally, the Question
Is there a way to enforce a particular way of building the JSON schema for a structured tool, specifically to tell the zod-to-json-schema call inside the structured tool implementation to use configuration options such as $refStrategy: "none"? Or, alternatively, is there a different way to create a tool that is still reusable and accepts the function parameters schema directly?
Alright, I figured it out. It appears you can use model-specific definitions of function tools. In the case of OpenAI it is resolved as follows:
import { ChatOpenAI } from "@langchain/openai";
import { zodToJsonSchema } from "zod-to-json-schema";

const llm = new ChatOpenAI({ model: "gpt-3.5-turbo" });

const llmWithTools = llm.bindTools([
  {
    type: "function",
    function: {
      name: "createChildNotionPageTool",
      description: `
        Creates new notion page inside parent page from provided input.
        Accepts children as content of the page.
      `,
      // Convert the original Zod schema without $ref cross-references.
      parameters: zodToJsonSchema(parametersSchema, { $refStrategy: "none" }),
    },
  },
]);
The approach above gives you full control over the parameters field.
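For completeness, here is a rough sketch of how the bound model can then be invoked and the tool executed by hand. The tool_calls access pattern is my assumption and may differ slightly between LangChain versions:

const aiMessage = await llmWithTools.invoke(messages);

// Assumption: recent LangChain versions expose parsed tool calls on the message;
// older versions keep them under aiMessage.additional_kwargs.tool_calls instead.
const toolCall = aiMessage.tool_calls?.[0];

if (toolCall?.name === "createChildNotionPageTool") {
  // Validate the generated arguments against the original Zod schema
  // before handing them to the tool.
  const args = createChildNotionPageToolSchema.parse(toolCall.args);
  await new CreateChildNotionPageTool().invoke(args);
}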