My production environment loads tools, including my function schemas. That matters because the responses I want to evaluate are the ones that call those tools, so I can't meaningfully evaluate a response when the eval doesn't have the tools.
I'm trying to create an eval that actually includes my tools. Evaluating the raw model is irrelevant: the eval must have the same setup as my production environment, and my production environment has tools.
Here is my production payload:
const response = await openai.responses.create({
  model: "gpt-4.1-2025-04-14",
  input: input,
  instructions: systemInstructions,
  tools: tools,
  store: true
});
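For context, a `tools` array passed to this call might look roughly like the sketch below, with one function schema and one vector store lookup. The function name `get_order_status`, its parameters, and the vector store ID are hypothetical placeholders, not the real production schema.

```javascript
// Hypothetical tools array for the Responses API.
// All names and IDs below are placeholders for illustration only.
const tools = [
  {
    // A function tool: the model may respond with a call to this function.
    type: "function",
    name: "get_order_status",
    description: "Look up the status of an order by its ID.",
    parameters: {
      type: "object",
      properties: { order_id: { type: "string" } },
      required: ["order_id"],
      additionalProperties: false,
    },
    strict: true,
  },
  {
    // A file search tool backed by a vector store (placeholder ID).
    type: "file_search",
    vector_store_ids: ["vs_PLACEHOLDER"],
  },
];
```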
I need to be able to run an eval with those tools, and I need to be able to run it on my own data, my own list of questions.
I understand there is store_completion, where you can check your logs. That's great, but for this task I just want to evaluate with tools and use my own data.
Is it possible to create an eval in the dash with my tools?
At this time there does not appear to be a way to add tools (in my case a vector store and function schemas) to evals with the OpenAI API, either in the dash or through a curl command. You get the model plus prompts. That's it.
I went through every option on the dash, tried endless variations of curl payloads, and read the docs where the parameters are listed. There is nothing for tools.
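One possible workaround, sketched here under assumptions rather than as an official evals feature: replay your own question list through the same `responses.create` call, tools included, and grade the results locally. The helper `runLocalEval`, the `expectedTool` field, and the pass/fail rule are hypothetical. The sketch assumes function calls show up in `response.output` as items with `type: "function_call"` and a `name`.

```javascript
// Hypothetical local eval loop (not part of the OpenAI evals product).
// Sends each question through the production-style payload and records
// which function tools the model chose to call.
async function runLocalEval(client, cases, { model, instructions, tools }) {
  const results = [];
  for (const { question, expectedTool } of cases) {
    const response = await client.responses.create({
      model,
      input: question,
      instructions,
      tools,
    });
    // Collect the names of any function calls in the response output.
    const calledTools = (response.output ?? [])
      .filter((item) => item.type === "function_call")
      .map((item) => item.name);
    results.push({
      question,
      expectedTool,
      calledTools,
      // Assumed grading rule: pass if the expected tool was called.
      pass: calledTools.includes(expectedTool),
    });
  }
  return results;
}
```

Here `client` would be the same `openai` instance used in production, and `cases` would be your own list, for example `[{ question: "...", expectedTool: "..." }]`. The grading rule only checks which tool was called; checking arguments or final answers would need extra logic.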