I am trying to create my first demo AI project, and I would like to use Semantic Kernel to connect to my locally hosted LLM and use function calling. I am getting responses from the LLM just fine, but I am unable to get any of my plugins invoked. I can see that the plugins are included in the JSON sent to the LLM, like so:
"tools": [
{
"type": "function",
"function": {
"description": "Gets a list of lights and their current state",
"name": "Lights-get_lights",
"parameters": {
"type": "object",
"required": [],
"properties": {}
}
}
},
but sadly I am unable to make the LLM call my function via the Kernel to fetch data from my plugin. I have tried a few models, such as Llama-3.2-3B-Instruct, Phi-3-mini-4k-instruct, Hermes-2-Pro-Mistral-7B, and c4ai-command-r-v01. I am using LM Studio to host the local server. I am pretty new to LLMs, so perhaps I am missing something obvious?
Here is my Program.cs code:
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var messageAPIPlatform = "LocalAI";
var url = "http://localhost:1234/v1";
var modelId = "llama-3.2-3b-instruct";
Console.WriteLine($"Example using local {messageAPIPlatform}");
#pragma warning disable SKEXP0010
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: modelId,
        apiKey: null,
        endpoint: new Uri(url))
    .Build();
var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();
kernel.Plugins.AddFromType<LightsPlugin>("Lights");
// Enable planning
#pragma warning disable SKEXP0001
OpenAIPromptExecutionSettings settings = new OpenAIPromptExecutionSettings()
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
    Temperature = .1,
    ChatSystemPrompt = """
        Assistant is a large language model.
        This assistant uses plugins to interact with the software.
        """
};
// Create a history to store the conversation
var history = new ChatHistory();
// Initiate a back-and-forth chat
string? userInput;
do
{
    // Collect user input
    Console.Write("User > ");
    userInput = Console.ReadLine();
    // Add user input
    history.AddUserMessage(userInput);
    // Get the response from the AI
    var result = await chatCompletionService.GetChatMessageContentAsync(
        history,
        executionSettings: settings,
        kernel: kernel);
    // Print the results
    Console.WriteLine("Assistant > " + result);
    // Add the message from the agent to the chat history
    history.AddMessage(result.Role, result.Content ?? string.Empty);
} while (userInput is not null);
and my plugin:
using System.ComponentModel;
using Microsoft.SemanticKernel;

public class LightsPlugin
{
    // Mock data for the lights
    private readonly List<LightModel> lights = new()
    {
        new LightModel { Id = 1, Name = "Table Lamp", IsOn = false },
        new LightModel { Id = 2, Name = "Porch light", IsOn = false },
        new LightModel { Id = 3, Name = "Chandelier", IsOn = true }
    };

    [KernelFunction("get_lights")]
    [Description("Gets a list of lights and their current state")]
    [return: Description("An array of lights")]
    public async Task<List<LightModel>> GetLightsAsync()
    {
        return lights;
    }

    [KernelFunction("change_state")]
    [Description("Changes the state of the light")]
    [return: Description("The updated state of the light; will return null if the light does not exist")]
    public async Task<LightModel?> ChangeStateAsync(int id, bool isOn)
    {
        var light = lights.FirstOrDefault(light => light.Id == id);
        if (light == null)
        {
            return null;
        }
        // Update the light with the new state
        light.IsOn = isOn;
        return light;
    }
}
Even though I can see that this plugin is being sent to the LLM as a tool, the LLM never calls the GetLightsAsync() method when asked about the current state of the lights. I only get a generic answer that the LLM is not able to answer this question.
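One way to check whether the model is emitting any tool calls at all is to turn off auto-invoke and inspect the returned message items; in recent Semantic Kernel versions the requested calls surface as FunctionCallContent items. A minimal sketch, reusing the kernel, history and settings from above:
// Advertise the kernel functions but do not auto-invoke them,
// so any tool call requested by the model stays visible in the response.
var manualSettings = new OpenAIPromptExecutionSettings
{
    ToolCallBehavior = ToolCallBehavior.EnableKernelFunctions
};
var probe = await chatCompletionService.GetChatMessageContentAsync(history, manualSettings, kernel);
foreach (var call in probe.Items.OfType<FunctionCallContent>())
{
    Console.WriteLine($"Model requested: {call.PluginName}.{call.FunctionName}");
}
If that loop prints nothing, the model never asked for the function in the first place, which points at the model or server rather than the plugin registration.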
I tried using the solution from this sample implementation of Ollama with function calling, sadly without any success. I am using the latest versions of Semantic Kernel and Ollama.
I was finally able to get function calling working with Ollama. The main changes are switching from LM Studio to an Ollama server and driving the conversation through a ChatCompletionAgent in an AgentGroupChat.
This solution is not perfect, since I am using an old version of Semantic Kernel and the approach differs quite a bit from the official guides. However, this is the only way I have been able to invoke my methods via the LLM so far. The solution below is based on this article. I am still wondering why nothing else works.
Here is the code:
#pragma warning disable SKEXP0010
#pragma warning disable SKEXP0110
var modelId = "llama3.2";
var baseUrl = "http://localhost:11434";
var httpClient = new HttpClient
{
    Timeout = TimeSpan.FromMinutes(2)
};
var builder = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: modelId,
        apiKey: null,
        endpoint: new Uri(baseUrl),
        httpClient: httpClient);
var kernel = builder.Build();
var hostName = "AI Assistant";
var hostInstructions =
@"You are a friendly assistant";
var settings = new OpenAIPromptExecutionSettings()
{
    Temperature = 0.1,
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
};
ChatCompletionAgent agent = new()
{
    Instructions = hostInstructions,
    Name = hostName,
    Kernel = kernel,
    Arguments = new(settings),
};
KernelPlugin lightsPlugin = KernelPluginFactory.CreateFromType<LightsPlugin>();
agent.Kernel.Plugins.Add(lightsPlugin);
Console.ForegroundColor = ConsoleColor.Green;
Console.WriteLine("Assistant: Hello, I am your Assistant. How may i help you?");
AgentGroupChat chat = new();
while (true)
{
    Console.ForegroundColor = ConsoleColor.White;
    Console.Write("User: ");
    await InvokeAgentAsync(Console.ReadLine()!);
}
async Task InvokeAgentAsync(string question)
{
    chat.AddChatMessage(new ChatMessageContent(AuthorRole.User, question));
    Console.ForegroundColor = ConsoleColor.Green;
    await foreach (ChatMessageContent content in chat.InvokeAsync(agent))
    {
        Console.WriteLine(content.Content);
    }
}
#pragma warning restore SKEXP0010
#pragma warning restore SKEXP0110
Function calling for local models isn't officially supported when using the OpenAI connector via AddOpenAIChatCompletion. For more details, see this GitHub issue.
Instead, use the Ollama connector with AddOllamaChatCompletion() for local function calling. Here's a sample implementation of Ollama with function calling for reference.
Note: this may require an updated version of Semantic Kernel (as of now, dotnet-1.31.0 is the latest version).
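For illustration, a minimal sketch of what that could look like. It assumes the Microsoft.SemanticKernel.Connectors.Ollama package (still experimental, so it may need an SKEXP0070 suppression), the LightsPlugin from the question, and an Ollama server on the default port 11434:
#pragma warning disable SKEXP0070
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.Ollama;

var builder = Kernel.CreateBuilder()
    .AddOllamaChatCompletion(
        modelId: "llama3.2",
        endpoint: new Uri("http://localhost:11434"));
builder.Plugins.AddFromType<LightsPlugin>("Lights");
var kernel = builder.Build();

// FunctionChoiceBehavior is the newer function-calling API that replaces ToolCallBehavior;
// Auto() advertises the kernel functions and invokes them automatically.
var settings = new OllamaPromptExecutionSettings
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};

var chat = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddUserMessage("Which lights are currently on?");

var reply = await chat.GetChatMessageContentAsync(history, settings, kernel);
Console.WriteLine(reply.Content);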