I wrote a C# program to train a multiclass classification model using Microsoft ML.NET. The training completes successfully, and I have exported the model as an ONNX file using the Microsoft.ML.OnnxConverter package.
I would like to consume the ONNX model from within a C++ program (running on x64-windows) to run inference (the prediction task).
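For reference, the session used in the code below is created roughly like this (a minimal sketch assuming the ONNX Runtime C++ API; the model path and logger name are placeholders):
#include <onnxruntime_cxx_api.h>
#include <iostream>
#include <vector>

// Minimal session setup; "model.onnx" is a placeholder path.
Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "ml_net_model");
Ort::SessionOptions session_options;
session_options.SetIntraOpNumThreads(1);
// On Windows the model path is a wide string (ORTCHAR_T is wchar_t).
Ort::Session session(env, L"model.onnx", session_options);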
The shapes of the inputs and outputs in my model are:
Input:
Features: float 1x7
code_point: float 1x1
Output:
Features.output: float 1x7
code_point.output: float 1x1
PredictedLabel.output: float 1x1
Score.output: float 1x94
Note: code_point is of uint32_t datatype, as noted in the answer below. I am leaving the question as is with this note included.
The code for invoking the inference is:
constexpr size_t input_tensor_size = 8;
std::vector<float> input_tensor_values(input_tensor_size);
// initialize the input_tensor_values
...
// create input tensor object from data values
auto memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
auto input_tensor = Ort::Value::CreateTensor<float>(
    memory_info,
    input_tensor_values.data(), input_tensor_size,
    input_node_dims.data(), input_node_dims.size());
std::vector<const char*> output_node_names = {
    "Features.output", "code_point.output",
    "PredictedLabel.output", "Score.output"
};
// score model & input tensor, get back output tensor
auto output_tensors =
    session.Run(
        Ort::RunOptions{ nullptr },
        input_node_names.data(),
        &input_tensor, 1,
        output_node_names.data(), 1);
I am getting an access violation error upon invoking session.Run() and I am not able to figure out the cause. I suspect it has to do with either the input tensor being flattened into a 1x8 vector before being passed in, or the output count being passed as 1 while output_node_names holds four names. I have tried setting the count to 4 and that doesn't work either.
Could you please suggest the right sequence for initializing the tensors and calling the Run() function for the shape of the input/output given above?
I found out the mistake after @Botje pointed out the issue in a comment under the question.
First of all, there is a small error in the model: code_point is of uint32_t datatype, not float. The correct model is:
Input:
Features: float 1x7
code_point: uint32_t 1x1
Output:
Features.output: float 1x7
code_point.output: uint32_t 1x1
PredictedLabel.output: uint32_t 1x1
Score.output: float 1x94
Secondly, as @Botje pointed out, there are two inputs to the model, viz. Features and code_point.
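This kind of mismatch can also be spotted by querying the model itself. Here is a small sketch, assuming a reasonably recent ONNX Runtime C++ API (GetInputNameAllocated and friends), that dumps each input's name, element type and shape:
Ort::AllocatorWithDefaultOptions allocator;
// Print each input's name, element type (as the ONNXTensorElementDataType enum value) and shape.
for (size_t i = 0; i < session.GetInputCount(); ++i)
{
    auto name = session.GetInputNameAllocated(i, allocator);
    Ort::TypeInfo type_info = session.GetInputTypeInfo(i);
    auto tensor_info = type_info.GetTensorTypeAndShapeInfo();
    std::cout << "input " << i << ": " << name.get() << " type=" << tensor_info.GetElementType();
    for (int64_t d : tensor_info.GetShape())
        std::cout << " " << d;
    std::cout << std::endl;
}
// The outputs can be listed the same way via GetOutputCount/GetOutputNameAllocated/GetOutputTypeInfo.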
I created simple structs to hold the model input and output and pass them around:
struct model_input
{
public:
    std::vector<float> features;
    uint32_t code_point;

public:
    model_input()
    {
        features.resize(7, 0.0f);
        code_point = 0u;
    }
};

struct model_output
{
public:
    std::vector<float> features;
    uint32_t code_point;
    uint32_t PredictedLabel;
    std::vector<float> Score;

public:
    model_output()
    {
        features.resize(7, 0.0f);
        code_point = 0u;
        PredictedLabel = 0u;
        Score.resize(94, 0.0f);
    }
};
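The tensor-creation calls below use inputDims and outputDims, which are not shown in the snippet; they simply hold the tensor shapes listed above. A plausible definition (the names match the snippet, the shapes come from the model summary):
// Shapes taken from the model summary above.
std::vector<std::vector<int64_t>> inputDims = {
    { 1, 7 },    // Features
    { 1, 1 }     // code_point
};
std::vector<std::vector<int64_t>> outputDims = {
    { 1, 7 },    // Features.output
    { 1, 1 },    // code_point.output
    { 1, 1 },    // PredictedLabel.output
    { 1, 94 }    // Score.output
};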
The working sequence for the initialization and inference is as follows:
// copy the test input values into "Features"
model_input mdl_input;
mdl_input.features = {
    0.204244,
    0.0475028,
    -0.00872255,
    -0.0037717,
    -0.0122744,
    0.0262117,
    -0.000971803
};
mdl_input.code_point = 44u;

model_output mdl_output;

Ort::MemoryInfo memoryInfo = Ort::MemoryInfo::CreateCpu(OrtAllocatorType::OrtArenaAllocator, OrtMemType::OrtMemTypeDefault);

std::vector<Ort::Value> inputTensors;
// Features is float:7
inputTensors.push_back(Ort::Value::CreateTensor<float>(
    memoryInfo, mdl_input.features.data(), mdl_input.features.size(),
    inputDims[0].data(), inputDims[0].size()));
// code_point is uint32_t:1
inputTensors.push_back(Ort::Value::CreateTensor<uint32_t>(
    memoryInfo, &mdl_input.code_point, 1,
    inputDims[1].data(), inputDims[1].size()));

std::vector<Ort::Value> outputTensors;
// Features.output is float:7
outputTensors.push_back(Ort::Value::CreateTensor<float>(
    memoryInfo, mdl_output.features.data(), mdl_output.features.size(),
    outputDims[0].data(), outputDims[0].size()));
// code_point.output is uint32_t:1
outputTensors.push_back(Ort::Value::CreateTensor<uint32_t>(
    memoryInfo, &mdl_output.code_point, 1,
    outputDims[1].data(), outputDims[1].size()));
// PredictedLabel is uint32_t:1
outputTensors.push_back(Ort::Value::CreateTensor<uint32_t>(
    memoryInfo, &mdl_output.PredictedLabel, 1,
    outputDims[2].data(), outputDims[2].size()));
// Score is float:94
outputTensors.push_back(Ort::Value::CreateTensor<float>(
    memoryInfo, mdl_output.Score.data(), mdl_output.Score.size(),
    outputDims[3].data(), outputDims[3].size()));

// names are hard-coded!
std::vector<const char*> input_names_ptrs =
{
    "Features",
    "code_point"
};
std::vector<const char*> output_names_ptrs =
{
    "Features.output",
    "code_point.output",
    "PredictedLabel.output",
    "Score.output"
};

session.Run(
    Ort::RunOptions{ nullptr },
    input_names_ptrs.data(),
    inputTensors.data(),
    inputTensors.size(),     // Number of inputs
    output_names_ptrs.data(),
    outputTensors.data(),
    outputTensors.size()     // Number of outputs
);

std::cout << "expected: " << mdl_input.code_point << ", predicted: " << mdl_output.code_point << std::endl;
After fixing this, the program generated the output:
expected: 44, predicted: 44
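Since the output tensors are backed by the model_output members, the per-class scores are also available directly in mdl_output.Score after Run(). For example, a sketch for inspecting the highest-scoring class (how that index maps back to a code_point label depends on the label encoding ML.NET used, so treat the mapping as an assumption):
#include <algorithm>

// Index and value of the best score; the index-to-label mapping is model-specific.
auto best = std::max_element(mdl_output.Score.begin(), mdl_output.Score.end());
auto best_index = std::distance(mdl_output.Score.begin(), best);
std::cout << "top score " << *best << " at class index " << best_index << std::endl;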