c++, opencv, mask, image-segmentation, tensorflow-lite

Convert TensorFlow Lite output into an image in C++


I have created a model for image segmentation in Python with the help of TensorFlow, which provides me with a mask as output.

In Python I can load the model and generate an output in just two lines (as input I use greyscale images of size 128x128, passed as a batch of shape [1, 128, 128, 1]).

model.load_weights("path/to/model")
test_preds = model.predict(X_test)

model output image

The next step is to carry out a binarisation, which gives me a binary mask containing only the values 0 and 1.

preds_test_thresh = (test_preds >= 0.5).astype(np.uint8)
test_img = preds_test_thresh[1, :, :, 0]

output image after thresholding

My aim now is to use this model in C++. To do this, I first converted my model into a TF-Lite model and would now like to load this into C++ and generate an output.

My approach to that is the following:

    // Create model from file
    auto model = tflite::FlatBufferModel::BuildFromFile("path/to/model");
    if (model == nullptr)
        wxLogMessage("Model not loaded");
    else
        wxLogMessage("Model loaded");

    // Create an Interpreter with an InterpreterBuilder.
    std::unique_ptr<tflite::Interpreter> interpreter;
    tflite::ops::builtin::BuiltinOpResolver resolver;
    tflite::InterpreterBuilder(*model, resolver)(&interpreter);
    if (!interpreter)
        wxLogMessage("Interpreter not loaded");
    

    if (interpreter->AllocateTensors() != kTfLiteOk)
        wxLogMessage("Allocation failed");
    else
        wxLogMessage("Allocation success");

    // load image; get blue channel; resize to 128x128
    std::string image_path = cv::samples::findFile("path/to/image");
    cv::Mat img = cv::imread(image_path);

    wxLogMessage(wxString::Format("%d x %d x %d", img.size[1], img.size[0], img.channels()));

    cv::Mat bgr[3];
    cv::split(img, bgr);

    cv::Mat channelImg = bgr[0];

    cv::Mat inputImg;

    channelImg.convertTo(inputImg, CV_32FC1, 1.0 / 255.0);

    cv::resize(inputImg, inputImg, cv::Size(128, 128));

    wxLogMessage(wxString::Format("%d x %d x %d", inputImg.size[1], inputImg.size[0], inputImg.channels()));

    // Fill the input tensor (128 * 128 float values); memcpy is fine here
    // because the freshly resized Mat is stored contiguously in memory
    float* input = interpreter->typed_input_tensor<float>(0);
    memcpy(input, inputImg.data, 128 * 128 * sizeof(float));

    // invoke interpreter
    if (interpreter->Invoke() != kTfLiteOk) {
        wxLogMessage("Failed to invoke");
    }

    // get output
    float* output = interpreter->typed_output_tensor<float>(0);

I took the ideas for the C++ code from various examples, and I do obtain float values. However, I do not get an image as output; I have already tried several ways of building a cv::Mat from the output, but none of them gave a correct output image.

So my question is: how do I generate an image from the model output (as in the Python code above) that I can continue working with? Or do I have to change something in the C++ code above to get a usable output?


Solution

  • I solved the problem myself with the following code:

    // File path to the TensorFlow Lite model (.tflite)
    const char* model_path = "pat";
    
    // Load the TensorFlow Lite model
    auto model = tflite::FlatBufferModel::BuildFromFile(model_path);
    std::unique_ptr<tflite::Interpreter> interpreter;
    tflite::ops::builtin::BuiltinOpResolver resolver;
    tflite::InterpreterBuilder builder(*model, resolver);
    builder(&interpreter);
    
    // Check whether the interpreter has been successfully created
    if (!interpreter)
        wxLogMessage("Interpreter not loaded");
    
    // Allocate memory for the model's tensors
    interpreter->AllocateTensors();
    
    // Resize image to fit the model input [128x128x1]
    const int image_width = 128;
    const int image_height = 128;
    
    cv::Mat input_image = cv::imread("path/to/image");
    
    if (input_image.empty())
    {
        wxLogMessage(wxString::Format("Could not read the image: %s", "path/to/image"));
    }
    
    // Keep only the blue channel as the greyscale input
    cv::Mat bgr[3];
    cv::split(input_image, bgr);
    input_image = bgr[0];
    
    cv::resize(input_image, input_image, cv::Size(image_width, image_height));
    
    cv::imshow("Display window", input_image);
    cv::waitKey(0);
    
    // Pointer to the input tensor of the interpreter
    float* input_tensor_data = interpreter->typed_input_tensor<float>(0);
    
    // Copy the image pixels into the input tensor
    for (int y = 0; y < image_height; ++y) {
        for (int x = 0; x < image_width; ++x) {
            input_tensor_data[y * image_width + x] = static_cast<float>(input_image.at<uchar>(y, x));
        }
    }
    
    // Run the model
    interpreter->Invoke();
    
    // Inspect the dimensions of each output tensor (useful for checking the model)
    int output_tensor_count = interpreter->outputs().size();
    for (int i = 0; i < output_tensor_count; ++i) {
        int output_tensor_index = interpreter->outputs()[i];
        TfLiteIntArray* output_dims = interpreter->tensor(output_tensor_index)->dims;
        wxLogMessage(wxString::Format("Output %d has %d dimensions", i, output_dims->size));
    }
    
    // Use the first model output
    int output_tensor_index = 0;
    TfLiteTensor* output_tensor = interpreter->tensor(interpreter->outputs()[output_tensor_index]);
    
    // output image with size of [128x128x1]
    const int output_image_width = 128;
    const int output_image_height = 128;
    
    // Pointer to the output tensor data
    float* output_data = interpreter->typed_output_tensor<float>(output_tensor_index);
    
    cv::Mat output_image(output_image_height, output_image_width, CV_8UC1);
    
    // Scale the [0, 1] model output to [0, 255] grey values
    for (int y = 0; y < output_image_height; ++y) {
        for (int x = 0; x < output_image_width; ++x) {
            output_image.at<uchar>(y, x) = static_cast<uchar>(output_data[y * output_image_width + x] * 255.0);
        }
    }
    
    cv::imshow("Display window", output_image);
    cv::waitKey(0);
    

    The two main problems were, on the one hand, filling the input tensor correctly and, on the other hand, reading out the output tensor and converting it back into an image.

    Fill input tensor:

    // Copy the image pixels into the input tensor
    for (int y = 0; y < image_height; ++y) {
        for (int x = 0; x < image_width; ++x) {
            input_tensor_data[y * image_width + x] = static_cast<float>(input_image.at<uchar>(y, x));
        }
    }
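
    A shorter alternative for filling the input is to let OpenCV do the uchar-to-float conversion and then copy the whole buffer in one call (only a sketch, assuming the resized Mat is continuous in memory and the model expects unscaled pixel values in [0, 255], exactly as in the loop above):

    // Sketch: convert the 8-bit grey image to float and copy it into the input tensor in one go
    cv::Mat input_float;
    input_image.convertTo(input_float, CV_32FC1);   // uchar -> float, no scaling
    memcpy(interpreter->typed_input_tensor<float>(0),
           input_float.ptr<float>(0),
           image_width * image_height * sizeof(float));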
    

    Create the output image and fill it pixel by pixel from the output tensor:

    cv::Mat output_image(output_image_height, output_image_width, CV_8UC1);
    
    for (int y = 0; y < output_image_height; ++y) {
        for (int x = 0; x < output_image_width; ++x) {
            output_image.at<uchar>(y, x) = static_cast<uchar>(output_data[y * output_image_width + x] * 255.0);
        }
    }
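
    Alternatively, the pixelwise copy can be replaced by wrapping the output buffer in a float Mat header (no data is copied) and letting convertTo do the [0, 1] to [0, 255] scaling; this is only a sketch under the same assumption about the 128x128 output shape:

    // Sketch: view the model output as a 128x128 float Mat and scale it to 8-bit grey values
    cv::Mat output_float(output_image_height, output_image_width, CV_32FC1, output_data);
    cv::Mat output_image_alt;
    output_float.convertTo(output_image_alt, CV_8UC1, 255.0);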
    

    My initial error was in reading out the output tensor: my greyscale input image is in the range [0, 255], whereas the model outputs float values in the range [0, 1]. Because of this, the pixels have to be multiplied by 255 when filling the output image, which I had not taken into account.
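
    If, as in the Python snippet from the question, a hard binary mask is wanted instead of the scaled probabilities, the [0, 1] output can also be thresholded at 0.5 before the conversion (again only a sketch, reusing the 128x128 float output from above):

    // Sketch: threshold the probabilities at 0.5 to get a 0/255 mask,
    // mirroring the Python binarisation step (test_preds >= 0.5)
    cv::Mat probabilities(output_image_height, output_image_width, CV_32FC1, output_data);
    cv::Mat mask;
    cv::threshold(probabilities, mask, 0.5, 255.0, cv::THRESH_BINARY);
    mask.convertTo(mask, CV_8UC1);   // binary mask as an 8-bit image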