How do I create a TensorFloat16Bit when manually doing a tensorization of the data?
We tensorized our data based on this Microsoft example, where we convert the 0-255 pixel values to the 0-1 range and change the RGBA channel order.
...
std::vector<int64_t> shape = { 1, channels, height, width };
float* pCPUTensor = nullptr;
uint32_t uCapacity = 0;
// The image buffer stores the channels interleaved as BGRA-BGRA-BGRA-...;
// we transform it to planar order: BBBB...GGGG...RRRR... (alpha dropped).
TensorFloat tf = TensorFloat::Create(shape);
com_ptr<ITensorNative> itn = tf.as<ITensorNative>();
CHECK_HRESULT(itn->GetBuffer(reinterpret_cast<BYTE**>(&pCPUTensor), &uCapacity));
// 2. Transform the data in the buffer to a vector of float.
if (BitmapPixelFormat::Bgra8 == pixelFormat)
{
    for (UINT32 i = 0; i < size; i += 4)
    {
        // Suppose the model expects a BGR image:
        // index 0 is B, 1 is G, 2 is R, 3 is alpha (dropped).
        UINT32 pixelInd = i / 4;
        pCPUTensor[pixelInd] = (float)pData[i];
        pCPUTensor[(height * width) + pixelInd] = (float)pData[i + 1];
        pCPUTensor[(height * width * 2) + pixelInd] = (float)pData[i + 2];
    }
}
I just converted our .onnx model to float16 to verify whether that would improve inference performance when the available hardware supports float16. However, the binding is failing, and the suggestion here is to pass a TensorFloat16Bit instead.
So if I swap the TensorFloat for TensorFloat16Bit, I get an access violation exception at pCPUTensor[(height * width * 2) + pixelInd] = (float)pData[i + 2]; because the buffer behind pCPUTensor is now half the size it was: each element is a 2-byte float16 rather than a 4-byte float. It seems like I should be reinterpret_cast-ing to uint16_t** or something along those lines, so that pCPUTensor is indexed by the same element count as when it was a TensorFloat, but then I get further errors that it can only be uint8_t** or BYTE**.
Any ideas on how I can modify this code so I can get a custom TensorFloat16Bit?
Try the factory methods on TensorFloat16Bit.
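For example, a minimal sketch (assuming the planar BGR float values from your loop have been collected into a std::vector<float> named floatData; as far as I know this factory's projection accepts float data and the tensor stores it as float16):

#include <winrt/Windows.AI.MachineLearning.h>
#include <vector>

using namespace winrt::Windows::AI::MachineLearning;

// channels/height/width as in your tensorization code.
std::vector<int64_t> shape = { 1, channels, height, width };

// floatData holds the planar BGR values your loop produced,
// i.e. channels * height * width floats.
std::vector<float> floatData(channels * height * width);
// ... fill floatData the same way pCPUTensor was filled ...

// The factory takes float values; the runtime stores them as float16.
TensorFloat16Bit tf16 = TensorFloat16Bit::CreateFromArray(shape, floatData);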
However, you will need to convert your data to float16:
https://stackoverflow.com/a/60047308/11998382
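Alternatively, if you want to keep writing straight into the tensor's buffer as in your current loop, here is a rough sketch of that approach. I'm assuming DirectXMath's XMConvertFloatToHalf for the float-to-half conversion (the linked answer discusses other options) and pData/size/channels/height/width as in your question; GetBuffer still hands out BYTE**, so you reinterpret the pointer afterwards:

#include <winrt/Windows.AI.MachineLearning.h>
#include <windows.ai.machinelearning.native.h> // ITensorNative
#include <DirectXPackedVector.h>               // XMConvertFloatToHalf

using namespace winrt::Windows::AI::MachineLearning;
using DirectX::PackedVector::XMConvertFloatToHalf;

std::vector<int64_t> shape = { 1, channels, height, width };
TensorFloat16Bit tf16 = TensorFloat16Bit::Create(shape);
winrt::com_ptr<ITensorNative> itn = tf16.as<ITensorNative>();

BYTE* pBuffer = nullptr;
uint32_t uCapacity = 0; // capacity in bytes: element count * sizeof(uint16_t)
CHECK_HRESULT(itn->GetBuffer(&pBuffer, &uCapacity));

// Each element is now a 2-byte IEEE half, so index through a uint16_t*.
uint16_t* pCPUTensor = reinterpret_cast<uint16_t*>(pBuffer);

for (UINT32 i = 0; i < size; i += 4)
{
    UINT32 pixelInd = i / 4;
    pCPUTensor[pixelInd] = XMConvertFloatToHalf((float)pData[i]);
    pCPUTensor[(height * width) + pixelInd] = XMConvertFloatToHalf((float)pData[i + 1]);
    pCPUTensor[(height * width * 2) + pixelInd] = XMConvertFloatToHalf((float)pData[i + 2]);
}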
Also, I might recommend you instead do the conversion within the ONNX model itself (for example, keep a float32 input and insert a Cast to float16 inside the graph); that way your tensorization code can keep producing a TensorFloat.