I'm using the Intel hardware MFT to encode NV12 frames into an H.264 stream, Live555 to stream the encoded frames over RTP on a LAN, and ffplay at the other end to decode and display them. The setup works fine with software encoders (sync or async software MFTs), but when the encoding is done by the Intel hardware MFT, ffplay complains that SPS/PPS are unavailable and just displays a scrambled screen. I have figured out that Intel hardware encoders signal MF_E_TRANSFORM_STREAM_CHANGE after the initial sample is fed and make SPS/PPS available via MF_MT_MPEG_SEQUENCE_HEADER. I'm able to catch that MF_E_TRANSFORM_STREAM_CHANGE and get the sequence header blob.
The problem is that Live555 requires SPS and PPS to be set separately, and I'm confused about how to extract them from the MF_MT_MPEG_SEQUENCE_HEADER blob.
As per my understanding, and after looking through other threads, SPS and PPS start with 00 00 00 01 67 and 00 00 00 01 68 respectively. But I can't find these sequences anywhere in the blob I received from the Intel encoder.
Source: https://github.com/cisco/openh264/issues/756 (start of SPS: 00 00 00 01 67, start of PPS: 00 00 00 01 68)
Sequence header obtained from the Intel MFT:
Sequence header size 50
Sequence Header: 0 0 1 27 64 0 28 ac 2b 40 3c 1 13 f2 e0 22 0 0 3 0 2 0 0 3 0 79 d0 80 f 42 0 3 d0 93 7b df 7 68 70 ca 80 0 0 0 1 28 ee 3c b0 0
vector<byte> sequenceHeaderData;
UINT32 sequenceHeaderDataSize = 0;
DWORD processOutputStatus = 0;

MFT_OUTPUT_DATA_BUFFER _outputDataBuffer;
memset(&_outputDataBuffer, 0, sizeof _outputDataBuffer);
_outputDataBuffer.dwStreamID = outputStreamID;
_outputDataBuffer.dwStatus = 0;
_outputDataBuffer.pEvents = nullptr;
_outputDataBuffer.pSample = nullptr;

HRESULT mftProcessOutput = _pEncoder->ProcessOutput(0, 1, &_outputDataBuffer, &processOutputStatus);
if (MF_E_TRANSFORM_STREAM_CHANGE == mftProcessOutput)
{
    // Some encoders want to renegotiate the output format.
    if (_outputDataBuffer.dwStatus & MFT_OUTPUT_DATA_BUFFER_FORMAT_CHANGE)
    {
        CComPtr<IMFMediaType> pNewOutputMediaType = nullptr;
        HRESULT res = _pEncoder->GetOutputAvailableType(outputStreamID, 1, &pNewOutputMediaType);
        CHECK_HR(res, "Failed to get available output type during stream change");
        res = _pEncoder->SetOutputType(outputStreamID, pNewOutputMediaType, 0); // setting the type again
        CHECK_HR(res, "Failed to set output type during stream change");
        {
            // Read the sequence header blob (SPS/PPS) from the output type.
            CComPtr<IMFMediaType> pCurOutputMediaType = nullptr;
            res = _pEncoder->GetOutputAvailableType(outputStreamID, 1, &pCurOutputMediaType);
            CHECK_HR(res, "Failed to get output type after stream change");
            res = pCurOutputMediaType->GetBlobSize(MF_MT_MPEG_SEQUENCE_HEADER, &sequenceHeaderDataSize);
            if (SUCCEEDED(res) && sequenceHeaderDataSize > 0)
            {
                sequenceHeaderData.resize(sequenceHeaderDataSize);
                pCurOutputMediaType->GetBlob(MF_MT_MPEG_SEQUENCE_HEADER, sequenceHeaderData.data(), sequenceHeaderDataSize, NULL);
                cout << "Sequence header size " << sequenceHeaderDataSize << std::endl;
            }
            else
            {
                cout << "Sequence header is not available" << std::endl;
            }
        }
    }
}
"As per my understanding, and further lookup in other threads, SPS and PPS start with 00 00 00 01 67 and 00 00 00 01 68 respectively."
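You assumed wrong. 67 and 68 are just the header bytes you get when nal_ref_idc is 3; your encoder emits 27 and 28 (nal_ref_idc 1), which carry the same NAL unit types.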
From your sample header:
this is the SPS: 0 0 1 27 64 0 28 ac 2b 40 3c 1 13 f2 e0 22 0 0 3 0 2 0 0 3 0 79 d0 80 f 42 0 3 d0 93 7b df 7 68 70 ca 80
and this is the PPS: 0 0 0 1 28 ee 3c b0 0
The NAL unit type is stored in the low 5 bits of the first byte after the start code. For an SPS that type is 7 (0x27 & 0x1F = 7), so the byte does not have to be 67.
For a PPS the type is 8 (0x28 & 0x1F = 8), so the byte does not have to be 68.
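To make the bit layout concrete, here is a minimal sketch that splits the one-byte NAL header into its three fields. The names NalHeader and decodeNalHeader are made up for this illustration; they are not part of Media Foundation or Live555.

```cpp
#include <cstdint>

// Fields of the one-byte H.264 NAL unit header.
// (Hypothetical helper type, for illustration only.)
struct NalHeader {
    unsigned forbiddenZeroBit; // 1 bit, always 0 in a valid stream
    unsigned nalRefIdc;        // 2 bits, reference importance
    unsigned nalUnitType;      // 5 bits, 7 = SPS, 8 = PPS
};

constexpr NalHeader decodeNalHeader(std::uint8_t b) {
    return NalHeader{ (b >> 7) & 0x01u, (b >> 5) & 0x03u, b & 0x1Fu };
}

// 0x67 and 0x27 are both SPS; they differ only in nal_ref_idc.
static_assert(decodeNalHeader(0x67).nalUnitType == 7, "0x67 is an SPS");
static_assert(decodeNalHeader(0x27).nalUnitType == 7, "0x27 is an SPS");
static_assert(decodeNalHeader(0x67).nalRefIdc == 3, "nal_ref_idc of 0x67");
static_assert(decodeNalHeader(0x27).nalRefIdc == 1, "nal_ref_idc of 0x27");

// Same story for the PPS bytes 0x68 and 0x28.
static_assert(decodeNalHeader(0x68).nalUnitType == 8, "0x68 is a PPS");
static_assert(decodeNalHeader(0x28).nalUnitType == 8, "0x28 is a PPS");
```

So when looking for parameter sets, compare (byte & 0x1F) against 7 or 8 instead of comparing the whole byte against 0x67 or 0x68.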
Note: a start code can also be only 3 bytes long (0 0 1) instead of 4 (0 0 0 1). Your blob uses the 3-byte form before the SPS.
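Putting both points together, a rough sketch of how the blob could be split: scan for 0 0 1, treat any zero bytes in front of the next start code as padding, and pick the units by their 5-bit type. ParameterSets and extractParameterSets are hypothetical names for this example, and it assumes the blob is plain Annex B data as in your dump.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// SPS and PPS as raw NAL units, without start codes.
struct ParameterSets {
    std::vector<std::uint8_t> sps;
    std::vector<std::uint8_t> pps;
};

// Hypothetical helper: returns the first SPS and PPS found in an
// Annex B buffer such as the MF_MT_MPEG_SEQUENCE_HEADER blob.
ParameterSets extractParameterSets(const std::vector<std::uint8_t>& blob) {
    ParameterSets out;
    const std::size_t n = blob.size();

    // Index of the next 00 00 01 at or after 'from', or n if there is none.
    auto findStart = [&](std::size_t from) -> std::size_t {
        for (std::size_t k = from; k + 3 <= n; ++k) {
            if (blob[k] == 0 && blob[k + 1] == 0 && blob[k + 2] == 1) return k;
        }
        return n;
    };

    std::size_t sc = findStart(0);
    while (sc < n) {
        const std::size_t begin = sc + 3;   // first byte of the NAL unit
        assert(begin <= n);                 // findStart guarantees sc + 3 <= n
        const std::size_t next = findStart(begin);

        // A NAL unit never ends in 00, so trailing zeros are either the
        // extra zero of a 4-byte start code or trailing padding.
        std::size_t end = next;
        while (end > begin && blob[end - 1] == 0) --end;

        if (end > begin) {
            const unsigned type = blob[begin] & 0x1Fu;  // low 5 bits
            if (type == 7 && out.sps.empty()) {
                out.sps.assign(blob.data() + begin, blob.data() + end);
            } else if (type == 8 && out.pps.empty()) {
                out.pps.assign(blob.data() + begin, blob.data() + end);
            }
        }
        sc = next;
    }
    return out;
}
```

The 00 00 03 emulation-prevention bytes inside the SPS are left in place on purpose, since they are part of the NAL unit. As far as I know Live555 wants the SPS and PPS without the start codes, which is what this returns, but check that against the Live555 version you are using.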