video-capture ms-media-foundation desktop-duplication

Video creation with Microsoft Media Foundation and Desktop Duplication API


I'm using DDA (the Desktop Duplication API) to capture desktop frames and send them to a server, where the frames should be turned into a video with MMF (Microsoft Media Foundation). I want to understand what needs to be done in MMF if I use the Source Reader and Sink Writer to render video from the captured frames.

There are two questions:

1) First of all, is there actually any need for a Source Reader with a Media Source if I already receive the video frames from DDA? Can I just send them to the Sink Writer and render the video?

2) As far as I understand, if a Source Reader and Media Source are still needed, the first step is to write my own Media Source that understands the DXGI_FORMAT_B8G8R8A8_UNORM frames captured with DDA. Then I should use the Source Reader and Sink Writer with suitable decoders/encoders and send the media data to the Media Sinks. Could you please explain in more detail what needs to be done in this case?


Solution

  • You don't need a SourceReader (or a custom Media Source) in your case. Implementing one would work, but it isn't necessary.

    Instead, you can feed the buffers captured through Desktop Duplication directly to the SinkWriter, as below:

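    // Error handling (HRESULT checks) is omitted for brevity.
    CComPtr<IMFAttributes> attribs;
    CComPtr<IMFMediaSink>  m_media_sink;
    IMFSinkWriterPtr       m_sink_writer;
    
    MFCreateAttributes(&attribs, 0);
    attribs->SetUINT32(MF_LOW_LATENCY, TRUE);
    attribs->SetUINT32(MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, TRUE);
    
    // Output (encoded) media type. MediaTypeOutput is your own helper that
    // fills in frame size, frame rate, bit rate, and so on.
    IMFMediaTypePtr mediaTypeOut = MediaTypeOutput(fps, bit_rate);
    mediaTypeOut->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264);
    MFCreateFMPEG4MediaSink(stream, mediaTypeOut, nullptr, &m_media_sink);
    MFCreateSinkWriterFromMediaSink(m_media_sink, attribs, &m_sink_writer);
    
    // Input (uncompressed) media type. mediaTypeIn is an IMFMediaType you
    // created earlier with the same frame size and frame rate.
    mediaTypeIn->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB32);
    m_sink_writer->SetInputMediaType(m_stream_index, mediaTypeIn, nullptr);
    m_sink_writer->BeginWriting();
    
    // For every captured frame:
    IMFSamplePtr sample;
    MFCreateSample(&sample);
    // m_buffer is an IMFMediaBuffer holding the frame in B8G8R8A8 byte order,
    // which is the memory layout MFVideoFormat_RGB32 describes.
    sample->AddBuffer(m_buffer);
    
    // Both values are in 100-nanosecond units.
    sample->SetSampleTime(m_time_stamp);
    sample->SetSampleDuration(m_frame_duration);
    m_sink_writer->WriteSample(m_stream_index, sample);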
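
The time stamp and duration passed to SetSampleTime and SetSampleDuration are expressed in 100-nanosecond units. As a rough sketch of how those two values could be derived for a constant frame rate, the helpers below compute them from a frame index. The names FrameTimeHns and FrameDurationHns are made up for this illustration and are not part of Media Foundation. If you stamp frames with the time they were actually captured (DDA only delivers a frame when the desktop changes), you would use a clock instead.

```cpp
#include <cassert>
#include <cstdint>

// Media Foundation sample times and durations use 100-nanosecond units.
constexpr int64_t kHnsPerSecond = 10'000'000;

// Hypothetical helper: presentation time of frame `frame_index` at a constant
// frame rate given as a rational fps_num / fps_den (for example 30000 / 1001).
inline int64_t FrameTimeHns(int64_t frame_index, uint32_t fps_num, uint32_t fps_den) {
    assert(fps_num > 0 && fps_den > 0);
    return frame_index * kHnsPerSecond * fps_den / fps_num;
}

// Hypothetical helper: duration of a frame, computed as the gap to the next
// frame's start so that integer rounding errors do not accumulate over time.
inline int64_t FrameDurationHns(int64_t frame_index, uint32_t fps_num, uint32_t fps_den) {
    return FrameTimeHns(frame_index + 1, fps_num, fps_den) -
           FrameTimeHns(frame_index, fps_num, fps_den);
}
```

With these, m_time_stamp would be FrameTimeHns(n, 30, 1) and m_frame_duration would be FrameDurationHns(n, 30, 1) for the n-th frame of a 30 FPS stream.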
    

    Here is a working sample based on SinkWriter. It supports both network and file sinks. It captures the desktop through GDI rather than DDA, but the DDA version is almost the same, and DDA should give you better performance.
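
One detail that tends to differ when the frame comes from DDA: if you copy the frame to a staging texture and map it for CPU access, the mapped data has a row pitch that can be larger than width * 4 bytes. A minimal sketch of repacking such a frame into a tightly packed BGRA buffer is shown below. PackBgraRows is a hypothetical helper name, and the sketch assumes you read the frame back on the CPU at all (you can also keep it on the GPU).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copies `height` rows of 32-bit BGRA pixels from a
// surface whose rows are `row_pitch` bytes apart into a buffer whose rows are
// exactly width * 4 bytes apart (no padding).
std::vector<uint8_t> PackBgraRows(const uint8_t* src, size_t row_pitch,
                                  uint32_t width, uint32_t height) {
    const size_t packed_pitch = static_cast<size_t>(width) * 4;
    assert(src != nullptr && row_pitch >= packed_pitch);

    std::vector<uint8_t> dst(packed_pitch * height);
    for (uint32_t y = 0; y < height; ++y) {
        // Copy only the visible pixels of each row and skip the padding.
        std::memcpy(dst.data() + y * packed_pitch, src + y * row_pitch, packed_pitch);
    }
    return dst;
}
```

The resulting bytes can then be copied into the IMFMediaBuffer that is attached to the sample.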

    I have also uploaded one more sample here, which is based on Desktop Duplication, uses IMFTransform directly instead of SinkWriter, and streams the output video as an RTP stream using Live555. I'm able to achieve up to 100 FPS with this approach.

    If you follow the SinkWriter approach, you don't have to worry about color conversion, because SinkWriter takes care of it under the hood. With IMFTransform you have to handle the color conversion yourself, but you get fine-grained control over the encoder.
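
To make the color conversion step concrete, here is a minimal CPU sketch that converts a tightly packed BGRA frame to NV12, the input format H.264 encoder MFTs commonly accept. It assumes even width and height and uses the integer BT.601 studio-range coefficients, which is an assumption of this sketch (HD content may call for BT.709 instead). BgraToNv12 and the To* helpers are hypothetical names. For real-time capture, doing this on the GPU, as in reference link 5 below, is likely the better choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Integer BT.601 studio-range RGB -> YUV. The chroma formulas fold the +128
// offset into the sum before shifting so the shifted value is never negative.
inline uint8_t ToY(int r, int g, int b) {
    return static_cast<uint8_t>(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
}
inline uint8_t ToU(int r, int g, int b) {
    return static_cast<uint8_t>((-38 * r - 74 * g + 112 * b + 128 + (128 << 8)) >> 8);
}
inline uint8_t ToV(int r, int g, int b) {
    return static_cast<uint8_t>((112 * r - 94 * g - 18 * b + 128 + (128 << 8)) >> 8);
}

// Hypothetical helper: converts tightly packed BGRA (4 bytes per pixel) to
// NV12 (a full-resolution Y plane followed by an interleaved, half-resolution
// UV plane).
std::vector<uint8_t> BgraToNv12(const uint8_t* bgra, uint32_t width, uint32_t height) {
    assert(bgra != nullptr && width % 2 == 0 && height % 2 == 0);
    const size_t y_size = static_cast<size_t>(width) * height;
    std::vector<uint8_t> nv12(y_size + y_size / 2);
    uint8_t* y_plane = nv12.data();
    uint8_t* uv_plane = y_plane + y_size;

    // Luma: one Y value per pixel.
    for (size_t i = 0; i < y_size; ++i) {
        const uint8_t* p = bgra + i * 4;  // memory order is B, G, R, A
        y_plane[i] = ToY(p[2], p[1], p[0]);
    }

    // Chroma: one U,V pair per 2x2 block, computed from the block's average color.
    for (uint32_t by = 0; by < height; by += 2) {
        for (uint32_t bx = 0; bx < width; bx += 2) {
            int r = 0, g = 0, b = 0;
            for (uint32_t dy = 0; dy < 2; ++dy) {
                for (uint32_t dx = 0; dx < 2; ++dx) {
                    const uint8_t* p =
                        bgra + (static_cast<size_t>(by + dy) * width + bx + dx) * 4;
                    b += p[0]; g += p[1]; r += p[2];
                }
            }
            r = (r + 2) / 4; g = (g + 2) / 4; b = (b + 2) / 4;
            uint8_t* uv = uv_plane + static_cast<size_t>(by / 2) * width + bx;
            uv[0] = ToU(r, g, b);
            uv[1] = ToV(r, g, b);
        }
    }
    return nv12;
}
```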

    Here are some more reference links for you.

    1. https://github.com/ashumeow/webrtc4all/blob/master/gotham/MFT_WebRTC4All/test/test_encoder.cc
    2. DXGI Desktop Duplication: encoding frames to send them over the network
    3. Getting green screen in ffplay: Streaming desktop (DirectX surface) as H264 video over RTP stream using Live555
    4. Intel graphics hardware H264 MFT ProcessInput call fails after feeding few input samples, the same works fine with Nvidia hardware MFT
    5. Color conversion from DXGI_FORMAT_B8G8R8A8_UNORM to NV12 in GPU using DirectX11 pixel shaders
    6. GOP setting is not honored by Intel H264 hardware MFT
    7. Encoding a D3D Surface obtained through Desktop Duplication using Media Foundation