android kotlin android-mediacodec

Using MediaCodec to trim causes black frames when advancing encoder to reach newStartAt time


I'm using MediaCodec to trim a video, given a startAt and an endAt time that bound the trim. When I use extractor.seekTo(startMs * 1000.toLong(), MediaExtractor.SEEK_TO_CLOSEST_SYNC), I don't get any black frames, but there is a small gap between the requested startAt and the actual start of the result. For example, if I set startAt to 5_000, the trimmed video actually starts at 4_500.

Now to rectify that I tried using:

val seekToTime = startMs * 1_000
extractor.seekTo(seekToTime, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
// Discard samples until reaching the exact start time
while (extractor.sampleTime < seekToTime) {
   extractor.advance()
}

Using this, I get an extremely close trim, but there are black frames at the beginning.

Here is the full code below:

suspend fun editVideo(videoFilePath: String, mediaEdit: KPMediaEdit) = withContext(Dispatchers.IO) {
        val startMs: Long = ((mediaEdit.trims.firstOrNull()?.startAt ?: 0.0) * 1_000).toLong()
        val endMs: Long = ((mediaEdit.trims.firstOrNull()?.endAt ?: 0.0) * 1_000).toLong()

        val useAudio = mediaEdit.hasSound
        val useVideo = true

        pDebug("UseAudio = $useAudio - startMs = $startMs - endMs = $endMs", "EditVideo")

        // Input and Output files setup remains the same
        val inputFile = File(videoFilePath)
        val outputFilePath = inputFile.absolutePath.replace(".mp4", "_edited.mp4")
        val outputFile = File(outputFilePath)

        val extractor = MediaExtractor()
        extractor.setDataSource(videoFilePath)
        val trackCount = extractor.trackCount
        val muxer = MediaMuxer(outputFilePath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4)

        // Set up the tracks and retrieve the max buffer size for selected
        // tracks.
        val indexMap = HashMap<Int, Int>(trackCount)
        var bufferSize = -1
        for (i in 0 until trackCount) {
            val format = extractor.getTrackFormat(i)
            val mime = format.getString(MediaFormat.KEY_MIME)
            var selectCurrentTrack = false
            if (mime?.startsWith("audio/") == true && useAudio) {
                selectCurrentTrack = true
            } else if (mime?.startsWith("video/") == true && useVideo) {
                selectCurrentTrack = true
            }
            if (selectCurrentTrack) {
                extractor.selectTrack(i)
                val dstIndex = muxer.addTrack(format)
                indexMap[i] = dstIndex
                if (format.containsKey(MediaFormat.KEY_MAX_INPUT_SIZE)) {
                    val newSize = format.getInteger(MediaFormat.KEY_MAX_INPUT_SIZE)
                    bufferSize = if (newSize > bufferSize) newSize else bufferSize
                }
            }
        }
        if (bufferSize < 0) {
            bufferSize = DEFAULT_BUFFER_SIZE
        }
        // Set up the orientation and starting time for the extractor.
        val retrieverSrc = MediaMetadataRetriever()

        try {
            retrieverSrc.setDataSource(videoFilePath)
        } catch (e: Exception) {
            pDebug(e.toString(), "EditVideo")
        }
        val degreesString =
            retrieverSrc.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_ROTATION)
        if (!degreesString.isNullOrEmpty()) {
            val degrees = Integer.parseInt(degreesString)
            if (degrees >= 0) {
                muxer.setOrientationHint(degrees)
            }
        }

        if (startMs > 0) {
            val seekToTime = startMs * 1_000
            extractor.seekTo(seekToTime, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
            // Discard samples until reaching the exact start time
            while (extractor.sampleTime < seekToTime) {
                extractor.advance()
            }
            pDebug("Track: SeekTo: ${startMs * 1000}, ${extractor.sampleTime}", "EditVideo")
        }
        //val newStartAt = extractor.sampleTime
        // Copy the samples from MediaExtractor to MediaMuxer. We will loop
        // for copying each sample and stop when we get to the end of the source
        // file or exceed the end time of the trimming.
        val offset = 0
        var trackIndex = -1
        val dstBuf = ByteBuffer.allocate(bufferSize)
        val bufferInfo = MediaCodec.BufferInfo()

        // Calculate the total duration and trimming duration for progress reporting
        val totalDuration = endMs - startMs
        val trimmingDuration = totalDuration.toDouble()

        try {
            muxer.start()

            while (true) {
                bufferInfo.offset = offset
                bufferInfo.size = extractor.readSampleData(dstBuf, offset)
                if (bufferInfo.size < 0) {
                    pDebug("Trim - Saw input EOS", "EditVideo")
                    bufferInfo.size = 0
                    break
                }
                bufferInfo.presentationTimeUs = extractor.sampleTime
                if (endMs > 0 && bufferInfo.presentationTimeUs > endMs * 1000.toLong()) {
                    pDebug("Trim - The current sample is over the trim end time", "EditVideo")
                    break
                }
                bufferInfo.flags = extractor.sampleFlags
                trackIndex = extractor.sampleTrackIndex
                muxer.writeSampleData(indexMap[trackIndex]!!, dstBuf, bufferInfo)
                extractor.advance()

                // Calculate progress and invoke callback if needed
                val currentProgress = ((bufferInfo.presentationTimeUs / 1000).toInt() - startMs).toDouble()
                val progressPercentage = (currentProgress / trimmingDuration * 100).toInt()
                //progressCallback.invoke(progressPercentage)
            }
            muxer.stop()

            // Call the progress callback with 100 to indicate completion
            //progressCallback.invoke(100)
        } catch (e: IllegalStateException) {
            // Swallow the exception due to malformed source.
            pDebug("Trim - The source video file is malformed", "EditVideo")
            //errorCallback(e)
        } finally {
            extractor.release()
            retrieverSrc.release()
            muxer.release()
        }

        outputFile.renameTo(inputFile)
    }

const val DEFAULT_BUFFER_SIZE = 1 * 1024 * 1024

How do I get as close as possible to my desired startAt time? Is it possible to fix the black frames?


Solution

  • According to your code, you are copying encoded video frames from an input file to an output file.

    Note that a video stream consists of different types of frames. I-frames contain complete images, while P-frames and B-frames are partial.

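    If you seek the MediaExtractor to the nearest I-frame using SEEK_TO_CLOSEST_SYNC and then skip that I-frame because its timestamp is too early, you throw away the information required to decode the partial frames that follow it. The decoder cannot reconstruct those frames until the next I-frame arrives, which is most likely why the start of the trimmed video appears black.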
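
To make the effect concrete, here is a minimal sketch in plain Kotlin (no Android classes) that models a stream as a list of frames and mimics the advance() loop from the question. The Frame class and decodableAfterSkipping function are hypothetical names introduced only for this illustration, and the model is deliberately simplified: it ignores B-frame reordering and assumes every non-sync frame depends on the most recent sync frame.

```kotlin
// Simplified model of an encoded video stream: only sync frames (I-frames)
// can be decoded on their own; every other frame needs the frames before it,
// back to the most recent sync frame.
data class Frame(val timeUs: Long, val isSync: Boolean)

// Returns the frames a decoder could still reconstruct if the copy starts at
// the first frame whose timestamp is >= startUs (what the advance() loop does).
fun decodableAfterSkipping(frames: List<Frame>, startUs: Long): List<Frame> {
    // Samples discarded by the advance() loop.
    val copied = frames.dropWhile { it.timeUs < startUs }
    // Copied frames before the first copied sync frame have lost their reference.
    return copied.dropWhile { !it.isSync }
}
```

With made-up data of one frame every 0.5 s and a sync frame every 2 s, a requested start of 5 s copies frames from 5 s onward, but the first frame that can be decoded is the sync frame at 6 s. The copied frames before it are the ones that show up as black.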

    You basically have two options:

    1. Accept that the result is not 100% accurate (in exchange, the process is fast).
    2. Decode the video frames starting from the closest I-frame using one MediaCodec, discard the decoded frames that fall outside the desired time range, and re-encode the remaining frames using another MediaCodec. This workflow is slow but accurate.
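
The filtering step of option 2 can be sketched as a small pure function. This is not a full transcoding pipeline: FrameAction, frameAction and rebasedPtsUs are hypothetical names, and the sketch assumes the decoder delivers frames in presentation order and that endUs <= 0 means "no end limit", as in the question's code. In a real pipeline the decision would be applied to the presentationTimeUs of each decoder output buffer before handing the frame to the encoder.

```kotlin
// What to do with one decoded frame, based on its presentation timestamp.
enum class FrameAction { DROP, ENCODE, STOP }

// Decides whether a decoded frame is before the trim range (DROP),
// inside it (ENCODE), or past its end (STOP).
// Assumption: endUs <= 0 means the trim has no end limit.
fun frameAction(ptsUs: Long, startUs: Long, endUs: Long): FrameAction = when {
    ptsUs < startUs -> FrameAction.DROP
    endUs > 0 && ptsUs > endUs -> FrameAction.STOP
    else -> FrameAction.ENCODE
}

// Shifts a kept frame's timestamp so that the trimmed video starts at 0.
fun rebasedPtsUs(ptsUs: Long, startUs: Long): Long = ptsUs - startUs
```

Because the frames between the I-frame and startAt are decoded but never encoded, the first encoded frame is a complete picture at the requested time, and the encoder emits it as a new sync frame.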