I'm using MediaCodec to trim a video, providing the startAt and endAt times that bound the trim.
When I use:
extractor.seekTo(startMs * 1000.toLong(), MediaExtractor.SEEK_TO_CLOSEST_SYNC)
I don't get any black frames, but there is a slight offset between the requested startAt and the actual start of the trimmed video. For example, if I set startAt to 5_000, the result starts at 4_500.
To rectify that, I tried:
val seekToTime = startMs * 1_000
extractor.seekTo(seekToTime, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
// Discard samples until reaching the exact start time
while (extractor.sampleTime < seekToTime) {
    extractor.advance()
}
Using this, I get an extremely close trim, but there are black frames at the beginning.
Here is the full code:
suspend fun editVideo(videoFilePath: String, mediaEdit: KPMediaEdit) = withContext(Dispatchers.IO) {
    val startMs: Long = ((mediaEdit.trims.firstOrNull()?.startAt ?: 0.0) * 1_000).toLong()
    val endMs: Long = ((mediaEdit.trims.firstOrNull()?.endAt ?: 0.0) * 1_000).toLong()
    val useAudio = mediaEdit.hasSound
    val useVideo = true
    pDebug("UseAudio = $useAudio - startMs = $startMs - endMs = $endMs", "EditVideo")
    // Input and Output files setup remains the same
    val inputFile = File(videoFilePath)
    val outputFilePath = inputFile.absolutePath.replace(".mp4", "_edited.mp4")
    val outputFile = File(outputFilePath)
    val extractor = MediaExtractor()
    extractor.setDataSource(videoFilePath)
    val trackCount = extractor.trackCount
    val muxer = MediaMuxer(outputFilePath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4)
    // Set up the tracks and retrieve the max buffer size for selected
    // tracks.
    val indexMap = HashMap<Int, Int>(trackCount)
    var bufferSize = -1
    for (i in 0 until trackCount) {
        val format = extractor.getTrackFormat(i)
        val mime = format.getString(MediaFormat.KEY_MIME)
        var selectCurrentTrack = false
        if (mime?.startsWith("audio/") == true && useAudio) {
            selectCurrentTrack = true
        } else if (mime?.startsWith("video/") == true && useVideo) {
            selectCurrentTrack = true
        }
        if (selectCurrentTrack) {
            extractor.selectTrack(i)
            val dstIndex = muxer.addTrack(format)
            indexMap[i] = dstIndex
            if (format.containsKey(MediaFormat.KEY_MAX_INPUT_SIZE)) {
                val newSize = format.getInteger(MediaFormat.KEY_MAX_INPUT_SIZE)
                bufferSize = if (newSize > bufferSize) newSize else bufferSize
            }
        }
    }
    if (bufferSize < 0) {
        bufferSize = DEFAULT_BUFFER_SIZE
    }
    // Set up the orientation and starting time for the extractor.
    val retrieverSrc = MediaMetadataRetriever()
    try {
        retrieverSrc.setDataSource(videoFilePath)
    } catch (e: Exception) {
        pDebug(e.toString(), "EditVideo")
    }
    val degreesString =
        retrieverSrc.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_ROTATION)
    if (!degreesString.isNullOrEmpty()) {
        val degrees = Integer.parseInt(degreesString)
        if (degrees >= 0) {
            muxer.setOrientationHint(degrees)
        }
    }
    if (startMs > 0) {
        val seekToTime = startMs * 1_000
        extractor.seekTo(seekToTime, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
        // Discard samples until reaching the exact start time
        while (extractor.sampleTime < seekToTime) {
            extractor.advance()
        }
        pDebug("Track: SeekTo: ${startMs * 1000}, ${extractor.sampleTime}", "EditVideo")
    }
    //val newStartAt = extractor.sampleTime
    // Copy the samples from MediaExtractor to MediaMuxer. We will loop
    // for copying each sample and stop when we get to the end of the source
    // file or exceed the end time of the trimming.
    val offset = 0
    var trackIndex = -1
    val dstBuf = ByteBuffer.allocate(bufferSize)
    val bufferInfo = MediaCodec.BufferInfo()
    // Calculate the total duration and trimming duration for progress reporting
    val totalDuration = endMs - startMs
    val trimmingDuration = totalDuration.toDouble()
    try {
        muxer.start()
        while (true) {
            bufferInfo.offset = offset
            bufferInfo.size = extractor.readSampleData(dstBuf, offset)
            if (bufferInfo.size < 0) {
                pDebug("Trim - Saw input EOS", "EditVideo")
                bufferInfo.size = 0
                break
            }
            bufferInfo.presentationTimeUs = extractor.sampleTime
            if (endMs > 0 && bufferInfo.presentationTimeUs > endMs * 1000.toLong()) {
                pDebug("Trim - The current sample is over the trim end time", "EditVideo")
                break
            }
            bufferInfo.flags = extractor.sampleFlags
            trackIndex = extractor.sampleTrackIndex
            muxer.writeSampleData(indexMap[trackIndex]!!, dstBuf, bufferInfo)
            extractor.advance()
            // Calculate progress and invoke callback if needed
            val currentProgress = ((bufferInfo.presentationTimeUs / 1000).toInt() - startMs).toDouble()
            val progressPercentage = (currentProgress / trimmingDuration * 100).toInt()
            //progressCallback.invoke(progressPercentage)
        }
        muxer.stop()
        // Call the progress callback with 100 to indicate completion
        //progressCallback.invoke(100)
    } catch (e: IllegalStateException) {
        // Swallow the exception due to malformed source.
        pDebug("Trim - The source video file is malformed", "EditVideo")
        //errorCallback(e)
    } finally {
        extractor.release()
        retrieverSrc.release()
        muxer.release()
    }
    outputFile.renameTo(inputFile)
}
DEFAULT_BUFFER_SIZE is defined as 1 * 1024 * 1024 (1 MiB).
How do I get as close as possible to my desired startAt time? Is it possible to fix the black frames?
According to your code, you are copying encoded video frames from an input file to an output file without decoding them.
Note that a video stream consists of different types of frames. I-frames (sync frames) contain complete images, while P-frames and B-frames are partial: they only describe changes relative to other frames.
If you seek the MediaExtractor to the nearest I-frame using SEEK_TO_CLOSEST_SYNC and then skip that frame because its timestamp is too early, you are throwing away the information required to decode the partial frames that follow. That is where the black frames at the beginning come from.
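To make the dependency concrete, here is a minimal sketch in plain Kotlin. It is a simplified model, not Android API: the `Sample` class and `decodableTimes` function are hypothetical names invented for illustration, and the model assumes a dependent frame is usable only once a sync frame has been seen (real decoders may show black or corrupted output instead of dropping the frame).

```kotlin
// Hypothetical model of an encoded sample: a timestamp plus a sync (I-frame) flag.
data class Sample(val timeUs: Long, val isSync: Boolean)

// Returns the timestamps of samples that can be decoded when copying starts at index [from].
// Simplifying assumption: nothing is decodable until the first sync frame is reached.
fun decodableTimes(samples: List<Sample>, from: Int): List<Long> {
    var haveReference = false
    val result = mutableListOf<Long>()
    for (sample in samples.drop(from)) {
        if (sample.isSync) haveReference = true
        if (haveReference) result += sample.timeUs
    }
    return result
}

fun main() {
    // Sync frames at 4.5 s and 6.5 s, dependent frames in between.
    val samples = listOf(
        Sample(4_500_000, true),
        Sample(5_000_000, false),
        Sample(5_500_000, false),
        Sample(6_000_000, false),
        Sample(6_500_000, true),
        Sample(7_000_000, false),
    )
    // Starting at the sync frame: every sample is decodable.
    println(decodableTimes(samples, 0)) // [4500000, 5000000, 5500000, 6000000, 6500000, 7000000]
    // Skipping the sync frame to start exactly at 5 s: nothing is usable until 6.5 s.
    println(decodableTimes(samples, 1)) // [6500000, 7000000]
}
```

In this model, starting at 5 s without the 4.5 s sync frame leaves the first 1.5 s of output without a valid picture, which corresponds to the black frames you are seeing.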
You basically have a couple of options:
Decode the frames using MediaCodec, ignore those decoded frames that are not within the desired time range, and encode the rest of the frames back using another MediaCodec (this workflow will be slow but accurate)
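The filtering step of the decode and re-encode approach can be sketched in plain Kotlin as follows. This is only an illustration of the selection logic, not a working transcoder: `DecodedFrame` and `framesToEncode` are hypothetical names standing in for decoder output buffers and for whatever loop feeds your encoder.

```kotlin
// Hypothetical stand-in for a decoded frame; only the timestamp matters for the selection.
data class DecodedFrame(val presentationTimeUs: Long)

// Keeps only the decoded frames whose timestamps fall inside [startUs, endUs].
// Frames before startUs are still decoded (they are needed as references),
// but they are not passed on to the encoder.
fun framesToEncode(decoded: List<DecodedFrame>, startUs: Long, endUs: Long): List<DecodedFrame> =
    decoded.filter { it.presentationTimeUs in startUs..endUs }

fun main() {
    // Decoding starts at the sync frame at 4.5 s, earlier than the requested 5 s.
    val decoded = listOf(4_500_000L, 5_000_000L, 5_500_000L, 6_000_000L, 6_500_000L)
        .map { DecodedFrame(it) }
    val kept = framesToEncode(decoded, startUs = 5_000_000L, endUs = 6_000_000L)
    println(kept.map { it.presentationTimeUs }) // [5000000, 5500000, 6000000]
}
```

Because every kept frame is a complete decoded picture, the encoder can start the output exactly at the requested time with a fresh sync frame, at the cost of re-encoding the whole range.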