Essentially, I wish to concatenate a series of videos using MP4Box. When I attempt to do so, I receive the following error:
No suitable destination track found - creating new one (type soun)
0.500 secs Interleaving
I can circumvent the issue, at least temporarily, by adding a -force-cat parameter to the MP4Box command. However, this creates issues with the alignment of audio and video and produces the following warning:
Concatenating track ID 1 even though sample descriptions do not match
As far as I can tell, this is caused by differing parameters between the two video types. The ffprobe output of each video type is shown below in the hope that it sheds some light on the issue.
VIDEO TYPE 1 FFPROBE OUTPUT:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '0.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.29.100
Duration: 00:00:02.25, start: 0.000000, bitrate: 851 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080, 816 kb/s, 4 fps, 4 tbr, 16384 tbn, 8 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: mp3 (mp4a / 0x6134706D), 24000 Hz, mono, fltp, 32 kb/s (default)
Metadata:
handler_name : SoundHandler
VIDEO TYPE 2 FFPROBE OUTPUT:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'static.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.78.100
Duration: 00:00:01.00, start: 0.000000, bitrate: 662 kb/s
Stream #0:0(eng): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 654 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
handler_name : VideoHandler
Can anyone identify what the incongruity between video types is and how I can resolve it? Thanks.
File attributes must match, but they are different. See a list of attributes that must match for proper concatenation.
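Important incongruities, going by the ffprobe output above: 0.mp4 has an audio stream (mp3, 24000 Hz, mono) while static.mp4 has none; the video timescales differ (16384 tbn vs. 30k tbn); the frame rates differ (4 fps vs. 29.97 fps); and the H.264 profiles differ (High vs. Constrained Baseline).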
Stream copying with the concat demuxer is good if you need to add a short segment to a long video. It leaves the long video untouched, so it preserves the quality and is fast. The downside is that you have to make sure all of the attributes match, which can be difficult if you are unfamiliar with this topic.
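To check which attributes differ before stream copying, something like the following may help. It is only a sketch: the helper name show_attrs is made up for illustration, and the list of fields is an assumption about what matters most here, not an exhaustive list of what must match.

```shell
# Hypothetical helper: print the stream attributes that most often
# break stream-copy concatenation, one key=value pair per line.
show_attrs() {
  ffprobe -v error \
    -show_entries stream=codec_type,codec_name,profile,width,height,pix_fmt,r_frame_rate,time_base,sample_rate,channels \
    -of default=noprint_wrappers=1 "$1"
}

# Only run when ffprobe and both example files are actually present.
if command -v ffprobe >/dev/null 2>&1 && [ -f 0.mp4 ] && [ -f static.mp4 ]; then
  echo "== 0.mp4 ==";      show_attrs 0.mp4
  echo "== static.mp4 =="; show_attrs static.mp4
fi
```

Comparing the two listings side by side should make a missing audio stream or a different time_base easy to spot.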
Example to make static.mp4 like 0.mp4, using the anullsrc filter to generate blank/silent/dummy/filler audio.
Re-encode:
ffmpeg -i static.mp4 -f lavfi -i anullsrc=channel_layout=mono:sample_rate=24000 -c:v libx264 -c:a libmp3lame -video_track_timescale 16384 -shortest 1.mp4
Make input.txt containing:
file '0.mp4'
file '1.mp4'
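With more than a couple of clips, the list can be generated instead of typed by hand. A minimal sketch, assuming the clips are named as in this example and contain no single quotes in their filenames:

```shell
# Write one "file" directive per clip, in playback order.
for f in 0.mp4 1.mp4; do
  printf "file '%s'\n" "$f"
done > input.txt
```

The order of the names in the loop is the order in which the clips are joined.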
Concatenate with the concat demuxer:
ffmpeg -f concat -i input.txt -c copy output.mp4
The other approach is to re-encode everything through a filtergraph that conforms all of the inputs to a common set of parameters (frame rate, width, height, etc.) before joining them. This is most useful if your inputs are always varied or arbitrary, and it does everything in one command. The downside is that it re-encodes everything and might be slow.
See "How to concatenate videos in ffmpeg with different attributes?" for many examples.
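As a rough illustration of the filter-based approach applied to the two files from the question, the sketch below conforms both videos and then joins them with the concat filter. The target values (30 fps, 1920x1080, 24000 Hz mono) and the 1-second silent track for static.mp4 are assumptions chosen to fit the ffprobe output above, not the only valid choice.

```shell
# Assumed common target parameters; adjust as needed.
fps=30
size=1920:1080
vchain="fps=${fps},scale=${size},setsar=1,format=yuv420p"
achain="aformat=sample_rates=24000:channel_layouts=mono"

# Input 0 = 0.mp4 (video + audio), input 1 = static.mp4 (video only),
# input 2 = generated silence standing in for static.mp4's missing audio.
graph="[0:v]${vchain}[v0];[1:v]${vchain}[v1];[0:a]${achain}[a0];[2:a]${achain}[a1];[v0][a0][v1][a1]concat=n=2:v=1:a=1[v][a]"

# Only run when ffmpeg and both example files are actually present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f 0.mp4 ] && [ -f static.mp4 ]; then
  ffmpeg -i 0.mp4 -i static.mp4 \
    -f lavfi -t 1 -i anullsrc=channel_layout=mono:sample_rate=24000 \
    -filter_complex "$graph" -map "[v]" -map "[a]" output.mp4
fi
```

The -t 1 before the anullsrc input limits the silence to the 1-second length of static.mp4; a different clip would need a matching duration.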