
MP4Box: Concatenating track ID 1 even though sample descriptions do not match


Essentially, I wish to concatenate a series of videos using MP4Box. When I attempt to do so, I receive the following error:

No suitable destination track found - creating new one (type soun)
0.500 secs Interleaving 

I can circumvent the issue, at least temporarily, by adding a -force-cat parameter to the MP4Box command. However, this creates issues with the alignment of audio and video and produces the following warning:

Concatenating track ID 1 even though sample descriptions do not match

Now, as far as I can tell, this has to do with differing parameters between video types. I will display the ffprobe output of each video type below in order to hopefully shed some light on the issue.

VIDEO TYPE 1 FFPROBE OUTPUT:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '0.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.29.100
  Duration: 00:00:02.25, start: 0.000000, bitrate: 851 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080, 816 kb/s, 4 fps, 4 tbr, 16384 tbn, 8 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: mp3 (mp4a / 0x6134706D), 24000 Hz, mono, fltp, 32 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

VIDEO TYPE 2 FFPROBE OUTPUT:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'static.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.78.100
  Duration: 00:00:01.00, start: 0.000000, bitrate: 662 kb/s
    Stream #0:0(eng): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 654 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler

Can anyone identify what the incongruity between video types is and how I can resolve it? Thanks.


Solution

  • Problem

    The files must share the same attributes to concatenate cleanly, but these two differ. See a list of attributes that must match for proper concatenation.

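    Important incongruities, taken from the ffprobe output in the question:

      • 0.mp4 has an audio stream (mp3, 24000 Hz, mono); static.mp4 has no audio stream.
      • The video timescale differs: 16384 tbn in 0.mp4 versus 30k tbn in static.mp4.
      • The H.264 profile differs: High in 0.mp4 versus Constrained Baseline in static.mp4.
      • The frame rate differs: 4 fps in 0.mp4 versus 29.97 fps in static.mp4.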
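
To compare the relevant attributes side by side, ffprobe can print only selected stream fields. This is a minimal sketch: the helper name probe_streams and the choice of fields are illustrative and not part of the original answer, and the loop skips any file that is not present.

```shell
# Print the stream attributes that matter for concatenation.
# probe_streams is a hypothetical helper name used for illustration.
probe_streams() {
  ffprobe -v error \
    -show_entries stream=index,codec_type,codec_name,profile,width,height,pix_fmt,r_frame_rate,time_base,sample_rate,channels \
    -of compact "$1"
}

for f in 0.mp4 static.mp4; do
  [ -f "$f" ] || continue   # skip files that are not present
  echo "== $f"
  probe_streams "$f"
done
```

Any field that differs between the two files, or any stream that exists in only one of them, is a candidate cause of the mismatch.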

    Solution 1: Re-encode one to match the other

    This method works well when you need to add a short segment to a long video. It leaves the long video untouched, which preserves its quality and keeps the process fast. The downside is that you must make sure all of the attributes match, which can be difficult if you are unfamiliar with the topic.

    Example that makes static.mp4 match 0.mp4, using the anullsrc filter to generate silent filler audio:

    1. Re-encode:

      ffmpeg -i static.mp4 -f lavfi -i anullsrc=channel_layout=mono:sample_rate=24000 -c:v libx264 -c:a libmp3lame -video_track_timescale 16384 -shortest 1.mp4
      
    2. Make input.txt containing:

      file '0.mp4'
      file '1.mp4'
      
    3. Concatenate with the concat demuxer:

      ffmpeg -f concat -i input.txt -c copy output.mp4
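
When there are more than two files, the list from step 2 can be generated instead of typed by hand. A minimal sketch, assuming the file names contain no single quotes and are given in playback order:

```shell
# Write one "file" directive per input, in playback order.
printf "file '%s'\n" 0.mp4 1.mp4 > input.txt

# Show the result before handing it to the concat demuxer.
cat input.txt
```

printf repeats its format string for each argument, so appending more file names adds more lines to input.txt.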
      

    Solution 2: Re-encode everything

    This method uses multiple filters to conform all of the inputs to a common set of parameters (frame rate, width, height, etc.). It is most useful if your inputs are always varied or arbitrary, and it does everything in one command. The downside is that it re-encodes everything and may be slow.

    See How to concatenate videos in ffmpeg with different attributes? for many examples.
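
As a rough sketch of this approach for the two files in the question, the command below conforms both videos and joins them with the concat filter. The 30 fps, 1920x1080 target is an illustrative assumption rather than something the original answer specifies, and the one-second silent input assumes static.mp4 is exactly 1 second long, as its ffprobe output reports. The ffmpeg call only runs when both input files are present.

```shell
# Common video parameters for every segment (assumed target, adjust as needed).
norm="fps=30,scale=1920:1080,setsar=1,format=yuv420p"

# Segment 1: video and audio from 0.mp4. Segment 2: video from static.mp4
# plus the silent audio supplied as input 2.
graph="[0:v]${norm}[v0];[1:v]${norm}[v1];[v0][0:a][v1][2:a]concat=n=2:v=1:a=1[v][a]"

if [ -f 0.mp4 ] && [ -f static.mp4 ]; then
  ffmpeg -i 0.mp4 -i static.mp4 \
    -f lavfi -t 1 -i anullsrc=channel_layout=mono:sample_rate=24000 \
    -filter_complex "$graph" -map "[v]" -map "[a]" output.mp4
fi
```

The concat filter expects the streams of each segment in order (video, then audio), which is why the silent audio is paired with the second video.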