ffmpeg h.264 captions

Removing EIA-608 Closed Captions from H.264 without reencode


I'm looking to remove the EIA-608 closed captions from an H.264 video (in an MKV container) without re-encoding.

The closest I've got is using ffmpeg:

    ffmpeg -f lavfi -i movie=input.mkv[out+subcc] -map 0:0 output.mkv

This separates the input into rawvideo and subrip components and writes out only the rawvideo stream. However, it produces a file close to 200GB, which isn't a sustainable solution.

An ffmpeg based solution would be preferable, but I'm fine using whatever software is necessary.


Solution

  • This is actually possible using bitstream filters. As far as I know I discovered this myself, since everywhere I have looked says this is unsupported.

    The first thing to understand is that for EIA-608 and similar closed captioning standards, the captions are embedded directly in the video bitstream as user data. H.264 bitstreams are stored as a sequence of NAL (network abstraction layer) units. Each unit has a type; user data is stored in a NAL unit of the supplemental enhancement information (SEI) type.

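    It turns out that ffmpeg has a bitstream filter called filter_units, which allows you to pass or reject NAL units by type. So we can use this to remove all the SEI NAL units, which strips out the captions. Note that this drops every SEI message, not only the caption data, so any other information carried in SEI units is removed as well.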

    The filter documentation for filter_units says that we have to specify the types by number. According to the latest H.264 spec (Table 7-1), SEI units have type 6.
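
To make the numbering concrete: in an H.264 bitstream the NAL unit type is the low five bits of the first byte of each NAL unit (the top bit is forbidden_zero_bit and the next two are nal_ref_idc). The sketch below decodes that byte in the shell. `nal_type` is a hypothetical helper name used only for illustration; it is not part of ffmpeg.

```shell
# nal_type: print the nal_unit_type encoded in an H.264 NAL header byte.
# Argument: the header byte as two hex digits (e.g. 06, 65, 67).
# Hypothetical helper for illustration only.
nal_type() {
    # nal_unit_type occupies the low 5 bits of the header byte.
    echo $(( 0x$1 & 0x1F ))
}

nal_type 06   # prints 6 (SEI)
nal_type 67   # prints 7 (sequence parameter set)
nal_type 65   # prints 5 (IDR slice)
```

A header byte of 06 decodes to 6, which is the number that gets passed to filter_units.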

    So the following command will remove embedded closed captions:

    ffmpeg -i input.mkv -codec copy -bsf:v "filter_units=remove_types=6" output.mkv
    

    This has worked for me on several files without any problems or side effects.
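
One possible way to check the result, sketched under the assumption that your ffprobe build reports a closed_captions field for video streams (recent releases do, older ones may not). `has_cc` and `probe_cc` are hypothetical helper names, and ffprobe only inspects the start of the file, so treat a negative result as a hint rather than proof.

```shell
# has_cc: succeed (exit 0) if ffprobe's key=value output on stdin
# reports closed captions on a stream. Hypothetical helper.
has_cc() {
    grep -qx 'closed_captions=1'
}

# probe_cc FILE: print the closed_captions field of the first video stream.
# Assumes ffprobe is on PATH and supports the closed_captions entry.
probe_cc() {
    ffprobe -v error -select_streams v:0 \
        -show_entries stream=closed_captions \
        -of default=noprint_wrappers=1 "$1"
}

# Example (uncomment to run against real files):
# probe_cc input.mkv  | has_cc && echo "input: captions detected"
# probe_cc output.mkv | has_cc || echo "output: no captions detected"
```

If the input reports captions and the output does not, the SEI units carrying them were removed.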