I wish to make use of libx264's low-latency encoding mechanism, whereby a user-provided callback is called as soon as a single NAL unit is available instead of having to wait for a whole frame to be encoded before starting processing.
The x264 documentation states the following about that facility:
/* Optional low-level callback for low-latency encoding. Called for each output NAL unit
* immediately after the NAL unit is finished encoding. This allows the calling application
* to begin processing video data (e.g. by sending packets over a network) before the frame
* is done encoding.
*
* This callback MUST do the following in order to work correctly:
* 1) Have available an output buffer of at least size nal->i_payload*3/2 + 5 + 64.
* 2) Call x264_nal_encode( h, dst, nal ), where dst is the output buffer.
* After these steps, the content of nal is valid and can be used in the same way as if
* the NAL unit were output by x264_encoder_encode.
*
* This does not need to be synchronous with the encoding process: the data pointed to
* by nal (both before and after x264_nal_encode) will remain valid until the next
* x264_encoder_encode call. The callback must be re-entrant.
*
* This callback does not work with frame-based threads; threads must be disabled
* or sliced-threads enabled. This callback also does not work as one would expect
* with HRD -- since the buffering period SEI cannot be calculated until the frame
* is finished encoding, it will not be sent via this callback.
*
* Note also that the NALs are not necessarily returned in order when sliced threads is
* enabled. Accordingly, the variable i_first_mb and i_last_mb are available in
* x264_nal_t to help the calling application reorder the slices if necessary.
*
* When this callback is enabled, x264_encoder_encode does not return valid NALs;
* the calling application is expected to acquire all output NALs through the callback.
*
* It is generally sensible to combine this callback with a use of slice-max-mbs or
* slice-max-size.
*
* The opaque pointer is the opaque pointer from the input frame associated with this
* NAL unit. This helps distinguish between nalu_process calls from different sources,
* e.g. if doing multiple encodes in one process.
*/
void (*nalu_process)( x264_t *h, x264_nal_t *nal, void *opaque );
This seems straightforward enough. However, when I run the following dummy code, I get a segfault on the marked line. I've tried adding some debugging to x264_nal_encode itself to understand where it goes wrong, but it seems to be the function call itself that results in a segfault. Am I missing something here? (Let's ignore the fact that the use of assert probably makes cb non-reentrant; it's only there to indicate to the reader that my workspace buffer is more than large enough.)
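For reference, requirement 1 from the documentation can be written as a small helper. This is only a sketch: nal_buffer_size is a hypothetical name of mine, not part of x264, and it assumes the payload size is non-negative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of x264): minimum output buffer size that
 * the nalu_process documentation asks for, given nal->i_payload. */
static size_t nal_buffer_size(int i_payload)
{
    assert(i_payload >= 0); /* assumption: x264 never reports a negative size */
    return (size_t)i_payload * 3 / 2 + 5 + 64;
}
```

For a 1024-byte payload this gives 1605 bytes, which is far below the 10000000-byte workspace used in the code that follows.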
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <x264.h>

#define WS_SIZE 10000000

uint8_t * workspace;

void cb(x264_t * h, x264_nal_t * nal, void * opaque)
{
    assert((nal->i_payload*3)/2 + 5 + 64 < WS_SIZE);
    x264_nal_encode(h, workspace, nal); // Segfault here.
    // Removed: Process nal.
}

int main(int argc, char ** argv)
{
    uint8_t * fake_frame = malloc(1280*720*3);
    memset(fake_frame, 0, 1280*720*3);
    workspace = malloc(WS_SIZE);

    x264_param_t param;
    int status = x264_param_default_preset(&param, "ultrafast", "zerolatency");
    assert(status == 0);

    param.i_csp = X264_CSP_RGB;
    param.i_width = 1280;
    param.i_height = 720;
    param.i_threads = 1;
    param.i_lookahead_threads = 1;
    param.i_frame_total = 0;
    param.i_fps_num = 30;
    param.i_fps_den = 1;
    param.i_slice_max_size = 1024;
    param.b_annexb = 1;
    param.nalu_process = cb;

    status = x264_param_apply_profile(&param, "high444");
    assert(status == 0);

    x264_t * h = x264_encoder_open(&param);
    assert(h);

    x264_picture_t pic;
    status = x264_picture_alloc(&pic, param.i_csp, param.i_width, param.i_height);
    assert(pic.img.i_plane == 1);

    x264_picture_t pic_out;
    x264_nal_t * nal; // Not used. We process NALs in cb.
    int i_nal;

    for (int i = 0; i < 100; ++i)
    {
        pic.i_pts = i;
        pic.img.plane[0] = fake_frame;
        status = x264_encoder_encode(h, &nal, &i_nal, &pic, &pic_out);
    }

    x264_encoder_close(h);
    x264_picture_clean(&pic);
    free(workspace);
    free(fake_frame);
    return 0;
}
Edit: The segfault happens the first time cb calls x264_nal_encode. If I switch to a different preset, where more frames are encoded before the first callback happens, then several successful calls to x264_encoder_encode are made before the first callback, and hence the segfault, occurs.
After discussions with x264 developers on IRC, it seems that the behavior I was seeing is, in fact, a bug in x264. The x264_t * h passed to the callback is incorrect. If one overrides that handle with the good one (the one obtained from x264_encoder_open), things work fine.
I identified x264 git commit 71ed44c7312438fac7c5c5301e45522e57127db4 as the first bad one. The bug is documented as this x264 issue.
Update for future readers: I believe this issue has been fixed in commit 544c61f082194728d0391fb280a6e138ba320a96.