android, image-processing, rust, android-camera

Attempt at YU12 to YUYV conversion resulting in noisy image


I'm attempting to use the Rust library nokhwa to capture images using my Android phone camera. I've connected my Android phone via USB to my laptop and am using Droidcam to have it work as a webcam.

My phone camera sends in the image in the YU12 format which the library doesn't natively support. I'm attempting to add support for it by converting the byte stream from YU12 to YUYV.

This is the function that attempts to perform the conversion:

pub fn yu12_to_yuyv(resolution: Resolution, data: &[u8]) -> Vec<u8> {
    // Calculate the sizes of the Y, U, and V planes
    let width = resolution.width_x as usize;
    let height = resolution.height_y as usize;
    let size = width * height;
    let u_size = (width / 2) * (height / 2);
    let v_size = u_size;
    
    // Extract Y, U, and V planes from the input data
    let y_plane = &data[0..size];
    let u_plane = &data[size..(size + u_size)];
    let v_plane = &data[(size + u_size)..(size + u_size + v_size)];
    
    // Create a vector to hold the YUYV data
    let mut yuyv = Vec::with_capacity(size * 2);
    
    // Iterate over the image in 2x1 pixel blocks
    for y in 0..height {
        for x in (0..width).step_by(2) {
            // Calculate positions in the Y, U, and V planes
            let y_index1 = y * width + x;
            let y_index2 = y * width + x + 1;
            let uv_index = (y / 2) * (width / 2) + (x / 2);
            
            // Read Y, U, and V values
            let y1 = y_plane[y_index1];
            let y2 = y_plane[y_index2];
            let u = u_plane[uv_index];
            let v = v_plane[uv_index];
            
            // Append YUYV data (2 pixels)
            yuyv.push(y1);
            yuyv.push(u);
            yuyv.push(y2);
            yuyv.push(v);
        }
    }
    
    yuyv
}

However, the resulting image looks like this: [image]

This looks mostly correct, except for what appears to be noise. What could be going wrong, given that the majority of the image looks fine? For more context, the width is 640, the height is 480 and the data length is 462848. The data length is a little odd: v4l2-ctl -d /dev/video0 --all reports an expected size of 460800, so I'm not sure where the extra 2048 bytes are coming from. I thought it might be padding of some sort, but I'm not 100% sure.
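To make the size arithmetic explicit, here is a minimal sketch that computes the length of a tightly packed YU12 frame and how many bytes a buffer holds beyond it. The function names `yu12_frame_len` and `extra_bytes` are hypothetical and not part of nokhwa. The sketch only confirms the 2048-byte surplus; it cannot tell whether those bytes are trailing padding or are spread across rows as stride.

```rust
/// Expected byte length of a tightly packed YU12 (planar YUV 4:2:0) frame:
/// a full-resolution Y plane followed by quarter-resolution U and V planes.
pub fn yu12_frame_len(width: usize, height: usize) -> usize {
    let y_size = width * height;
    let chroma_size = (width / 2) * (height / 2);
    y_size + 2 * chroma_size
}

/// Number of bytes the buffer holds beyond one tightly packed frame,
/// or `None` if the buffer is too short to contain a full frame.
pub fn extra_bytes(width: usize, height: usize, data_len: usize) -> Option<usize> {
    data_len.checked_sub(yu12_frame_len(width, height))
}
```

For 640x480 this gives 307200 + 2 * 76800 = 460800 bytes, so a 462848-byte buffer leaves 2048 bytes unaccounted for.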


Solution

  • In the buf_yuyv422_to_rgb() function of nokhwa-core, we see this conversion:

            let r0 = y0 + 1.370_705 * (v - 128.);
            let g0 = y0 - 0.698_001 * (v - 128.) - 0.337_633 * (u - 128.);
            let b0 = y0 + 1.732_446 * (u - 128.);
    

    and the resulting RGB components are stored directly like this:

    [r0 as u8, g0 as u8, b0 as u8, ...]
    

    Trying full-range values (0..=255) for y0, u and v shows that some r0, g0 and b0 values fall outside the 0..=255 range, so the as u8 conversions then give totally incorrect results (wrapping around).

    On the other hand, the yuyv444_to_rgb() function (just below in the same file) uses appropriate .clamp(0, 255) operations to prevent this (although the formulas are different — integer approximations).

    Maybe you should try computing the RGB components yourself, applying the appropriate clamping, and then provide an RGB buffer directly instead of YUYV?
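As a rough illustration of that suggestion, the sketch below converts a YU12 frame straight to RGB using the floating-point coefficients quoted above, clamping every channel to 0..=255 before narrowing to u8. The names `yuv_to_rgb_clamped` and `yu12_to_rgb` are hypothetical and not part of nokhwa. It assumes the three planes are tightly packed with no per-row padding, and it ignores any bytes after the V plane; if the extra 2048 bytes turn out to be row stride, the indexing would need to change.

```rust
/// Convert one YUV sample to RGB, clamping each channel to 0..=255
/// before narrowing to u8 so the range handling is explicit.
pub fn yuv_to_rgb_clamped(y: u8, u: u8, v: u8) -> [u8; 3] {
    let y = f32::from(y);
    let u = f32::from(u) - 128.0;
    let v = f32::from(v) - 128.0;

    let r = y + 1.370_705 * v;
    let g = y - 0.698_001 * v - 0.337_633 * u;
    let b = y + 1.732_446 * u;

    [
        r.clamp(0.0, 255.0) as u8,
        g.clamp(0.0, 255.0) as u8,
        b.clamp(0.0, 255.0) as u8,
    ]
}

/// Convert a tightly packed YU12 frame (Y plane, then U, then V) to RGB.
/// Returns `None` for odd dimensions or a buffer shorter than one frame.
/// Assumption: no row stride; bytes after the V plane are ignored.
pub fn yu12_to_rgb(width: usize, height: usize, data: &[u8]) -> Option<Vec<u8>> {
    if width % 2 != 0 || height % 2 != 0 {
        return None;
    }
    let y_size = width * height;
    let chroma_size = (width / 2) * (height / 2);
    if data.len() < y_size + 2 * chroma_size {
        return None;
    }

    let (y_plane, rest) = data.split_at(y_size);
    let (u_plane, rest) = rest.split_at(chroma_size);
    let v_plane = &rest[..chroma_size];

    let mut rgb = Vec::with_capacity(y_size * 3);
    for row in 0..height {
        for col in 0..width {
            // Each U/V sample is shared by a 2x2 block of pixels.
            let uv = (row / 2) * (width / 2) + col / 2;
            let px = yuv_to_rgb_clamped(y_plane[row * width + col], u_plane[uv], v_plane[uv]);
            rgb.extend_from_slice(&px);
        }
    }
    Some(rgb)
}
```

With neutral chroma (u = v = 128) each pixel comes out as a gray level equal to its Y value, which makes a quick sanity check for the plane indexing.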