I want to convert a 12bit image signal to HEVC for effective compression. Because I need to be able to reconstruct the original 12bit signal, the compression needs to be losslessly reversible. At the moment I have the data as 16-bit PNG files.
My first try was using ffmpeg:
ffmpeg -y -framerate 1 -i input.png -c:v libx265 -x265-params "lossless=1" output.mp4
Unfortunately the output is not reversible. When extracting the image from the mp4, the pixel values are slightly off.
ffmpeg -i output.mp4 -vframes 1 reconstructed.png
The following answer suggests converting the input to YUV444 first to avoid unexpected behavior by ffmpeg: Lossless x264 compression
So far I have failed to convert my 16-bit file to YUV, encode it with x265, and get a correct reconstruction when decoding.
Is there a straightforward way to convert 16-bit images to HEVC?
I found a solution with minor rounding errors:
Encoding:
Based on the following post: How to render png's as h.265 12 bit video?
You can use the following codec parameters: -x265-params lossless=1 -pix_fmt yuv444p12le
for lossless 12 bpc encoding.
By trial and error, I found that the 12-bit data must sit in the upper 12 bits of each 16-bit element.
You need to scale the input pixels up by 16 to place the data in the upper bits.
(Scaling by 16 is equivalent to left shifting the uint16 elements by 4).
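To make the bit placement concrete, here is a minimal Python sketch of the arithmetic. The function names are hypothetical and only illustrate the scaling; they are not part of ffmpeg.

```python
def to_upper_bits(value_12bit: int) -> int:
    """Place a 12-bit sample in the upper 12 bits of a 16-bit word."""
    if not 0 <= value_12bit <= 0xFFF:
        raise ValueError("expected a 12-bit value in [0, 4095]")
    return value_12bit << 4  # same result as value_12bit * 16


def from_upper_bits(value_16bit: int) -> int:
    """Recover the 12-bit sample from the upper 12 bits of a 16-bit word."""
    return value_16bit >> 4  # same result as value_16bit // 16


sample = 4095                    # largest 12-bit value
shifted = to_upper_bits(sample)  # 65520, i.e. 0xFFF0
print(sample, "->", shifted, "->", from_upper_bits(shifted))
```

The lower 4 bits of the shifted value are always zero, which is why the shift can be undone exactly.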
For scaling the pixels up you can use the colorlevels video filter:
-vf colorlevels=rimax=0.0625:gimax=0.0625:bimax=0.0625
The following command encodes a single frame:
ffmpeg -i input.png -vf colorlevels=rimax=0.0625:gimax=0.0625:bimax=0.0625 -c:v libx265 -x265-params lossless=1 -pix_fmt yuv444p12le output.mkv
Decoding:
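When decoding, the pixels have to be scaled back down by 16. colorlevels did not work for me here, so I used the curves filter:
-vf "curves=r='0/0 1.0/0.0625':g='0/0 1.0/0.0625':b='0/0 1.0/0.0625'"
The output pixel format is rgb48be.
The following command decodes a single frame (and divides by 16):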
ffmpeg -i output.mkv -vf "curves=r='0/0 1.0/0.0625':g='0/0 1.0/0.0625':b='0/0 1.0/0.0625'" -pix_fmt rgb48be reconstructed.png
Differences:
The maximum absolute difference between input.png and reconstructed.png is 4 levels.
The reason for the difference is probably rounding errors caused by converting RGB to YUV and back.
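The effect can be illustrated with a small Python sketch. It is a simplified model, assuming full-range BT.601 weights and plain rounding to 12-bit integers; ffmpeg's own conversion uses different internal arithmetic, so the exact error values will not match the ones measured above. All names here are hypothetical.

```python
KR, KB = 0.299, 0.114   # BT.601 luma weights (an assumption for this sketch)
KG = 1.0 - KR - KB
MAX = 4095              # largest 12-bit value (full range)
MID = 2048              # chroma offset for 12-bit samples


def clamp_round(x: float) -> int:
    """Round to the nearest integer and clamp to the 12-bit range."""
    return max(0, min(MAX, round(x)))


def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to integer Y, Cb, Cr (information is lost here)."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2 * (1 - KB)) + MID
    cr = (r - y) / (2 * (1 - KR)) + MID
    return clamp_round(y), clamp_round(cb), clamp_round(cr)


def yuv_to_rgb(y, cb, cr):
    """Convert integer Y, Cb, Cr back to integer RGB."""
    r = y + 2 * (1 - KR) * (cr - MID)
    b = y + 2 * (1 - KB) * (cb - MID)
    g = (y - KR * r - KB * b) / KG
    return clamp_round(r), clamp_round(g), clamp_round(b)


def roundtrip(rgb):
    return yuv_to_rgb(*rgb_to_yuv(*rgb))


# A gray pixel survives the round trip; saturated colors may not.
for pixel in [(1000, 1000, 1000), (2, 0, 0), (4095, 0, 0)]:
    print(pixel, "->", roundtrip(pixel))
```

Even though the HEVC stream itself is lossless, the integer rounding in the color conversion before and after it is not reversible for every pixel.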
I used the following MATLAB code for testing:
I = imread('peppers.png');
% Build 10 PNG images (used as input).
for i = 1:10
    J = insertText(I, [size(I,2)/2-18, size(I,1)/2-36], num2str(i), 'FontSize', 72);
    J = imnoise(im2double(J), 'gaussian', 0, 0.01); % Add some noise
    J = uint16(round(J*4095)); % Convert to 12 bits range (range [0, 4095])
    imwrite(J, sprintf('input%02d.png', i), 'fmt', 'png', 'BitDepth', 16, 'Mode', 'lossless'); % Write to PNG file
end
% Encode video file using x265 codec, and 12 bits YUV444 format.
[status, cmdout] = system('ffmpeg -y -i input%02d.png -vf colorlevels=rimax=0.0625:gimax=0.0625:bimax=0.0625 -c:v libx265 -x265-params lossless=1 -pix_fmt yuv444p12le output.mkv');
if (status ~= 0), disp(cmdout);end
% Decode output.mkv into 10 PNG image files
[status, cmdout] = system('ffmpeg -y -i output.mkv -vf "curves=r=''0/0 1.0/0.0625'':g=''0/0 1.0/0.0625'':b=''0/0 1.0/0.0625''" -pix_fmt rgb48be reconstructed%02d.png');
if (status ~= 0), disp(cmdout);end
% Compare input and output:
for i = 1:10
    I = imread(sprintf('input%02d.png', i));
    J = imread(sprintf('reconstructed%02d.png', i));
    max_abs_diff = max(max(max(imabsdiff(I, J))));
    disp(['max_abs_diff = ', num2str(max_abs_diff)]);
end
Working with Grayscale format:
When working with Grayscale, you don't need to convert the pixel format to YUV.
Converting from Grayscale to YUV444 multiplies the size of input data by 3, so it's better to avoid the conversion.
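As a rough illustration of the factor of 3, the following Python sketch compares the raw sample data per frame for one plane and for three planes. It assumes each 12-bit sample is stored in a 16-bit word; the frame size and the helper name are hypothetical.

```python
def frame_bytes(width: int, height: int, planes: int, bytes_per_sample: int = 2) -> int:
    """Raw (uncompressed) size of one frame with full-resolution planes."""
    return width * height * planes * bytes_per_sample


width, height = 512, 384                         # example frame size
gray = frame_bytes(width, height, planes=1)      # single luma plane
yuv444 = frame_bytes(width, height, planes=3)    # Y, U and V at full resolution
print(gray, yuv444, yuv444 // gray)
```

This only counts the data handed to the encoder; the size of the compressed stream depends on the content.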
The following command encodes a single Grayscale frame:
ffmpeg -i input.png -vf "curves=all='0/0 0.0625/1.0'" -c:v libx265 -x265-params lossless=1 -pix_fmt gray12le -bsf:v hevc_metadata=video_full_range_flag=1 output.mkv
The following command decodes a single Grayscale frame (and divides by 16):
ffmpeg -i output.mkv -vf "curves=all='0/0 1.0/0.0625'" -pix_fmt gray16be reconstructed.png
The maximum absolute difference is 2.
Note about using -bsf:v hevc_metadata=video_full_range_flag=1:
In H.265, the default range of the Y color channel is "limited range".
For 8 bits, the "limited range" is [16, 235].
For 12 bits, the "limited range" is [256, 3760].
When using "full range" ([0, 255] for 8 bits or [0, 4095] for 12 bits), you need to specify it in the stream's metadata.
The way to set the metadata with FFmpeg is to use a bitstream filter.
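The range limits quoted above follow from scaling the 8-bit limits by the extra bit depth. A minimal Python sketch, with hypothetical helper names:

```python
def limited_luma_range(bit_depth: int) -> tuple:
    """Limited-range Y limits: the 8-bit limits 16 and 235 shifted left by the extra bits."""
    shift = bit_depth - 8
    return 16 << shift, 235 << shift


def full_range(bit_depth: int) -> tuple:
    """Full-range limits: every code value of the given bit depth."""
    return 0, (1 << bit_depth) - 1


for bits in (8, 12):
    print(bits, "limited:", limited_luma_range(bits), "full:", full_range(bits))
```

A decoder that assumes limited range will rescale the samples, so full-range data has to be flagged as such for the pixel values to come back unchanged.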