I need to send video from a Kinect camera over a network. I'm capturing video from two Kinect sources: the RGB stream and the depth stream.
Together these amount to a bandwidth of roughly 53 MB/s at minimum, which is why I need to encode (compress) both video streams at the origin and decode them at the target. The RGB-D data will then be processed by an object tracking algorithm at the target.
So far I've found many papers discussing algorithms for this task, for instance: RGB and depth intra-frame Cross-Compression for low bandwidth 3D video.
The problem is that the algorithms described in such papers have no publicly available implementation. I know I could implement them myself, but they rely on many other complex image processing algorithms I don't know enough about (edge detection, contour characterization, ...).
I also found some C++ libraries based on a discrete median filter, delta encoding (to avoid sending redundant data), and LZ4 compression: http://thebytekitchen.com/2014/03/24/data-compression-for-the-kinect/
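For reference, the delta + LZ4 part of that approach is straightforward to reproduce. Here is a minimal sketch, assuming 16-bit depth frames and the standard LZ4 C API (LZ4_compressBound / LZ4_compress_default); it omits the median filter stage and is not the library's actual code:

```cpp
#include <cstdint>
#include <vector>
#include <lz4.h>

// Compress one 16-bit depth frame: store per-pixel differences against the
// previous frame (deltas are small and repetitive, so LZ4 packs them well),
// then run LZ4 over the raw delta bytes.
std::vector<char> compressDepthFrame(const std::vector<uint16_t>& current,
                                     const std::vector<uint16_t>& previous)
{
    // 1. Delta stage: difference against the previous frame.
    std::vector<uint16_t> delta(current.size());
    for (size_t i = 0; i < current.size(); ++i)
        delta[i] = static_cast<uint16_t>(current[i] - previous[i]);

    // 2. LZ4 stage: compress the delta buffer.
    const char* src = reinterpret_cast<const char*>(delta.data());
    const int srcSize = static_cast<int>(delta.size() * sizeof(uint16_t));
    std::vector<char> dst(LZ4_compressBound(srcSize));
    const int written = LZ4_compress_default(src, dst.data(), srcSize,
                                             static_cast<int>(dst.size()));
    dst.resize(written > 0 ? written : 0);
    return dst;
}
```

The decoder just reverses the two steps (LZ4 decompress, then add the deltas back onto the previous frame), so both ends stay cheap on CPU.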
My question is: is there a simpler and/or more efficient way of compressing RGB-D data from a Kinect source?
PS: I'm coding in C++.
In a more recent search on this problem I found a paper that describes compressing depth images with the H.264 video codec. The authors also provide basic software.
One problem is that H.264 can introduce compression artifacts. To minimize the errors introduced by the codec, the depth image is split into multiple channels, each representing a different range of distances.
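As I understand it, each 8-bit channel then covers only a slice of the depth range, so quantization and codec errors stay small within that slice. Below is a rough sketch of such a split; the number of bands, the band boundaries (in millimetres), the rescaling to 8 bits and the zero fill for out-of-band pixels are my own assumptions, not necessarily the exact scheme from the paper:

```cpp
#include <array>
#include <cstdint>
#include <vector>

constexpr int kBands = 3;
// Assumed depth bands in millimetres: [0,1500), [1500,3000), [3000,4500).
constexpr std::array<uint16_t, kBands + 1> kEdges = {0, 1500, 3000, 4500};

// Map a 16-bit depth frame onto several 8-bit planes, one per distance band,
// so each plane can be fed to an ordinary 8-bit H.264 encoder.
std::array<std::vector<uint8_t>, kBands>
splitDepthIntoChannels(const std::vector<uint16_t>& depth)
{
    std::array<std::vector<uint8_t>, kBands> channels;
    for (auto& c : channels)
        c.assign(depth.size(), 0);  // out-of-band pixels stay 0

    for (size_t i = 0; i < depth.size(); ++i) {
        const uint16_t d = depth[i];
        for (int b = 0; b < kBands; ++b) {
            if (d >= kEdges[b] && d < kEdges[b + 1]) {
                // Rescale the band to the full 8-bit range to keep precision.
                const uint32_t span = kEdges[b + 1] - kEdges[b];
                channels[b][i] =
                    static_cast<uint8_t>((d - kEdges[b]) * 255u / (span - 1));
                break;
            }
        }
    }
    return channels;  // each plane can now be handed to an H.264 encoder
}
```

The receiver decodes each H.264 stream, rescales each plane back to its band, and merges the planes to reconstruct an approximate 16-bit depth image.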