In Halide, is there a way to split up an input image into 2x2 quartets of pixels and implement a unique computation in each pixel of the quartet?
For example, I want to implement the following computations for each pixel in the quartet:
Upper left: (x + 1, y) + (x - 1, y) + (x, y + 1) + (x, y - 1)
Upper right: (x + 1, y) + (x - 1, y)
Lower left: (x, y + 1) + (x, y - 1)
Lower right: (x - 1, y - 1) + (x + 1, y - 1) + (x - 1, y + 1) + (x + 1, y + 1)
And I want this computational pattern to extend across the entire input image.
There are a number of ways to do this. You can do it perhaps most directly using a select on x%2==0 and y%2==0. Something like:
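// Sub-terms: one expression per position in the 2x2 quartet
Expr ul = in(x+1,y) + in(x-1,y) + in(x,y+1) + in(x,y-1);          // upper left
Expr ur = in(x+1,y) + in(x-1,y);                                  // upper right
Expr ll = in(x,y+1) + in(x,y-1);                                  // lower left
Expr lr = in(x-1,y-1) + in(x+1,y-1) + in(x-1,y+1) + in(x+1,y+1);  // lower right
// Parity tests: even x is the left column, even y is the top row
Expr ix = x%2==0;
Expr iy = y%2==0;
out(x,y) = select(iy,
                  select(ix, ul, ur),
                  select(ix, ll, lr));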
(There’s also a multi-condition version of select into which you could pack this.)
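If you want something to check the pipeline's output against, a plain C++ reference (no Halide involved) can be handy. The following is only a sketch: the Image struct and quartet_reference are hypothetical names, it assumes a row-major integer image, and it only handles interior pixels, since the question doesn't say what should happen at the borders.

```cpp
#include <cassert>
#include <vector>

// Hypothetical row-major image type, used only for this reference check.
struct Image {
    int width, height;
    std::vector<int> data;
    int at(int x, int y) const {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        return data[y * width + x];
    }
};

// Reference value of out(x, y) for an interior pixel
// (1 <= x < width-1, 1 <= y < height-1), using the same parity
// convention as the select: even x = left column, even y = top row.
int quartet_reference(const Image &in, int x, int y) {
    const bool ix = (x % 2 == 0);
    const bool iy = (y % 2 == 0);
    if (iy) {
        if (ix) {  // upper left
            return in.at(x + 1, y) + in.at(x - 1, y) + in.at(x, y + 1) + in.at(x, y - 1);
        }
        return in.at(x + 1, y) + in.at(x - 1, y);  // upper right
    }
    if (ix) {  // lower left
        return in.at(x, y + 1) + in.at(x, y - 1);
    }
    // lower right
    return in.at(x - 1, y - 1) + in.at(x + 1, y - 1) + in.at(x - 1, y + 1) + in.at(x + 1, y + 1);
}
```

Comparing a few interior pixels of the Halide output against this function is a cheap way to catch a swapped sub-term.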
If you then unroll the x and y dimensions of out each by 2, you'll get a tight loop over quartets with no control flow:
out.unroll(x,2).unroll(y,2);
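To picture what "a loop over quartets with no control flow" means, here is a hand-written plain C++ sketch of the idea (not Halide's actual generated code, whose exact loop nesting depends on the schedule and loop order). The function name quartets_unrolled is hypothetical, and skipping border quartets is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One loop iteration per 2x2 quartet whose neighbours are all in bounds.
// The four stores are straight-line code: no parity test inside the loop.
// Pixels not covered by such a quartet are left at 0.
std::vector<int> quartets_unrolled(const std::vector<int> &in, int w, int h) {
    assert(w >= 0 && h >= 0);
    assert(in.size() == static_cast<std::size_t>(w) * h);
    std::vector<int> out(in.size(), 0);
    auto at = [&](int x, int y) { return in[y * w + x]; };
    for (int y = 2; y + 2 < h; y += 2) {
        for (int x = 2; x + 2 < w; x += 2) {
            // upper left: pixel (x, y)
            out[y * w + x] = at(x + 1, y) + at(x - 1, y) + at(x, y + 1) + at(x, y - 1);
            // upper right: pixel (x+1, y), left/right neighbours
            out[y * w + x + 1] = at(x + 2, y) + at(x, y);
            // lower left: pixel (x, y+1), up/down neighbours
            out[(y + 1) * w + x] = at(x, y + 2) + at(x, y);
            // lower right: pixel (x+1, y+1), four diagonal neighbours
            out[(y + 1) * w + x + 1] = at(x, y) + at(x + 2, y) + at(x, y + 2) + at(x + 2, y + 2);
        }
    }
    return out;
}
```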
This is quite similar to the patterns you see in a demosaicing algorithm; you can find one among the official Halide reference apps. Inspired by that, you may also find it natural to pack your data from 2D into 3D, with the 3rd dimension being the 4 elements of a quartet:
packed(x,y,c) = in(2*x + c%2, 2*y + c/2);
which you may find easier to work with in some cases.
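To make the index mapping concrete, here is a plain C++ sketch of the packing (again no Halide; pack_quartets is a hypothetical name). It assumes that packed coordinates index quartets, so quartet (x, y) starts at input pixel (2x, 2y), and that the image width and height are even.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Packs a row-major w x h image into (w/2) x (h/2) quartets of 4 values.
// Channel c maps to the offset (c % 2, c / 2) inside the quartet:
// 0 = upper left, 1 = upper right, 2 = lower left, 3 = lower right.
std::vector<int> pack_quartets(const std::vector<int> &in, int w, int h) {
    assert(w >= 0 && h >= 0 && w % 2 == 0 && h % 2 == 0);
    assert(in.size() == static_cast<std::size_t>(w) * h);
    const int pw = w / 2;  // packed width, in quartets
    const int ph = h / 2;  // packed height, in quartets
    std::vector<int> packed(in.size());
    for (int y = 0; y < ph; ++y) {
        for (int x = 0; x < pw; ++x) {
            for (int c = 0; c < 4; ++c) {
                packed[(y * pw + x) * 4 + c] = in[(2 * y + c / 2) * w + (2 * x + c % 2)];
            }
        }
    }
    return packed;
}
```

With this layout each of the four computations reads from a fixed channel, so no parity test on x or y is needed.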