pythonnumpyimageopencvimage-processing

How to simplify the generation process of these boolean images?


I have written code that generates Thue-Morse sequence, its output is a NumPy 2D array containing only 0s and 1s, the height of the array is 2n and the width is n. More specifically, each intermediate result is kept as a column in the final output, with the elements repeated so that every column is of the same length.

For example, if the input is 3, we start with 01, we repeat the elements 4 times so that the first column is 00001111, in the next iteration we invert the bits and add to the previous iteration to get 0110, we repeat the elements twice so the second column is 00111100, and finally the last column is 01101001, so the output is:

array([[0, 0, 0],
       [0, 0, 1],
       [0, 1, 1],
       [0, 1, 0],
       [1, 1, 1],
       [1, 1, 0],
       [1, 0, 0],
       [1, 0, 1]], dtype=uint8)

Now the first kind of image I want to generate is simple, we repeat each column 2n - 1 - i times, so that each run of 1s becomes a rectangle, whose height is the length of the run, and in each subsequent column the rectangles are halved in width, so that sum of the width of the rectangles is height - 1.

This is the 7-bit ABBABAAB example of such image:

enter image description here

And the second kind of image I want to generate is to take the fractal squares and convert them to polar:

7-bit ABBABAAB fractal polar:

enter image description here


I wrote code to generate these images. But it is inefficient. It is easy to write working code, but it is hard to make it run fast.

Code

import cv2
import numpy as np
from enum import Enum
from PIL import Image
from typing import Generator, List, Literal


def validate(n: int) -> None:
    if not (n and isinstance(n, int)):
        raise ValueError(f"Argument {n} is not a valid bit width")


def abba_codes(n: int) -> np.ndarray:
    validate(n)
    power = 1 << (n - 1)
    abba = np.array([0, 1], dtype=np.uint8)
    powers = 1 << np.arange(n - 2, -1, -1)
    return (
        np.concatenate(
            [
                np.zeros(power, dtype=np.uint8),
                np.ones(power, dtype=np.uint8),
                *(
                    np.tile(
                        (abba := np.concatenate([abba, abba ^ 1])), (power, 1)
                    ).T.flatten()
                    for power in powers
                ),
            ]
        )
        .reshape((n, 1 << n))
        .T
    )


def abba_squares(n: int) -> np.ndarray:
    arr = abba_codes(n).T[::-1]
    powers = 1 << np.arange(n + 1)
    result = np.zeros((powers[-1],) * 2, dtype=np.uint8)
    for i, (a, b) in enumerate(zip(powers, powers[1:])):
        result[a:b] = arr[i]

    return (result.T[:, ::-1] ^ 1) * 255


def abba_square_img(n: int, length: int = 1024) -> np.ndarray:
    return Image.fromarray(abba_squares(n)).resize((length, length), resample=0)


def abba_polar(n: int, length: int = 1024) -> np.ndarray:
    square = np.array(abba_square_img(n, length))
    return cv2.warpPolar(square, (length, length), [half := length >> 1] * 2, half, 16)

Performance

In [2]: %timeit abba_square_img(10, 1024)
10.3 ms ± 715 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [3]: %timeit abba_polar(10, 1024)
27.2 ms ± 5.41 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

You see, I had to mix PIL and cv2, because only cv2 offers polar coordinate transformation, and only PIL lets me resize without any interpolation at all.

The following don't give intended result:

In [30]: cv2.imwrite('D:/garbage.png', cv2.resize(abba_squares(4), (1024, 1024), cv2.INTER_NEAREST))
Out[30]: True

In [31]: cv2.imwrite('D:/garbage1.png', cv2.resize(abba_squares(4), (1024, 1024), cv2.INTER_NEAREST_EXACT))
Out[31]: True

In [32]: cv2.imwrite('D:/garbage2.png', cv2.resize(abba_squares(4), (1024, 1024), cv2.INTER_AREA))
Out[32]: True

enter image description here

No matter what interpolation mode I try, it always gives a blurry image. I want the edges to be sharp and everything staying rectangles. So I had to use PIL to resize the images. Of course I can use a two level nested for loop to broadcast the pixels with result[i*n:i*n+n, j*n:j*n+n] = img[i, j] but that would be slow.

The edges of the polar images are jagged, not smooth, I want the edges to be smooth, and the corners are black, I want the corners white. And if I pass n larger than 14 to bool_squares it just hangs.

What are better ways to do these?


I have improved the code a little bit, but I still haven't figured out an efficient way to generate polar images directly. This question still deals with two kinds of images, that is because two things, 1, I still don't know how to efficiently fill rectangles in an array using a completely vectorized way, and 2, I still need to generate the fractal squares first in order to generate the polar image.

But I have made big progress on the generation of the fractal squares, so I found the consecutive runs of 1s and created many rectangles and used those rectangles to fill the array:

def find_transitions(line: np.ndarray) -> np.ndarray:
    if not line.size:
        return np.array([])

    return np.concatenate(
        [
            np.array([0] if line[0] else [], dtype=np.uint64),
            ((line[1:] != line[:-1]).nonzero()[0] + 1).astype(np.uint64),
            np.array([line.size] if line[-1] else [], dtype=np.uint64),
        ]
    )


def fractal_squares_helper(arr: np.ndarray, n: int, scale: int) -> List[np.ndarray]:
    x_starts = []
    x_ends = []
    y_starts = []
    y_ends = []
    widths = np.concatenate([[0], ((1 << np.arange(n - 1, -1, -1)) * scale).cumsum()])
    for i, (start, end) in enumerate(zip(widths, widths[1:])):
        line = find_transitions(arr[:, i]) * scale
        half = line.size >> 1
        y_starts.append(line[::2])
        y_ends.append(line[1::2])
        x_starts.append(np.tile([start], half))
        x_ends.append(np.tile([end], half))

    return [np.concatenate(i) for i in (x_starts, x_ends, y_starts, y_ends)]


def fill_rectangles(
    length: int,
    x_starts: np.ndarray,
    x_ends: np.ndarray,
    y_starts: np.ndarray,
    y_ends: np.ndarray,
) -> np.ndarray:
    img = np.full((length, length), 255, dtype=np.uint8)
    x = np.arange(length)
    y = x[:, None]
    mask = (
        (y >= y_starts[:, None, None])
        & (y < y_ends[:, None, None])
        & (x >= x_starts[:, None, None])
        & (x < x_ends[:, None, None])
    )
    img[mask.any(axis=0)] = 0
    return img


def fill_rectangles1(
    length: int,
    x_starts: np.ndarray,
    x_ends: np.ndarray,
    y_starts: np.ndarray,
    y_ends: np.ndarray,
) -> np.ndarray:
    img = np.full((length, length), 255, dtype=np.uint8)
    for x0, x1, y0, y1 in zip(x_starts, x_ends, y_starts, y_ends):
        img[y0:y1, x0:x1] = 0

    return img


def fractal_squares(n: int, length: int, func: bool) -> np.ndarray:
    arr = abba_codes(n)
    scale, mod = divmod(length, total := 1 << n)
    if mod:
        raise ValueError(
            f"argument length: {length} is not a positive multiple of {total} n-bit codes"
        )

    return (fill_rectangles, fill_rectangles1)[func](
        length, *fractal_squares_helper(arr, n, scale)
    )
In [4]: %timeit fractal_squares(10, 1024, 0)
590 ms ± 19.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [5]: %timeit fractal_squares(10, 1024, 1)
1.65 ms ± 56.6 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

The first method I used to fill the rectangles is completely vectorized, but it is very slow and memory consuming, it is the only way I could make it work.

The for loop based method is much faster, but it isn't completely vectorized, I want to vectorize it completely, to do away with the loop.

Now the polar images can be generated similarly, instead of filling Cartesian rectangles, we fill "polar rectangles", I have calculated the coordinates, but I cannot fill the rectangles:

def rectangularize(y: np.ndarray, x: np.ndarray) -> np.ndarray:
    l = y.shape[0]
    h = l // 2
    return np.stack([np.tile(y, (2, 1)).T.flatten(), np.tile(x, l)]).T.reshape(
        (h, 4, 2)
    )


def radial_rectangles(n: int, length: int) -> np.ndarray:
    arr = abba_codes(n)
    radii = np.concatenate([[0], (length >> np.arange(1, n + 1)).cumsum()])
    rectangles = []
    total = 1 << n
    tau = 2 * np.pi
    for i, (start, end) in enumerate(zip(radii, radii[1:])):
        line = find_transitions(arr[:, i]) / total * tau
        rectangles.append(rectangularize(line, [start, end]))

    return np.concatenate(rectangles)
In [6]: radial_rectangles(4, 1024)
Out[6]:
array([[[3.14159265e+00, 0.00000000e+00],
        [3.14159265e+00, 5.12000000e+02],
        [6.28318531e+00, 0.00000000e+00],
        [6.28318531e+00, 5.12000000e+02]],

       [[1.57079633e+00, 5.12000000e+02],
        [1.57079633e+00, 7.68000000e+02],
        [4.71238898e+00, 5.12000000e+02],
        [4.71238898e+00, 7.68000000e+02]],

       [[7.85398163e-01, 7.68000000e+02],
        [7.85398163e-01, 8.96000000e+02],
        [2.35619449e+00, 7.68000000e+02],
        [2.35619449e+00, 8.96000000e+02]],

       [[3.14159265e+00, 7.68000000e+02],
        [3.14159265e+00, 8.96000000e+02],
        [3.92699082e+00, 7.68000000e+02],
        [3.92699082e+00, 8.96000000e+02]],

       [[5.49778714e+00, 7.68000000e+02],
        [5.49778714e+00, 8.96000000e+02],
        [6.28318531e+00, 7.68000000e+02],
        [6.28318531e+00, 8.96000000e+02]],

       [[3.92699082e-01, 8.96000000e+02],
        [3.92699082e-01, 9.60000000e+02],
        [1.17809725e+00, 8.96000000e+02],
        [1.17809725e+00, 9.60000000e+02]],

       [[1.57079633e+00, 8.96000000e+02],
        [1.57079633e+00, 9.60000000e+02],
        [1.96349541e+00, 8.96000000e+02],
        [1.96349541e+00, 9.60000000e+02]],

       [[2.74889357e+00, 8.96000000e+02],
        [2.74889357e+00, 9.60000000e+02],
        [3.53429174e+00, 8.96000000e+02],
        [3.53429174e+00, 9.60000000e+02]],

       [[4.31968990e+00, 8.96000000e+02],
        [4.31968990e+00, 9.60000000e+02],
        [4.71238898e+00, 8.96000000e+02],
        [4.71238898e+00, 9.60000000e+02]],

       [[5.10508806e+00, 8.96000000e+02],
        [5.10508806e+00, 9.60000000e+02],
        [5.89048623e+00, 8.96000000e+02],
        [5.89048623e+00, 9.60000000e+02]]])

The output is of the shape (n, 4, 2), each (4, 2) shape is a "radial rectangle", the first element of the innermost pairs is the angle from x-axis measured in radians, the second is the radius. The "radial rectangles" are in the format [(a0, r0), (a0, r1), (a1, r0), (a1, r1)]

What is a more efficient way to fill rectangles and how can I fill "radial rectangles"?


Solution

  • Think in shaders.

    First, have a function that decides the bit in the sequence.

    @nb.njit
    def thue_morse(level: int, alpha: float) -> bool:
        assert level >= 0
        assert 0 <= alpha < 1
    
        value = False
    
        while level > 0:
            level -= 1
            if alpha < 0.5:
                alpha *= 2
            else:
                alpha = alpha * 2 - 1
                value = not value
    
        return value
    

    You can test that:

    for level in range(0, 4):
        print(f"level {level}")
        nsteps = 2 ** (level+0) # +1 for a level of oversampling
        alphas = (np.arange(nsteps) + 0.5) / nsteps
        values = np.array([
            thue_morse(level, alpha)
            for alpha in alphas
        ])
        print(np.vstack([alphas, values]))
        print()
    

    Output looks right:

    level 0
    [[0.5]
     [0. ]]
    
    level 1
    [[0.25 0.75]
     [0.   1.  ]]
    
    level 2
    [[0.125 0.375 0.625 0.875]
     [0.    1.    1.    0.   ]]
    
    level 3
    [[0.0625 0.1875 0.3125 0.4375 0.5625 0.6875 0.8125 0.9375]
     [0.     1.     1.     0.     1.     0.     0.     1.    ]]
    

    Now have a function that decides the color for any position u,v on the texture.

    Apply it to all the pixels.

    @nb.njit(parallel=True, cache=True)
    def shade_rectangular(width: int, height: int):
        canvas = np.ones((height, width), dtype=np.bool)
    
        for y in nb.prange(height):
            for x in range(width):
                alpha = np.float32(y) / height
    
                # log2 distribution: level 0 takes half the width, 1 a quarter, etc
                # and start at level 1, so the left half isn't totally trivial
                level = int(-np.log2((width-x) / width)) + 1
    
                canvas[y,x] = not thue_morse(level, alpha)
    
        return canvas
    
    canvas = shade_rectangular(2048, 2048)
    

    rectangular

    For the polar thing, convert coordinates to polar, then sample:

    @nb.njit(parallel=True, cache=True)
    def shade_polar(radius: int):
        width = height = 2*radius + 1
        canvas = np.ones((height, width), dtype=np.bool)
    
        for y in nb.prange(height):
            yy = y - radius
            for x in range(width):
                xx = x - radius
    
                r = np.hypot(xx, yy)
                theta = np.arctan2(yy, xx)
    
                if r < radius:
                    level = int(-np.log2((radius-r) / radius)) + 1
                    alpha = (theta / (2 * np.pi)) % 1
                    canvas[y,x] = not thue_morse(level, alpha)
    
        return canvas
    
    canvas = shade_polar(1024)
    

    polar

    If you want supersampling, you can have supersampling.

    canvas = shade_polar(4096) * np.uint8(255)
    canvas = cv.pyrDown(cv.pyrDown(canvas))
    

    enter image description here


    I didn't time any of this. Except for the huge images, it's negligible.

    I didn't bother making sure all the floats are fp32. Some of them might be fp64, which costs performance.

    The trig functions for the polar plot cost a bunch in particular. You could calculate these values once, store them, then reuse.