pythonalgorithmnumpyperformanceflood-fill

Speed up flood-fill algorithm to determine inundated areas in a raster image


I have a raster image that contains elevations (DEM). I also have some points with specific elevations associated (something like Point(x, y, h) where h is an elevation). I'm trying to, starting from the location of the given point, find all cells CONNECTED in 8 directions so that their elevation is under the elevation of the given point (i.e under h).

As an example, lets say we have the image M and a Point(2,2,2.1). Then, I would like to get the result from matrix MS.

M = np.array([
    [2, 6, 6, 6, 6],
    [6, 2, 6, 6, 6],
    [6, 6, 2, 6, 6],
    [6, 6, 6, 6, 6],
    [6, 6, 6, 6, 6]
])

MS = np.array([
    [True, False, False, False, False],
    [False, True, False, False, False],
    [False, False, True, False, False],
    [False, False, False, False, False],
    [False, False, False, False, False]
], dtype=bool)

For this I implemented a modified version of the flood-fill algorithm:

def flood_fill(image, x, y, threshold):
    """
    
    Calculates inundated cells. Cells get inundated if the threshold value is higher
    than the starting point elevation.

    Parameters
    ----------
    image: numpy.ndarray
        array of elevations
    cx : int
        starting coordinate x in pixel coordinates
    cy : int
        starting coordinate y in pixel coordinates
    threshold : float
        threshold elevation.

    Returns
    -------
    filled : numpy.ndarray
        array with boolean values where True means the cells is flooded

    """
    
    filled = np.zeros_like(image, dtype=bool)
    toFill = []
    toFill.append((x,y))
    
    while not len(toFill) == 0:
        (x,y) = toFill.pop()
        
        if x < 0 or y < 0 or x >= image.shape[0] or y >= image.shape[1]:
            continue
        
        if filled[x, y] or image[x, y] > threshold:
            continue
        
        filled[x, y] = 1
        toFill.append((x - 1, y))
        toFill.append((x + 1, y))
        toFill.append((x, y - 1))
        toFill.append((x, y + 1))
        toFill.append((x-1, y - 1))
        toFill.append((x-1, y + 1))
        toFill.append((x+1, y - 1))
        toFill.append((x+1, y + 1))
        
    return filled

While it works, is kind of slow for the purposes I'm using it. I also tried with a recursive version, but it explodes when the image is bigger than a few thousand pixels (stack overflow error). Is there a way to make it faster? Or do you know a python library that implements something like this?

As an example for an image of 5000×3000 is the method is taking 15 seconds, which is too much because I have several points.

I uploaded the raster to google drive since it is heavy: RASTER_DEM


Solution

  • Just use the flood function from skimage.

    This takes around 0.15 seconds per starting point for a 5000x3000 image on my machine, plus reading and writing the file.

    from skimage.segmentation import flood
    from skimage.io import imread, imsave
    
    # Load your image; doesn't really matter how as long as it's an ndarray.
    # Note that this assumes an rgb image.
    im = imread("path")
    
    # Some starting point from which to flood
    x, y = 351, 137
    
    # Get the elevation by just grabbing the first color channel at `x, y`
    elevation = im[y, x, 0]
    
    # Create a boolean ndarray that's True where the image is equal or less than
    # the starting point, and False everywhere else. We only look at the first
    # color channel because the image is grayscale, hence the "[:,:,0]".
    is_lower = im[:,:,0] <= elevation
    
    # Flood fill using skimage.segmentation.flood
    mask = flood(is_lower, (y, x))
    
    # Now we can modify all the pixels in the image based
    # on the floodfill mask however we want using the mask
    im[mask] = elevation
    
    # Save the image
    imsave("path", im)