python numpy opencv vips

How to rotate and pad huge NumPy arrays without consuming too much memory?


I'm working with large whole-slide images (WSI). While building a WSI, I use OpenCV for the calculations and append frames to a huge NumPy array (a 50k x 50k x 3 array takes 7.5 GB of memory). For saving, I was using OpenCV's imwrite function, but when I try to open the saved TIFF files with OpenSlide, it throws an "unsupported or missing image file" error.

I found out that OpenCV doesn't save images as pyramids, so I tried saving them as pyramids with pyvips. Before saving, I need to rotate the image because the camera is rotated for faster scanning, but rotating with OpenCV doubles the memory consumed by the program. Is there another way to rotate without such high memory usage? The same thing happens when I reach the edges of the image and need to pad it so it doesn't overflow: memory usage doubles at first and then drops back once the pad is done. Is there an optimal way to do the pad and the rotate? Here is where I pad the array:

# Add rows or columns to the end of test_matrix
if np.shape(test_matrix)[1] - (cm2 + int(y_edge_on_wsi) + y_in) <= 2000:
    test_matrix = np.pad(test_matrix, ((0, 0), (0, 10000), (0, 0)), mode='constant', constant_values=255)

if np.shape(test_matrix)[0] - (cm1 + int(x_edge_on_wsi) + x_in) <= 2000:
    test_matrix = np.pad(test_matrix, ((0, 10000), (0, 0), (0, 0)), mode='constant', constant_values=255)

# Add rows or columns to the beginning of test_matrix
if cm2 + int(y_edge_on_wsi) <= 2000:
    test_matrix = np.pad(test_matrix, ((0, 0), (10000, 0), (0, 0)), mode='constant', constant_values=255)
    cm2 = cm2 + 10000
if cm1 + int(x_edge_on_wsi) <= 2000:
    test_matrix = np.pad(test_matrix, ((10000, 0), (0, 0), (0, 0)), mode='constant', constant_values=255)
    cm1 = cm1 + 10000

And here is where I rotate before saving the image:

self.image = cv2.rotate(self.image, cv2.ROTATE_90_CLOCKWISE)
cv2.imwrite(self.fullSavePathTiff, self.image)

Here is the memory usage before the rotate: [image: memory usage before rotate]

And this is the memory usage after the rotate: [image: memory usage after rotate]


Solution

  • You can try cropping your images into smaller chunks around the exact region of interest. There is a library called OpenSeadragon which helps to pinpoint the exact location of the region of interest.

    Processing the image in smaller chunks helps keep memory usage under control. You can use this code snippet as a guide:

    import cv2
    import numpy as np
    from tifffile import imwrite

    chunk_size = 1000
    image_height, image_width = 50000, 50000  # Example dimensions
    pad = 500  # Example border width in pixels; choose it to suit your image

    def load_chunk(y, x, h, w):
        # Placeholder: replace with your actual chunk loading logic
        return np.zeros((h, w, 3), dtype=np.uint8)

    # A 90-degree rotation swaps height and width. The border is added once,
    # around the whole output (white, like constant_values=255 in the question),
    # rather than around every chunk.
    output_image = np.full((image_width + 2 * pad, image_height + 2 * pad, 3), 255, dtype=np.uint8)

    for y in range(0, image_height, chunk_size):
        for x in range(0, image_width, chunk_size):
            h = min(chunk_size, image_height - y)
            w = min(chunk_size, image_width - x)
            # Rotate the chunk; the result has shape (w, h, 3)
            rotated_chunk = cv2.rotate(load_chunk(y, x, h, w), cv2.ROTATE_90_CLOCKWISE)
            # A clockwise turn maps source (row, col) to (col, image_height - 1 - row)
            out_y = pad + x
            out_x = pad + image_height - y - h
            output_image[out_y:out_y + w, out_x:out_x + h, :] = rotated_chunk

    # Save the processed image
    imwrite('output_image.tiff', output_image)
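
If even one extra full-size array in RAM is too much, the same tile-by-tile idea can write into a disk-backed array instead. The following is only a minimal sketch under assumptions: the function name `rotate_cw_to_memmap`, its parameters, and the output path are made up for illustration, and it assumes the source is an `(H, W, C)` array (in memory or itself a `np.memmap`). It relies on `np.rot90(block, k=-1)`, which gives a clockwise quarter turn as a view, so only one tile is copied at a time.

```python
import os
import tempfile

import numpy as np


def rotate_cw_to_memmap(src, out_path, pad=0, tile=1024, fill=255):
    """Rotate `src` (H, W, C) 90 degrees clockwise into a disk-backed array,
    surrounded by a constant border of `pad` pixels, one tile at a time.

    Hypothetical helper for illustration, not part of any library.
    """
    height, width, channels = src.shape
    # A quarter turn swaps height and width; the border is added on every side
    out_shape = (width + 2 * pad, height + 2 * pad, channels)
    out = np.memmap(out_path, dtype=src.dtype, mode="w+", shape=out_shape)

    # Fill in row bands so the fill never needs a full-size temporary
    for r in range(0, out_shape[0], tile):
        out[r:r + tile] = fill

    for y in range(0, height, tile):
        for x in range(0, width, tile):
            block = src[y:y + tile, x:x + tile]
            h, w = block.shape[:2]
            # k=-1 is a clockwise quarter turn; np.rot90 returns a view
            rotated = np.rot90(block, k=-1)
            # Source (row, col) lands at (col, height - 1 - row) in the output
            out_y = pad + x
            out_x = pad + height - y - h
            out[out_y:out_y + w, out_x:out_x + h] = rotated

    out.flush()
    return out
```

Called as `rotate_cw_to_memmap(image, "rotated.dat", pad=500)`, this leaves a raw file on disk holding the rotated, padded pixels. That file could then be handed to a tool that writes pyramidal TIFFs, such as pyvips, which the question already mentions. Whether this is faster than the in-memory version depends on your disk, so treat it as a starting point rather than a drop-in fix.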