Tags: python, numpy-ndarray, astropy

How to speed up comparing each element of a 2D numpy.ndarray to its two 2nd-nearest neighbors in the same row?


From astropy I get a 2D numpy.ndarray with shape (2016, 3040), i.e. a 6 Mpx array. I want to search for defects in the 3040 columns.

My definition of a column error: in column n, the value of cell (m, n) is more than 50 units smaller than both cells (m, n-2) and (m, n+2), for more than 500 of the rows m = 0 … 2015. I count the occurrences in result_array while iterating over the rows m.
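To illustrate the criterion, here is a minimal, self-contained sketch on a hypothetical toy array (the 4×8 shape, the fill value 1400, and the darker column 4 are made up for the example):

```python
import numpy as np

# Toy image: every pixel is 1400, except column 4, which is 100 units darker.
data = np.full((4, 8), 1400, dtype=np.int64)
data[:, 4] = 1300

threshold_intensity = 50
defect_counts = np.zeros(data.shape[1], dtype=int)

for n in range(2, data.shape[1] - 2):      # column index
    for m in range(data.shape[0]):         # row index
        # cell (m, n) is more than 50 units below BOTH 2nd-nearest row neighbors
        if (data[m, n] + threshold_intensity < data[m, n - 2]
                and data[m, n] + threshold_intensity < data[m, n + 2]):
            defect_counts[n] += 1

print(defect_counts)  # only column 4 accumulates counts
```

On the real (2016, 3040) image a column is flagged once its count exceeds 500, i.e. roughly 25% of the rows.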

It works and the errors are identified correctly, but it is slow. Since I want to process 20-40 images and later correct the column defects, time is an issue.

Before this brute-force approach I experimented with the column mean() function, but that was not suitable for detecting bad columns. Note that I do not check whether the defect occurs in consecutive rows m, m+1, m+2, …; I just count, assuming that a column error is present when approx. 25% of the pixels in one column show significantly lower values (here: 50 units lower) than the neighboring pixels.

There is a tool named fixfits, created by Sander Pool, but it is not available anymore; I fear Sander Pool has passed away. And with coming versions of Windows you never know whether such a tool will keep working.

Some ideas on how to speed up the processing, e.g. with numpy, would be highly appreciated.

This is the data structure

classification of data: <class 'numpy.ndarray'>
shape of data: (2016, 3040)
one row as example: [   0 1446 1402 ... 1347 1421    0]
shape of row: (3040,)

Here is my Python code

import numpy as np

# data: the 2D ndarray (2016, 3040) read from the FITS file
row_max = 2016
col_max = 3040
threshold_intensity = 50
result_array = np.zeros(col_max)

for x in range(2, col_max - 2):
    for y in range(1, row_max - 1):
        compare = data[y, x] + threshold_intensity
        if compare < data[y, x - 2] and compare < data[y, x + 2]:
            result_array[x] += 1
            if result_array[x] > 500:
                print("Index: " + str(x))
                break  # column is flagged, skip its remaining rows

for i in range(1, 500):
    print(str(i) + "  :" + str(result_array[i]))

Studying astropy, numpy and Python forums.


Solution

  • import numpy as np
    
    # testing the operation with the real image array
    
    source = data                # ndarray from the FITS file
    source_offset = source + 50  # add the permitted tolerance, keep the original data
    
    compare_right = np.roll(source, -2, axis=1)  # shift 2 columns to the left
    compare_left  = np.roll(source, 2, axis=1)   # shift 2 columns to the right
    
    averageArray = compare_right + compare_left  # averaged values for the later correction
    averageArray = averageArray // 2             # integer division is sufficiently accurate here
    
    comresright = source_offset < compare_right  # reduce to boolean values
    comresleft  = source_offset < compare_left
    comresleft  = comresleft & comresright       # True only if True in both compare arrays
    
    count = np.count_nonzero(comresleft, axis=0) # number of Trues per column
    
    count[count <= 500] = 0  # set to 0 for 500 Trues or less -> column is o.k.
    count[count > 0] = 1     # set to 1 for adding to an accumulator for a summary overview
    
    print(count.shape)
    print('\nindices where elements are == 1:')
    indexes = np.where(count > 0)[0]
    print(indexes)                               # indexes contains integers only
    

    Output is (3040,), which is the width of the array.

    indices where elements are == 1: [ 0 184 451 3039]

    This is much faster than the loops. The first and last column will be discarded for the correction (np.roll wraps around the edges). The indicated columns are truly defects in the CCD sensor, so the identification works. Next comes the batch operation: parsing all images in a directory and correcting the defective columns (here columns 184 and 451) by replacing them with the corresponding columns of averageArray.
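The correction step described above can be sketched as a reusable function. The numpy part mirrors the roll-and-compare logic from the answer; the function name `fix_columns` and the `min_count` parameter are assumptions made for this example:

```python
import numpy as np

def fix_columns(data, threshold_intensity=50, min_count=500):
    """Flag columns where more than min_count pixels are darker (by more than
    threshold_intensity) than both 2nd-nearest row neighbors, and replace
    those columns by the neighbor average."""
    source_offset = data + threshold_intensity
    compare_right = np.roll(data, -2, axis=1)
    compare_left = np.roll(data, 2, axis=1)
    average = (compare_right + compare_left) // 2

    defect = (source_offset < compare_right) & (source_offset < compare_left)
    count = np.count_nonzero(defect, axis=0)

    bad_cols = np.where(count > min_count)[0]
    # np.roll wraps around the edges, so discard the outermost columns
    bad_cols = bad_cols[(bad_cols >= 2) & (bad_cols < data.shape[1] - 2)]

    fixed = data.copy()
    fixed[:, bad_cols] = average[:, bad_cols]
    return fixed, bad_cols
```

For the batch step one could then loop over the directory, e.g. `for path in Path("images").glob("*.fits")`, read each image with `astropy.io.fits.open`, run `fix_columns` on `hdul[0].data`, and write the corrected array to a new FITS file with `hdul.writeto(...)` (the directory name is an assumption here).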
