From astropy I get a 2D numpy.ndarray with shape (2016, 3040), i.e. a 6 Mpx array. I want to search for defects in the 3040 columns.
My definition of a column error: 500 times in column n,
the value of "cell" (m, n) is 50 units smaller than both cell (m, n-2) AND cell (m, n+2), with m running over the 2016 rows.
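Restated as the per-pixel test in code (data, m and n as above):

    defect = (data[m, n] + 50 < data[m, n - 2]) and (data[m, n] + 50 < data[m, n + 2])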
I count the occurrences in result_array while iterating over the rows m.
It works, errors are identified correctly, but it is slow. Since I want to process 20-40 images and later correct the column defects, time is an issue.
Before this brute-force approach I experimented with the column mean() function, but that was not suitable for detecting bad columns. Note that I do not check whether the defect occurs in consecutive rows m, m+1, m+2, ...; I just count, assuming a column error exists when approximately 25% of the pixels in one column (500 of 2016 rows is about 25%) show significantly lower values (here: 50 units) than the neighboring pixels.
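A quick toy illustration (made-up numbers, not from the real data) of why the plain column mean is a weak detector: if 25% of a column's pixels are 50 units low, the mean drops by only about 12.5 units, which easily hides in normal column-to-column variation.

    import numpy as np

    rng = np.random.default_rng(0)
    col = rng.normal(1400, 30, 2016)   # toy column at the level of the example row
    bad = col.copy()
    bad[:504] -= 50                    # 25% of the pixels 50 units too low
    print(col.mean() - bad.mean())     # ~12.5, small compared to the column's scatter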
There is a tool named fixfits, created by Sander Pool, but it is not available anymore; I fear Sander Pool has passed away. And with coming versions of Windows you never know whether such an old tool will keep working.
Some ideas on how to speed up the processing, e.g. with numpy, would be highly appreciated.
This is the data structure:

    classification of data: <class 'numpy.ndarray'>
    shape of data: (2016, 3040)
    one row as example: [   0 1446 1402 ... 1347 1421    0]
    shape of row: (3040,)
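For reference, the array comes out of astropy roughly like this (a minimal sketch; the filename is a placeholder):

    from astropy.io import fits

    data = fits.getdata("image.fits")   # placeholder filename
    print(type(data), data.shape)       # <class 'numpy.ndarray'> (2016, 3040)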
Here is my Python code:
    import numpy as np

    row_max = 2016
    col_max = 3040
    threshold_intensity = 50
    result_array = np.zeros(col_max)

    for x in range(2, col_max - 2):
        for y in range(1, row_max - 1):
            compare = data[y, x] + threshold_intensity
            if (compare < data[y, x - 2]) and (compare < data[y, x + 2]):
                result_array[x] = result_array[x] + 1
                if result_array[x] > 500:
                    print("Index: " + str(x))
                    break

    for i in range(1, 500):
        print(str(i) + " :" + str(result_array[i]))
After studying astropy, numpy and Python forums I came up with this vectorized version:
    import numpy as np

    # testing the operation with a real image array
    source = data                                 # ndarray from the FITS file
    source_offset = source + 50                   # add the permitted tolerance, keep original data
    compare_right = np.roll(source, -2, axis=1)   # shift 2 columns left: position x now holds column x+2
    compare_left = np.roll(source, 2, axis=1)     # shift 2 columns right: position x now holds column x-2
    averageArray = compare_right + compare_left   # build averaged neighbor values for later correction
    averageArray = averageArray // 2              # integer division is sufficiently accurate for later use
    comresright = source_offset < compare_right   # reduce to boolean values
    comresleft = source_offset < compare_left
    comresleft = comresleft & comresright         # True only if True in both comparison arrays
    count = np.count_nonzero(comresleft, axis=0)  # number of Trues per column
    count[count <= 500] = 0                       # set to 0 for 500 Trues or less -> column is o.k.
    count[count > 0] = 1                          # set to 1 for adding to an accumulator for a summary overview
    print(count.shape)
    print('\nindices where elements are == 1:')
    indexes = np.where(count > 0)[0]
    print(indexes)                                # indexes contains integers only
The first print outputs (3040,), which is the width of the array. The second prints:
indices where elements are == 1: [ 0 184 451 3039]
This is much faster than the loops. Columns 0 and 3039 show up only because np.roll wraps around at the array edges, so the first and last columns will be discarded for the correction. The indicated columns 184 and 451 are truly defects in the CCD sensor; the identification works. Now for the batch operation: parse all images in a directory and do the correction (here in columns 184 and 451), replacing the defective columns with the corresponding columns of averageArray.
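A minimal sketch of that batch step, assuming the FITS files sit in one directory, the image lives in the primary HDU, and replacing a flagged column with the corresponding column of averageArray is the desired correction (directory name and output suffix are placeholders):

    import glob
    import numpy as np
    from astropy.io import fits

    threshold_intensity = 50
    min_defect_count = 500

    for path in glob.glob("images/*.fits"):           # placeholder directory
        with fits.open(path) as hdul:
            data = hdul[0].data                       # assumes the image is in the primary HDU
            compare_right = np.roll(data, -2, axis=1)
            compare_left = np.roll(data, 2, axis=1)
            averageArray = (compare_right + compare_left) // 2
            offset = data + threshold_intensity
            defect = (offset < compare_right) & (offset < compare_left)
            count = np.count_nonzero(defect, axis=0)  # Trues per column
            bad_cols = np.where(count > min_defect_count)[0]
            # drop hits caused by np.roll wrapping around the edges
            bad_cols = bad_cols[(bad_cols >= 2) & (bad_cols < data.shape[1] - 2)]
            data[:, bad_cols] = averageArray[:, bad_cols]   # replace defective columns
            hdul[0].data = data
            hdul.writeto(path.replace(".fits", "_fixed.fits"), overwrite=True)
        print(path, "-> corrected columns:", bad_cols)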