I have some high dimensional boolean data, in this example an array with 4 dimensions, but this is arbitrary:
X.shape
(3, 2, 66, 241)
I want to group the dataset into connected regions of True values, which can be done with scipy.ndimage.label, with the aid of a connectivity structure which says which points in the array should be considered to touch. The default 2-D structure is a cross:
[[0,1,0],
[1,1,1],
[0,1,0]]
Which can be easily extended to high dimensions if all those dimensions are connected. However I want to programmatically generate such a structure where I have a list of which dims are connected to which:
#We want to find connections across dims 2 and 3 across each slice of dims 0 and 1:
dim_connections=[[0],[1],[2,3]]
#Now we want two separate connected subspaces in our data:
dim_connections=[[0,1],[2,3]]
For individual cases I can work out with hard-thinking how to generate the correct structuring element, but I am struggling to work out the general rule! For clarity I want something like:
mystructure=construct_arbitrary_structure(ndim, dim_connections)
the_correct_result=scipy.ndimage.label(X,structure=my_structure)
This should work for you
def construct_arbitrary_structure(ndim, dim_connections):
#Create structure array
structure = np.zeros([3] * ndim, dtype=int)
#Fill structure array
for d in dim_connections:
if len(d) > 1:
# Set the connection between multiple dimensions
for i in range(ndim):
# Create a unit vector
u = np.zeros(ndim, dtype=int)
u[i] = 1
# Create a mask by adding the connection between multiple dimensions
M = np.zeros([3] * ndim, dtype=int)
for j in d:
M += np.roll(u, j)
structure += M
else:
# Set the connection for one dimension
u = np.zeros(ndim, dtype=int)
u[d[0]] = 1
structure += u
#Make sure it's symmetric
for i in range(ndim):
structure += np.roll(structure, 1, axis=i)
return structure