I'm trying to align a rectangular prism contained in a 3D numpy
array to the principal axes of the array. Specifically, I need to accomplish this alignment by rotating the full array, as opposed to extracting the coordinates of the object and then rotating these points (this is because a perfect segmentation of the object from the background is unrealistic for my application).
I have a method that works under very specific circumstances, but it seems surprisingly sensitive given how general the approach is. In short, my method only works for very specific object orientations.
I'm looking for guidance on either (a) what's wrong with my approach or (b) a different approach that accomplishes the same objectives.
My approach (inspired by this post and this post):
from scipy.ndimage import affine_transform
from skimage import measure
def find_orientation(prism):
# Calculate second-order moments
M = measure.moments_central(prism, order=2)
# Constructing the covariance matrix from the central moments
cov_matrix = np.array(
[
[M[2, 0, 0], M[1, 1, 0], M[1, 0, 1]],
[M[1, 1, 0], M[0, 2, 0], M[0, 1, 1]],
[M[1, 0, 1], M[0, 1, 1], M[0, 0, 2]],
]
)
# Compute eigenvectors from the covariance matrix
eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)
return eigenvectors
def rotate_to_align(prism):
# Calculate orientation matrix using moments method
orientation_matrix = find_orientation(prism)
# Rotate the prism using the orientation matrix
aligned_prism = affine_transform(prism, orientation_matrix.T, order=3)
return aligned_prism
aligned_array = rotate_to_align(misaligned_array)
Please see this notebook for the full details (too long for SO): https://github.com/petermattia/3d-alignment/blob/main/prism_alignment.ipynb
Related but distinct questions/resources:
Thank you!
In your notebook, you simply rotate around the origin. For severe rotations, that can move your content our of the domain you look at.
In your transformation, introduce translations from and to the center.
Ok so you have a volume, containing a solid box, oriented in some way. You were looking for the axes of that object.
Calculating axes from moments appears to be unstable. The longest axis through a box may be a diagonal. That does not appear to be what you want.
I propose working with the surface normals of each face of the object.
You can get those from clustering the gradients of the volume. That'll give you six stable clusters of surface normals. I've chosen Sobel because the included lowpass is more stable than straight np.gradient()
.
Pair up opposite vectors. Sort them sensibly to reduce surprising rotations. Then apply some more linear algebra to ensure you actually have vectors that are normal to each other. That need not be the case if the box happens to be a little sheared or just from errors due to finite arithmetic.
This is not perfect. Interpolation/sampling and finite arithmetic are issues.
# imports
import numpy as np
from numpy.linalg import norm, inv
import cv2 as cv
import matplotlib.pyplot as plt
import scipy.ndimage as ndi
# first bunch of utility functions
def normalize(vec):
vec = np.asarray(vec)
return vec / norm(vec)
def translate3(tx=0, ty=0, tz=0):
T = np.eye(4)
T[:3, 3] = tx, ty, tz
return T
def rotate3_axis(axis, radians=None, degrees=None):
axis = np.asarray(axis, dtype=np.float64)
assert (radians is not None) ^ (degrees is not None)
if degrees is not None:
radians = np.deg2rad(degrees)
if radians is not None:
axis = normalize(axis) * radians
R, jac = cv.Rodrigues(axis)
T = np.eye(4)
T[:3, :3] = R
return T
def rotate3_R(R):
T = np.eye(4)
T[:3, :3] = R
return T
def rotate3(axis, **kwargs):
if isinstance(axis, np.ndarray) and axis.ndim == 2:
return rotate3_R(R=axis)
else:
return rotate3_axis(axis, **kwargs)
def scale3(s=1, sx=1, sy=1, sz=1):
return np.diag([s*sx, s*sy, s*sz, 1.0])
axis_names = ['x', 'y', 'z']
plane_names = ['yz', 'xz', 'xy']
def show(volume, gradient=False):
vmin = volume.min()
vmax = volume.max()
fig, axes = plt.subplots(figsize=(10, 10), nrows=1, ncols=3)
for k, axis in enumerate(axes):
axis_slice = np.take(volume, indices=volume.shape[k] // 2, axis=k)
axis.set_title(f"{axis_names[k]}-axis, {plane_names[k]}-plane")
axis.set_xlabel(plane_names[k][1])
axis.set_ylabel(plane_names[k][0])
axis.imshow(axis_slice, cmap="gray", vmin=vmin, vmax=vmax)
plt.tight_layout()
plt.show()
# phantom
vsize = np.array([50, 50, 50])
# box with different sizes in each dimension
box_size = np.array([40, 20, 10], dtype=int)
box_p0 = (vsize - box_size) // 2
box_p1 = box_p0 + box_size
volume = np.zeros(vsize, dtype=np.float32)
(bx0, by0, bz0), (bx1, by1, bz1) = box_p0, box_p1
volume[bx0:bx1, by0:by1, bz0:bz1] = 1
Tt = translate3(*(vsize/2))
Tr = rotate3(axis=(0,0,1), degrees=30)
T = Tt @ Tr @ inv(Tt)
warped = ndi.affine_transform(
input=volume,
matrix=T,
# offset ignored
output_shape=vsize,
order=3,
mode='nearest',
)
# per-voxel gradient of the warped volume
def gradients(vol):
gradx = np.gradient(vol, edge_order=2, axis=0)
grady = np.gradient(vol, edge_order=2, axis=1)
gradz = np.gradient(vol, edge_order=2, axis=2)
return np.stack([gradx, grady, gradz], axis=-1)
def gradients_sobel(vol):
gradx = ndi.sobel(vol, axis=0)
grady = ndi.sobel(vol, axis=1)
gradz = ndi.sobel(vol, axis=2)
return np.stack([gradx, grady, gradz], axis=-1)
grad = gradients_sobel(warped)
mag = norm(grad, axis=-1)
# cluster the gradient vectors
# seven clusters
# one per face, and background/zero
gradvectors = grad.reshape((-1, 3))
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=7, random_state=0, n_init=10).fit(gradvectors)
labels = kmeans.labels_
clusters = kmeans.cluster_centers_
print("clusters:")
print(clusters)
(hist, edges) = np.histogram(labels, bins=np.arange(kmeans.n_clusters+1))
print("counts:", hist)
#bg_label = np.argmax(hist) # most common
bg_label = np.argmin(norm(clusters, axis=1)) # closest to zero
print("background label:", bg_label)
face_vecs = np.delete(clusters, bg_label, axis=0)
#face_vecs /= norm(face_vecs, axis=1, keepdims=True)
print("foreground clusters:")
print(face_vecs)
clusters:
[[ -0.0007 -0.0002 0. ]
[ -0.0043 0.00083 -15.45364]
[ -0.0043 0.00083 15.45364]
[ 6.14112 11.05985 0. ]
[ -6.13702 -11.10985 0. ]
[ 10.91968 -6.22587 0. ]
[-11.013 6.04535 0. ]]
counts: [119556 1547 1547 792 778 392 388]
background label: 0
foreground clusters:
[[ -0.0043 0.00083 -15.45364]
[ -0.0043 0.00083 15.45364]
[ 6.14112 11.05985 0. ]
[ -6.13702 -11.10985 0. ]
[ 10.91968 -6.22587 0. ]
[-11.013 6.04535 0. ]]
# it should be three pairs of face vectors, where each pair is opposite faces
# check which vectors are a pair by checking the dot product
def find_opposite_faces(face_vecs):
pairs = []
for i, vec1 in enumerate(face_vecs):
for j, vec2 in enumerate(face_vecs):
if i >= j: continue
dist = np.dot(normalize(vec1), normalize(vec2))
if dist < -0.999:
pairs.append((i, j))
print(f"({i}, {j}) dist {dist}")
return pairs
pairs = find_opposite_faces(face_vecs)
print(pairs)
(0, 1) dist -0.9999999403953552
(2, 3) dist -0.9999975562095642
(4, 5) dist -0.9998694658279419
[(0, 1), (2, 3), (4, 5)]
# pick one vector per pair which should be pointing positively (dot with [1,1,1] is positive)
axis_vecs = np.array([
face_vecs[i1] if np.dot(face_vecs[i1], [1,1,1]) >= 0 else face_vecs[i2]
for (i1, i2) in pairs
])
print(axis_vecs)
[[-0.0043 0.00083 15.45364]
[ 6.14112 11.05985 0. ]
[10.91968 -6.22587 0. ]]
# vectors are normal
# build basis from the three vectors
# for least amount of "accidental" rotation, find closest to X, Y, Z axes
# 1. find the one closest to X axis, remove
# 2. find the one closest to Y axis, remove
# 3. remaining one taken for Z axis
def find_closest(vecs, target):
dists = np.dot(target, vecs.T)
return np.argmax(dists)
remaining = [0, 1, 2]
index = find_closest(axis_vecs[remaining], [1,0,0])
x_axis = axis_vecs[remaining[index]]
remaining.pop(index)
index = find_closest(axis_vecs[remaining], [0,1,0])
y_axis = axis_vecs[remaining[index]]
remaining.pop(index)
z_axis = axis_vecs[remaining[0]]
basis = np.array([x_axis, y_axis, z_axis]).T
print(basis)
[[10.91968 6.14112 -0.0043 ]
[-6.22587 11.05985 0.00083]
[ 0. 0. 15.45364]]
# how close is this to the 30 degrees applied originally?
(rvec, jac) = cv.Rodrigues(basis)
print(np.rad2deg(norm(rvec)))
29.3647
# this basis may not be orthogonal, so we need to adjust it
# Modified Gram-Schmidt process, for numerical stability
# https://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process
# does not optimize all vectors together. maintains first vector.
def modified_gram_schmidt(basis):
"""
basis: NxN matrix of column vectors
"""
basis = np.asarray(basis, dtype=np.float64)
basis = basis.copy()
M,N = basis.shape
for i in range(N):
basis[:,i] /= norm(basis[:,i])
for j in range(i+1, N):
basis[:,j] -= np.dot(basis[:,j], basis[:,i]) * basis[:,i]
return basis
mgs_basis = modified_gram_schmidt(basis)
print(mgs_basis)
[[ 0.86872 0.4953 -0. ]
[-0.4953 0.86872 0. ]
[ 0. 0. 1. ]]
# how close is this to the 30 degrees applied originally?
(rvec, jac) = cv.Rodrigues(mgs_basis)
print(np.rad2deg(norm(rvec)))
29.689681597833314
# now rotate the volume to align with the basis
Tt = translate3(*(vsize/2))
Tr = rotate3(mgs_basis)
T_unwarp = Tt @ Tr @ inv(Tt)
unwarped = ndi.affine_transform(
input=warped,
matrix=T_unwarp,
# offset ignored
output_shape=vsize,
order=3,
mode='nearest',
)
The clustering of gradients can be handled in various ways. One could attempt to discard outliers and recalculate the centroids. One could attempt to normalize the vectors, once background gradient (zero) was identified and discarded.
One could even perform (part of) Marching Cubes and fit planes! I think that's enough for tonight.