I have the task to find the center, the side length and the rotation angle of a square in a binary image. In addition i have the following information given:
Given a binary image with white (=True) background containing a single black (=False) rotated square, return the center of mass (x, y) of said square, its edge length (l) and its rotation angle (alpha). The square will always be fully contained inside the image, i.e. none of its edges intersects with the image border.Return the angle alpha
in degrees, where 0° is upright and equivalent to ..., -180°, -90°, 90°, 180°, ...and the positive direction of rotation is counter-clockwise. Please note the shape of img
is (height, width) with y-axis first (row-major order). This task should be solvable with only standard numpy functions.
Im not really familiar with numpy but I think I was able to get the center of the square in x and y coordinates and side length with following code:
import numpy
def find_rect(img):
x, y, l, alpha = 0, 0, 0, 0
black_pixels = np.argwhere(img==0)
y, x = np.mean(black_pixels, axis=0)
l = np.sqrt((img==0).sum())
y_max1 = black_pixels[black_pixels[:,1]>= x, 0].max()
y_max2 = black_pixels[black_pixels[:,1]<= x, 0].max()
y_max = max(y_max1, y_max2)
x_max = black_pixels[black_pixels[:, 0] == y_max, 1][0]
alpha = np.degrees(np.arctan(((y_max - y)/(x_max - x))))
return x, y, l, alpha
#the code which generates the square
from scipy.ndimage import rotate
def generate_rot_square(w, h, x, y, l, alpha):
rect = np.ones((l, l))
rect = rotate(rect, alpha)
r_h, r_w = rect.shape
r_h_1 = r_h // 2
r_h_2 = r_h - r_h_1
r_w_1 = r_w // 2
r_w_2 = r_w - r_w_1
result = np.zeros((h, w), dtype=bool)
result[y - r_h_1:y + r_h_2, x - r_w_1:x + r_w_2] = rect > 0.9
return 1 - result
But as soon as the square is rotated the code to get the side length if the square won't work anymore. Also I dont really have an idea on how to get the rotation angle of the square. Maybe someone can give me tips on how to solve this task.
This is quite scholar, so unrealistic (in real life) solutions are probably expected. Not computer vision solutions (that would use opencv rather than just numpy).
For example, since you know for sure that this is a square, you know that the number of black pixels is l². So count the number of black pixels, and take the square root of that
l=np.sqrt((img==0).sum())
To find the angle, ask yourself, what should be ymax for angle α? Find ymax, as you did, and deduce angle from that.
You have to be able to find angle to distinguish in reality the case where ymax occurs in first or second quadrand (or third/fourth, depending on y-axis being upward or downward. But there are only 2 cases to consider)
So, compute ymax1 for the first half of image (x-wise), and ymax2 for the other
ymax1=black_pixels[black_pixels[:,1]>=x, 0].max()
ymax2=black_pixels[black_pixels[:,1]<=x, 0].max()
Only the bigger one is relevant. Since the smaller one means that the max is not a corner, but the intersection of a side with y-axis (well, you could compute from that intersection too, but that is more complicated).
So, the bigger one tells you were is a corner. From the position of a corner and l
you can deduce the angle.
I've tried it. It is not very accurate (error is around 5 degs for angle near 90; better for smaller angles). Even l, is 149 when I try with 150. That is because of the >0.9
that filters out more pixels than it should.
So, if that >0.9
is a constraint, it would probably be more accurate to avoid relying on l
to compute the angle. For that, also compute the x matching you ymax (warning: it is not a xmax. Just draw a square and see that it is not the same thing. It should be the only black pixel in the row ymax: img[ymax].argmin()
. Or, the mean of black pixels in that row, if there are more than one (but, except for angle 0, that should be just a few). And then the ratio of ymax and that x should be related to the tan
of the angle.
I don't say more, since you asked only for tips.
But the kind of computation you are already doing (mean, max, min, ...) of black pixels, in addition to the separation I've mentioned (max before x, max after x), or the corner coordinates (x that match ymax, or likewise, redundantly, y that match xmax) are all what you need. Just think backward, of what would be those values for a square centered at x₀,y₀
, of side W
and rotated by angle α. And then reverse the equation.
Try with different strategies :
l
.tan(angle)
from thatl
y
that match xmax
(thus a corner) and deduce (with arctan
also) angle from thatMaybe you'll see that depending on the estimated angle, one strategy is more accurate, and you'll end up with a way to select the better of those 4 strategies.
And there are others: ymax-ymin, xmax-xmin also depends from angle. So if you can tell what they should be for a given angle, then you can reverse the formula and find what the angle is from their measured value
Some precision about corners and angle.
In a square, you have 4 corners. Which are at coordinates
(x+W×√2/2×cos(β), y+W×√2/2×sin(β))
(x+W×√2/2×cos(β+90°), y+W×√2/2×sin(β+90°))
(x+W×√2/2×cos(β+180°), y+W×√2/2×sin(β+180°))
(x+W×√2/2×cos(β+270°), y+W×√2/2×sin(β+270°))
β begin α+45°, α being the angle you search for. It is more convenient for me to reason from the angle, β, where we find corners. So if square is not rotated, α=0, then β=45°. So, coordinates of corners are (x+W√2×cos(45°)=x+W×√2/2×√2/2 = x+W/2, y+W/2
x-W/2, y+W/2
x-W/2, y-W/2
x+W/2, y-W/2
As you can see, that is quite expected.
If α=45°, then β=90°, and we get
x, y+W√2/2
x-W√2/2, y
x, y-W√2/2
x+W√2/2, y
Again, quite as expected.
It doesn't really matter which corner we use: angle, as you said, is valid modulo 90°, and, obviously, corners are separated from each other by angles mutiples of 90°
So, we just need to find a corner, whichever.
So, if you look at ymax, that is the y
of one corner. Unless angle is α=0, in which case, it is the y
of a whole side.
What is the matching x
: it is the x
where you find a 0 in row ymax.
You can find it that way for example (using your black_pixels
array)
xc=black_pixels[black_pixels[:,0]==ymax,1][0]
Note that black_pixels[black_pixels[:,0]==ymax,1]
returns an array of x
. So I just take the first (see later about that).
Likewise, working with image directly, I could also have computed
xc=img[ymax].argmin()
Which returns the index (in x, since we already indexed first axis with ymax
) of the first 0 in ymax
row.
Both method would fail if there is no 0 in that row. But there is one. Otherwise, ymax
would have been different.
So, no, you know one corner, whose coordinate is (xc, ymax).
That point is,relative to the center of the square (and center of rotation) with vector (xc-x, ymax-y).
The angle of that vector is arctan((ymax-y)/(xc-x))
. And since that is the angle where you find a corner, that is what I've called β, we need to remove 45°. So angle is arctan((ymax-y)/(xc-x))-45
.
At least if the corner we are working on is on the right side.
Or 90-that
since y axis is downward.
Just play with different angles, and see how arctan((ymax-y)/(xc-x))
evolves...
def computeFeatures(img):
# from your code
black_pixels=np.argwhere(img==0)
# number of black pixels
n=len(black_pixels)
# Hence, side of the square
W=np.sqrt(n)
# Center of the square
y,x=black_pixels.mean(axis=0)
# Let's now try to find the top corner
# Since y axes is downward for an image (opencv rotate rotates
# in trigonometric direction... if image is plot with y axis downward)
# the top corner is ymin
# (we could compute with ymax, as mentioned. Just then we have
# to reason with bottom corner. It's a bit the same, with
# just a minus sign. But's it's easier to split explanation with ymin
_,ymin=black_pixels.min(axis=0) # (I don't use xmin)
# xc1 is the first x that is black in ymin row
xc1=img[ymin].argmin()
# xc2 is the last x that is black
# Note that with angles around 45, the corner is quite "sharp"
# so xc1 and xc2 are likely equal
# but for "almost horizonal" (angle close to 0 or 90)
# then, we have some "aliasing" and there are more than 1 pixel on
# the last row
# not much, tho, so that is clearly polishing.
# Except for the caricatural case of angle 0. In which case
# xc1 or xc2 would give the same result (just 2 different corners)
xc2=img.shape[1]-1-img[ymin,::-1].argmin()
if xc2>x:
# If the corner is after x, that is on the right of the image
# Then we prefer xc2 rather than xc1
# (again, almost the same thing. But the corner is then the rightmost
# point of the top row)
# Then relative to center x,y, the vector center→top corner
# is (xc2-x),(y-ymin),
# (with positive direction upward. So not exactly in
# the coordinates of the image, but in the one of
# a trigonometric circle
# So angle of top corner (in the trigonometric circle) is
β=np.arctan((y-ymin)/(xc2-x))*180/np.pi # *180/pi convert in degs
# So angle of rotation, since we expect the top corner
# to be at 45° when rotation is 0, and, of course
# to rotate then relatively with the rest of the square
# is 45° less than angle of top corner
α=β-45
else:
# Otherwise, we are dealing with a top corner that is on the
# left of the image
# it is more or less the same thing.
# Except that arctan (which is ambiguous as you know)
# would give the angle of the opposite corner
# (btw, it is a valid way to express rotation. But it is
# probably better to show rotation between 0 and 90°
# we could have done that in a single computation
# and just %90 the result. But it is easier when
# trying to understand to separate the 2 cases
# Therefore, since we have not the top corner, but the
# opposite one, we have to add 180°, and then remove 45°
# like in the other case
β=np.arctan((y-ymin)/(xc1-x))*180/np.pi
α=β+180-45
return x,y,W,α
"Proof" that it works
import matplotlib.pyplot as plt
realAngles = np.arange(90)
estimatedAngles = [computeFeatures(generate_rot_square(1000,1000,500,500,700,a))[3] for a in realAngles]
plt.plot(realAngles, estimatedAngles)
plt.plot(realAngles, realAngles)
plt.show()
Real and estimated angles overlap pretty well.