I'm working on an image-processing library that divides an image and analyses it part by part.
I have an image of a given height H and width W (in pixels), and want to divide it by a factor of n (n parts horizontally and n parts vertically, so a total of n**2 parts). For example, a 320x240 image with n = 6 would be split into a 6x6 grid (diagram not shown: w and h mark the width and height of one inner rectangle).
So, in short, I want a function f(x, y) that can tell me to which of these rectangles a given coordinate or pixel belongs. E.g., f(0, 0) == 0, f(160, 120) == 21, f(319, 239) == 35; alternatively, the coordinates/offsets of the containing rectangle: f(0, 0) == {0, 0}, f(160, 120) == {3, 3}, f(319, 239) == {5, 5}. Coordinates always start from the upper left corner at {0, 0} (the usual way for image manipulation libraries I've seen), and x and y are always integers, though of course they may be converted afterwards.
For me, the obvious solution would be to get the width and height of a single inner rectangle (w and h in the diagram) and do two simple divisions to get the coordinates.
rectangle.h := rectangle.H / n
rectangle.w := rectangle.W / n
func f(x, y) {
    xCoord := floor(x / rectangle.w)
    yCoord := floor(y / rectangle.h)
    return {xCoord, yCoord}
}
I think this is mathematically correct for the problem (tell me if it's not), but I'm not sure how this would interact with different concrete programming types. For example, for f(319, 239), if I were to save rectangle.h as an integer, no problem: rectangle.h == 40 and yCoord == 5. But what happens with xCoord? If rectangle.w was an integer, then rectangle.w == 53, and floor(319 / rectangle.w) == 6, which is out of range. If everything is floating point, then rectangle.w == 53.333333333333336 and floor(319 / rectangle.w) == 5 as I want, but now I'm worried about floating point errors in some corner case I haven't thought of. This is not my area of expertise, so I don't know if the concern is reasonable. I could also just take out the big guns and use a Decimal or Rational type, but this being an image library that I expect will see heavy use (this is a hobby project, so no hard requirements, but at worst it may expect batch processing of hundreds of images), I want to make it as efficient as reasonable and stick to basic data types if possible.
I could also just store all the rectangles as a structure and linearly check if the given coordinates are in any of them, but that seems inefficient and unnecessarily brutish for this problem. Similar problems I've found suggest the use of R-trees but, on the contrary, that seems overkill for this version of the problem.
Basically, I'm asking if there's a non-obvious way to solve this problem while keeping the code efficient, given its purpose. If there's a standard or "good practice" for it, that would be ideal. Also, as I said, I'm an expert in neither image manipulation software nor floating point arithmetic. Am I being paranoid about the possibility of floating point errors? Would using a bignum-like datatype be too much of a cost for this application?
In case it's relevant, this is in Go, but I'm looking for agnostic answers, if at all possible.
You have a big range, say W = 100, that you split into N = 7 equal-length segments, each supposedly being W/N long.
You have a position in that range, i, and you want to know in which segment that falls.
You would calculate i / (W/N) = i * (N/W), which is bad, or the improved expression
(i * N) / W
This is improved because the division, being the critical operation that introduces numerical error, comes last. With integers, the multiplication is exact (as long as i * N fits in the integer type) and won't introduce error.
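To make the difference concrete, here is a small sketch using the numbers from the question (width 320, 6 columns). The function names `index_divide_first` and `index_divide_last` are made up for this illustration; both use integer arithmetic only.

```python
W, N = 320, 6  # image width and number of columns, taken from the question


def index_divide_first(i):
    # The segment width is truncated to an integer first (320 // 6 == 53),
    # and that truncation error is then amplified by the second division.
    return i // (W // N)


def index_divide_last(i):
    # Exact integer multiplication first, a single truncating division last.
    return (i * N) // W


print(index_divide_first(319))  # 6, which is out of range for N = 6
print(index_divide_last(319))   # 5
```

With the division last, every position from 0 to W-1 maps to an index between 0 and N-1.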
You should do that with integer arithmetic, because integer division floors implicitly (for non-negative operands). Now you have the index of that interval, ranging from 0 to N-1.
This is arithmetically correct.
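Applied to the two-dimensional problem in the question, this could look roughly like the sketch below. The names `cell_of` and `flat_index` and the bounds check are assumptions made for this example, not part of any library.

```python
def cell_of(x, y, width, height, n):
    """Return (column, row) of the grid cell containing pixel (x, y)."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("pixel outside the image")
    # Multiply first, divide last, all in integer arithmetic.
    return (x * n) // width, (y * n) // height


def flat_index(x, y, width, height, n):
    """Return the row-major cell number, from 0 to n*n - 1."""
    col, row = cell_of(x, y, width, height, n)
    return row * n + col


print(cell_of(160, 120, 320, 240, 6))     # (3, 3)
print(flat_index(160, 120, 320, 240, 6))  # 21
print(flat_index(319, 239, 320, 240, 6))  # 35
```

The same expressions should carry over to Go with int operands, since Go's integer division truncates toward zero, which is the same as flooring for non-negative values. With fixed-width integers, x * n has to fit in the type, which is not a concern for realistic image sizes and 64-bit ints.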
If you needed to know the bounds of those intervals, sure that can also be done.
You want the sum of those pieces to equal the whole width: (W / N) * N = W
If you just calculated W/N and summed or multiplied that, regardless of floating point or integer arithmetic, that has a chance of not hitting the target W.
You should do the division last, the summing/multiplication first.
You have a choice between integer and floating point arithmetic, which is mostly a difference in rounding.
Here's some Python to demonstrate the arithmetic.
>>> import numpy as np
>>> W = 100; N = 7
>>> W/N
14.285714285714286
>>> K = np.arange(N+1); K
array([0, 1, 2, 3, 4, 5, 6, 7])
>>> (K * W) // N; np.diff(_)
array([ 0, 14, 28, 42, 57, 71, 85, 100])
array([14, 14, 14, 15, 14, 14, 15])
>>> (K * W / N).round().astype(int); np.diff(_)
array([ 0, 14, 29, 43, 57, 71, 86, 100])
array([14, 15, 14, 14, 14, 15, 14])
If you needed to iteratively sum things, you could do this:
>>> acc = 0
>>> for k in K:
...     print(f"{acc // N:3d} {round(acc / N):3d}")
...     acc += W
...
  0   0
 14  14
 28  29
 42  43
 57  57
 71  71
 85  86
100 100
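One caveat, offered as a sketch rather than as part of the demonstration above: the floored boundaries (K * W) // N are not quite the boundaries implied by the index formula (i * N) // W. For W = 100 and N = 7, position 14 gets index 0 from the formula, while the floored boundary list starts segment 1 at 14. If the bounds need to agree exactly with the index formula, rounding the boundaries up seems to be the matching choice. The helper name `segment_start` is made up for this example.

```python
def segment_start(k, W, N):
    # Smallest integer i with (i * N) // W >= k, i.e. ceil(k * W / N),
    # computed with integers only (negate, floor-divide, negate).
    return -((-k * W) // N)


W, N = 100, 7
starts = [segment_start(k, W, N) for k in range(N + 1)]
print(starts)  # [0, 15, 29, 43, 58, 72, 86, 100]

# Every position lies inside the segment that the index formula assigns to it.
for i in range(W):
    k = (i * N) // W
    assert starts[k] <= i < starts[k + 1]
```

Either rounding gives segment lengths that differ by at most one and add up to W; what matters is using the same convention for the lookup and for the bounds.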