I have a (big) boolean array and I'm looking for a way to fill True where it merges two sequences of True with minimal length.
For example:
a = np.array([True] *3 + [False] + [True] *4 + [False] *2 + [True] *2)
# a == array([ True, True, True, False, True, True, True, True, False, False, True, True])
closed_a = close(a, min_merge_size=2)
# closed_a == array([ True, True, True, True, True, True, True, True, False, False, True, True])
Here the False value at index [3] is converted to True because on both sides it has a sequence of at least 2 True elements. Conversely, elements [8] and [9] remain False because they don't have such a sequence on both sides.
I tried using scipy.ndimage.binary_closing with structure=[True, True, False, True, True] (and with False in the middle) but it doesn't give me what I need.
Any ideas?
This one was tough, but I was able to come up with something using itertools and more_itertools.
Similar to what you had, the idea is to take consecutive windows over the array and directly check whether each window equals the indicator sequence n * True, False, n * True.
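import numpy as np  # needed for the test array below
from itertools import chain
from more_itertools import windowed

def join_true_runs(seq, min_length_true=2):
    n = min_length_true
    # Pattern to look for: n * True, a single False, n * True
    sentinel = tuple(chain([True] * n, [False], [True] * n))
    # i is the start of a matching window, so the False sits at i + n
    indices = [
        i + n for i, w in enumerate(windowed(seq, 2 * n + 1)) if w == sentinel
    ]
    seq = seq.copy()  # optional: drop this line to modify the input in place
    seq[indices] = True
    return seq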
You should probably write some tests to check for corner cases, though it does seem to work on this test array:
arr = np.array([True, True, True, False, True, True, True, False, True, False, True, True, False, False, True, True])
# array is unchanged
assert all(join_true_runs(arr, 4) == arr)
# only position 3 is changed
list(join_true_runs(arr, 3) == arr)
# returns [True,
# True,
# True,
# False,
# True,
# ...
# ]
Of course, if you want to mutate the original array instead of returning a copy, you can do that too by leaving out the seq.copy() line.
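If more_itertools isn't available, or the array is big enough that a Python-level loop over windows gets slow, the same window check can probably be done with NumPy alone. The following is only a sketch under a few assumptions: the name join_true_runs_np is made up for this example, it relies on numpy.lib.stride_tricks.sliding_window_view (NumPy 1.20 or later), and like the version above it only fills a single False sitting between two runs of True.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20


def join_true_runs_np(arr, min_length_true=2):
    """Hypothetical vectorized variant: fill a single False that sits
    between two runs of at least `min_length_true` True values."""
    arr = np.asarray(arr, dtype=bool)
    n = min_length_true
    width = 2 * n + 1
    out = arr.copy()  # the input is left untouched
    if arr.size < width:
        return out  # too short to contain the pattern at all

    # Pattern to look for: n * True, False, n * True
    pattern = np.array([True] * n + [False] + [True] * n)

    # One row per window; a row matches when it equals the pattern everywhere.
    windows = sliding_window_view(arr, width)
    hits = np.flatnonzero((windows == pattern).all(axis=1))

    # Each hit is the start of a window, so the False sits n positions later.
    out[hits + n] = True
    return out


a = np.array([True] * 3 + [False] + [True] * 4 + [False] * 2 + [True] * 2)
closed = join_true_runs_np(a, 2)
# index 3 becomes True; indices 8 and 9 stay False
```

The window view itself doesn't copy data, but the comparison builds a temporary boolean array of roughly len(arr) * (2 * n + 1) elements, so for very large arrays and large n it may be worth processing the array in chunks.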