
Python: Filter iterable class


Is there a hook (a dunder method) that an iterable class can define so that the built-in filter function can be applied to iterable classes, not just to their instances?

Of course, one can write a custom filter_iter function, such as:

from typing import Callable

def filter_iter(filt_func: Callable, collection_cls: type) -> type:
    name = 'Filtered' + collection_cls.__name__  # would this automatic scheme lead to namespace conflicts?
    # Build a subclass on the fly; staticmethod keeps filt_func from being bound to self
    wrapped_cls = type(name, (collection_cls,), {'_filt_func': staticmethod(filt_func)})
    def __iter__(self):
        yield from filter(self._filt_func, super(wrapped_cls, self).__iter__())
    wrapped_cls.__iter__ = __iter__
    return wrapped_cls

which would have the desired effect. For example,

from collections.abc import Iterable  # the ABCs live in collections.abc; the old collections alias was removed in Python 3.10
class Chunker(Iterable):
    def __init__(self, source: Iterable, chk_size: int = 2):
        self._source = source
        self._chk_size = chk_size
    def __iter__(self):
        yield from zip(*([iter(self._source)] * self._chk_size))


chunker = Chunker(range(12), 2)
assert list(chunker) == [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, 11)]
FilteredChunker = filter_iter(lambda x: sum(x) % 3 == 0, Chunker)
filtered_chunker = FilteredChunker(range(12))
assert list(filtered_chunker) == [(4, 5), (10, 11)]

But, just as there's an __iter__ hook that determines how to iterate over an object (for example, how list should behave when called on the object), is there a sort of __filter__ hook to determine how filter should behave when called on that object?

If not, what are the best practices or standards around filtering iterables?


Solution

  • No, there is no such hook for filter. list has __iter__ to rely on, but filter has no dunder of its own: it is just an application of the iterator protocol (it iterates over whatever it is given), not a separate protocol in and of itself.

    So as not to leave you empty-handed, here is a more concise version of the filter_iter you proposed. It dynamically subclasses the given class and composes its __iter__ method with filter.

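    def filter_iter(p, cls):
        # p is the predicate; cls is the iterable class to wrap
        class _(cls):
            def __iter__(self):
                # Delegate to the parent's iteration, keeping only items that satisfy p
                yield from filter(p, super().__iter__())
        return _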
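
To make both points concrete, here is a minimal sketch that reuses the Chunker class from the question. The predicate name keep and the variable names instance_result and class_result are made up for illustration. It shows that plain filter already works on any instance that defines __iter__, and that the subclass returned by filter_iter gives the same result at class level.

```python
from collections.abc import Iterable


def filter_iter(p, cls):
    """Return a subclass of cls whose iteration yields only items satisfying p."""
    class _(cls):
        def __iter__(self):
            yield from filter(p, super().__iter__())
    return _


class Chunker(Iterable):
    """Yield consecutive, non-overlapping tuples of chk_size items from source."""
    def __init__(self, source, chk_size=2):
        self._source = source
        self._chk_size = chk_size

    def __iter__(self):
        yield from zip(*([iter(self._source)] * self._chk_size))


def keep(chunk):
    # Hypothetical predicate: keep chunks whose sum is divisible by 3.
    return sum(chunk) % 3 == 0


# Instance level: filter() only needs iter(obj) to work, so no extra hook is involved.
instance_result = list(filter(keep, Chunker(range(12))))

# Class level: every instance of the generated subclass filters itself on iteration.
FilteredChunker = filter_iter(keep, Chunker)
class_result = list(FilteredChunker(range(12)))
```

Both lists come out as [(4, 5), (10, 11)], matching the assertion in the question. Note that every generated class is named _, so if readable reprs matter you may want to set its __name__ yourself, as the original filter_iter did.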