rustiteratortraitslazy-evaluationiterator-traits

How can I avoid unnecessary expensive operations in my iterator implementation when some values are ignored?


I have an Iterator implementation that looks like this:

struct MyType {
    // stuff
}

struct Snapshot {
    // stuff
}

impl MyType {
    pub fn iter(&mut self) -> MyIterator {
        MyIterator {
            foo: self
        }
    }
}

struct MyIterator<'a> {
    foo: &'a mut MyType
}

impl<'a> Iterator for MyIterator<'a> {
    type Item = Snapshot;

    fn next(&mut self) -> Option<Self::Item> {
        if self.cheap_condition() {
           self.cheap_mutation();  // Inexpensive
           let s: Snapshot = self.snapshot();  // Expensive copying
           Some(s)
        } else {
           None
        }
    }
}

This works fine if I want to use every instance of Snapshot generated, but if I want to use something like Iterator::step_by or Iterator::last, where I don't care about the intermediate Snapshots, the cost incurred by the generation of those unused values is a massive performance hit.

I could override every single method of Iterator such that the expensive operation only takes place when necessary, but I feel like there must be a simpler and more idiomatic way of doing this, or a way to lazily generate the Snapshot corresponding to an iteration from an intermediate type for Item if Iterator::next hasn't been called again.

I do not want to return references to MyType, because I want the Snapshots that I do generate to have independent lifetimes, enabling, for example, the result of Iterator::collect() to outlive the instance of MyType.


Solution

  • You don't have to override every Iterator method. The only relevant ones are nth, last, count, step_by, and skip. All the other methods require examining Self::Item in some way so you can't avoid generating Snapshots for those. Luckily step_by and skip use nth internally, so that really only leaves us with nth, count, and last which we must override:

    use core::marker::PhantomData;
    
    struct Snapshot;
    
    struct MyType<'a> {
        item: PhantomData<&'a ()>,
    }
    
    impl<'a> MyType<'a> {
        fn cheap_condition(&self) -> bool {
            todo!()
        }
        fn cheap_mutation(&mut self) {
            todo!()
        }
        fn snapshot(&self) -> Snapshot {
            Snapshot
        }
    }
    
    impl<'a> Iterator for MyType<'a> {
        type Item = Snapshot;
    
        fn next(&mut self) -> Option<Self::Item> {
            if self.cheap_condition() {
                self.cheap_mutation(); // Inexpensive
                Some(self.snapshot()) // Expensive
            } else {
                None
            }
        }
    
        // also covers skip & step_by methods
        fn nth(&mut self, n: usize) -> Option<Self::Item> {
            for _ in 0..n {
                if self.cheap_condition() {
                    self.cheap_mutation();
                } else {
                    return None;
                }
            }
            self.next()
        }
    
        fn count(mut self) -> usize
        where
            Self: Sized,
        {
            let mut count: usize = 0;
            while self.cheap_condition() {
                self.cheap_mutation();
                count += 1;
            }
            count
        }
    
        fn last(mut self) -> Option<Self::Item>
        where
            Self: Sized,
        {
            while self.cheap_condition() {
                self.cheap_mutation();
                if !self.cheap_condition() {
                    return Some(self.snapshot());
                }
            }
            None
        }
    }
    

    playground

    If you want to make other Iterator methods like filter lazy by checking the predicate against MyType<'a> instead of Snapshot you have to define your own Iterator-like trait for that, as all the other methods only work on Self::Item.