rust lifetime

How to express these lifetime bounds


I'm writing a "coiterator" trait in rust (and a bunch of adapters for it):

pub trait Consumer<Item> {
    type Output;

    fn eat(&mut self, item: Item) -> Option<()>;

    fn finish(self) -> Self::Output;

    fn pair_with_iterator(mut self, iterator: impl IntoIterator<Item = Item>) -> Self::Output
    where
        Self: Sized,
    {
        for item in iterator {
            if self.eat(item).is_none() {
                break;
            }
        }
        self.finish()
    }

    ...
}
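
To make the contract of eat and finish concrete, here is a minimal sketch of a consumer that stops early. `FirstN` and its `new` constructor are hypothetical names invented for this illustration; the trait is copied from above without the elided methods.

```rust
pub trait Consumer<Item> {
    type Output;

    fn eat(&mut self, item: Item) -> Option<()>;

    fn finish(self) -> Self::Output;

    fn pair_with_iterator(mut self, iterator: impl IntoIterator<Item = Item>) -> Self::Output
    where
        Self: Sized,
    {
        for item in iterator {
            if self.eat(item).is_none() {
                break;
            }
        }
        self.finish()
    }
}

/// Hypothetical consumer: keeps at most `limit` items.
pub struct FirstN<T> {
    limit: usize,
    items: Vec<T>,
}

impl<T> FirstN<T> {
    pub fn new(limit: usize) -> Self {
        FirstN { limit, items: Vec::new() }
    }
}

impl<T> Consumer<T> for FirstN<T> {
    type Output = Vec<T>;

    fn eat(&mut self, item: T) -> Option<()> {
        if self.items.len() >= self.limit {
            // Full: the offered item is dropped and `None` asks the driver to stop.
            return None;
        }
        self.items.push(item);
        Some(())
    }

    fn finish(self) -> Self::Output {
        self.items
    }
}

fn main() {
    // The source iterator is infinite; the consumer decides when to stop.
    let firsts = FirstN::new(3).pair_with_iterator(1..);
    println!("{firsts:?}"); // prints [1, 2, 3]
}
```

Returning None from eat plays the role that returning None from Iterator::next plays on the producing side: it ends the pairing, after which finish yields the result.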

Think of eat as the dual of Iterator::next and finish as the dual of IntoIterator::into_iter. Now, if I have a Consumer<T> and a Consumer<&T>, I can combine them to process the same items in parallel: each item is first shown to the second consumer by reference and then handed to the first by value. For example, you might have one type that collects an iterator into a HashSet and another that just counts the number of items. I know how to write that using just iterators, but consider the following general implementation:

struct Parallel<C, D> {
    by_move: C,
    by_ref: D,
}

// this is wrong, T should not need to live for 'static
impl<T: 'static, C: Consumer<T>, D: for<'a> Consumer<&'a T>> Consumer<T> for Parallel<C, D> {
    type Output = (C::Output, <D as Consumer<&'static T>>::Output);

    fn eat(&mut self, item: T) -> Option<()> {
        self.by_ref.eat(&item)?;
        self.by_move.eat(item)
    }

    fn finish(self) -> Self::Output {
        (self.by_move.finish(), self.by_ref.finish())
    }
}

As the comment notes, the lifetime bounds are not what they should be. The underlying issue is that there is no way (I think) to express that <D as Consumer<&'a T>>::Output does not depend on the lifetime 'a. I have also considered the following fixes, but I don't think any of these work:

Is there a way to write a more general version of this impl that doesn't need the 'static bound on T?


Solution

  • I believe you just need to put a lifetime parameter on either Consumer or Parallel, so that there is a named lifetime to use in the definition of Output. Parallel is probably the better place: it already has an intrinsic (albeit hidden) lifetime, whereas a lifetime on Consumer would force every implementer to mention it even when it is unused. Because a struct must use every lifetime parameter it declares, Parallel carries the lifetime in a PhantomData<&'a ()> field.

    use std::{collections::HashSet, hash::Hash, marker::PhantomData};
    
    pub trait Consumer<Item> {
        type Output;
    
        fn eat(&mut self, item: Item) -> Option<()>;
    
        fn finish(self) -> Self::Output;
    
        fn pair_with_iterator(mut self, iterator: impl IntoIterator<Item = Item>) -> Self::Output
        where
            Self: Sized,
        {
            for item in iterator {
                if self.eat(item).is_none() {
                    break;
                }
            }
            self.finish()
        }
    }
    
    struct Parallel<'a, C, D> {
        by_move: C,
        by_ref: D,
        phantom: PhantomData<&'a ()>,
    }
    
    impl<'a, T: 'a, C: Consumer<T>, D: for<'b> Consumer<&'b T>> Consumer<T> for Parallel<'a, C, D> {
        type Output = (C::Output, <D as Consumer<&'a T>>::Output);
    
        fn eat(&mut self, item: T) -> Option<()> {
            self.by_ref.eat(&item)?;
            self.by_move.eat(item)
        }
    
        fn finish(self) -> Self::Output {
            (self.by_move.finish(), self.by_ref.finish())
        }
    }
    
    impl<T: Hash + Eq> Consumer<T> for HashSet<T> {
        type Output = Self;
    
        fn eat(&mut self, item: T) -> Option<()> {
            self.insert(item);
            Some(())
        }
    
        fn finish(self) -> Self::Output {
            self
        }
    }
    
    impl<T> Consumer<T> for usize {
        type Output = Self;
    
        fn eat(&mut self, _item: T) -> Option<()> {
            *self += 1;
            Some(())
        }
    
        fn finish(self) -> Self::Output {
            self
        }
    }
    
    fn main() {
        let v = vec![0, 1, 2, 2, 0];
        let par = Parallel {
            by_move: HashSet::new(),
            by_ref: 0_usize,
            phantom: PhantomData,
        };
        let (set, count) = par.pair_with_iterator(v.iter());
        println!("saw {count:?} items; result is {set:?}");
    }
    
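    Output (the order of the elements in the set can differ between runs, since HashSet iteration order is unspecified):

    saw 5 items; result is {1, 2, 0}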
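
    A different technique, sketched here as an alternative to the phantom lifetime rather than as part of the solution above: give the struct a type parameter for the by-reference output and require Output = O inside the higher-ranked bound. Since the equality has to hold for every 'b, O cannot mention 'b, which is the "output does not depend on the lifetime" condition from the question. The names ParallelOut, O, out and collect_and_count are made up for this illustration, and I believe this compiles on stable Rust, but treat it as a sketch.

```rust
use std::{collections::HashSet, hash::Hash, marker::PhantomData};

pub trait Consumer<Item> {
    type Output;
    fn eat(&mut self, item: Item) -> Option<()>;
    fn finish(self) -> Self::Output;

    fn pair_with_iterator(mut self, iterator: impl IntoIterator<Item = Item>) -> Self::Output
    where
        Self: Sized,
    {
        for item in iterator {
            if self.eat(item).is_none() {
                break;
            }
        }
        self.finish()
    }
}

// Hypothetical variant of `Parallel`: `O` names the by-reference consumer's output.
struct ParallelOut<C, D, O> {
    by_move: C,
    by_ref: D,
    out: PhantomData<O>,
}

impl<T, O, C, D> Consumer<T> for ParallelOut<C, D, O>
where
    C: Consumer<T>,
    // `Output = O` must hold for every 'b, so the output cannot mention 'b.
    D: for<'b> Consumer<&'b T, Output = O>,
{
    type Output = (C::Output, O);

    fn eat(&mut self, item: T) -> Option<()> {
        self.by_ref.eat(&item)?;
        self.by_move.eat(item)
    }

    fn finish(self) -> Self::Output {
        (self.by_move.finish(), self.by_ref.finish())
    }
}

impl<T: Hash + Eq> Consumer<T> for HashSet<T> {
    type Output = Self;
    fn eat(&mut self, item: T) -> Option<()> {
        self.insert(item);
        Some(())
    }
    fn finish(self) -> Self::Output {
        self
    }
}

impl<T> Consumer<T> for usize {
    type Output = Self;
    fn eat(&mut self, _item: T) -> Option<()> {
        *self += 1;
        Some(())
    }
    fn finish(self) -> Self::Output {
        self
    }
}

// The items are `&str` slices borrowed from `words`, so `T` is not 'static here.
fn collect_and_count(words: &[String]) -> (HashSet<&str>, usize) {
    let par: ParallelOut<HashSet<&str>, usize, usize> = ParallelOut {
        by_move: HashSet::new(),
        by_ref: 0_usize,
        out: PhantomData,
    };
    par.pair_with_iterator(words.iter().map(String::as_str))
}

fn main() {
    let words: Vec<String> = ["a", "b", "a"].iter().map(|s| s.to_string()).collect();
    let (set, count) = collect_and_count(&words);
    println!("saw {count} items, {} distinct", set.len());
}
```

    The trade-off is an extra type parameter on the struct instead of a lifetime parameter. In exchange, Output is written without naming any lifetime at all.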