I'm writing a "coiterator" trait in Rust (and a bunch of adapters for it):
pub trait Consumer<Item> {
    type Output;
    fn eat(&mut self, item: Item) -> Option<()>;
    fn finish(self) -> Self::Output;
    fn pair_with_iterator(mut self, iterator: impl IntoIterator<Item = Item>) -> Self::Output
    where
        Self: Sized,
    {
        for item in iterator {
            if self.eat(item).is_none() {
                break;
            }
        }
        self.finish()
    }
    ...
}
Think of `eat` as the dual of `Iterator::next` and `finish` as the dual of `IntoIterator::into_iter`. Now, if I have a `Consumer<T>` and a `Consumer<&T>`, then I can combine them to process the same items in parallel. For example, you might have one type that collects an iterator into a `HashSet` and another that just counts the number of items. I know how to write that using just iterators, but consider the following general implementation:
struct Parallel<C, D> {
    by_move: C,
    by_ref: D,
}

// this is wrong, T should not need to live for 'static
impl<T: 'static, C: Consumer<T>, D: for<'a> Consumer<&'a T>> Consumer<T> for Parallel<C, D> {
    type Output = (C::Output, <D as Consumer<&'static T>>::Output);
    fn eat(&mut self, item: T) -> Option<()> {
        self.by_ref.eat(&item)?;
        self.by_move.eat(item)
    }
    fn finish(self) -> Self::Output {
        (self.by_move.finish(), self.by_ref.finish())
    }
}
But as the comment says, the lifetime bounds are not what they should be. The underlying issue is that there's no way (I think) to represent that `<D as Consumer<&'a T>>::Output` should not depend on the lifetime `'a`. I have also considered the following fixes, but I don't think any of these work:

- Make `Item` an associated type rather than an input type to `Consumer`. Then you just cannot have a `D` that accepts `&'a T` for multiple `'a`, and I don't see how to implement `eat` for `Parallel`.
- `D as Consumer<&'static T>` could have, instead of `'static`, whatever the lifetime of `T` is, but I again think there's no way to express that. Similarly for a hypothetical `'zero` or `'temp` lifetime (the opposite of `'static`) so that `T: 'zero` would be automatic for any `T`.
- Replace `'static` by some generic lifetime `'b`, but then it would be unconstrained.

Is there a way to write a more general version of this `impl` that doesn't need the bound on `T`?
I believe you just need to place a lifetime on either `Consumer` or `Parallel` so that you have a named lifetime to use in the definition of `Output`. `Parallel` probably makes more sense because it has an intrinsic (albeit hidden) lifetime, whereas if the lifetime were placed on `Consumer`, every implementer would need to reference it even if they didn't make use of it.
use std::{collections::HashSet, hash::Hash, marker::PhantomData};

pub trait Consumer<Item> {
    type Output;
    fn eat(&mut self, item: Item) -> Option<()>;
    fn finish(self) -> Self::Output;
    fn pair_with_iterator(mut self, iterator: impl IntoIterator<Item = Item>) -> Self::Output
    where
        Self: Sized,
    {
        for item in iterator {
            if self.eat(item).is_none() {
                break;
            }
        }
        self.finish()
    }
}

struct Parallel<'a, C, D> {
    by_move: C,
    by_ref: D,
    phantom: PhantomData<&'a ()>,
}

impl<'a, T: 'a, C: Consumer<T>, D: for<'b> Consumer<&'b T>> Consumer<T> for Parallel<'a, C, D> {
    type Output = (C::Output, <D as Consumer<&'a T>>::Output);
    fn eat(&mut self, item: T) -> Option<()> {
        self.by_ref.eat(&item)?;
        self.by_move.eat(item)
    }
    fn finish(self) -> Self::Output {
        (self.by_move.finish(), self.by_ref.finish())
    }
}

impl<T: Hash + Eq> Consumer<T> for HashSet<T> {
    type Output = Self;
    fn eat(&mut self, item: T) -> Option<()> {
        self.insert(item);
        Some(())
    }
    fn finish(self) -> Self::Output {
        self
    }
}

impl<T> Consumer<T> for usize {
    type Output = Self;
    fn eat(&mut self, _item: T) -> Option<()> {
        *self += 1;
        Some(())
    }
    fn finish(self) -> Self::Output {
        self
    }
}

fn main() {
    let v = vec![0, 1, 2, 2, 0];
    let par = Parallel {
        by_move: HashSet::new(),
        by_ref: 0_usize,
        phantom: PhantomData,
    };
    let (set, count) = par.pair_with_iterator(v.iter());
    println!("saw {count:?} items; result is {set:?}");
}
saw 5 items; result is {1, 2, 0}
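
The `phantom` field exists only to carry `'a`, so callers arguably should not have to write it out. Here is a minimal sketch of one way to hide it, assuming a hypothetical `Parallel::new` constructor and a hypothetical helper `distinct_and_count` (neither appears in the code above). The helper feeds in `&str` items borrowed from a caller-owned slice, so it also checks that `T` does not need to be `'static`:

```rust
use std::{collections::HashSet, hash::Hash, marker::PhantomData};

pub trait Consumer<Item> {
    type Output;
    fn eat(&mut self, item: Item) -> Option<()>;
    fn finish(self) -> Self::Output;
    fn pair_with_iterator(mut self, iterator: impl IntoIterator<Item = Item>) -> Self::Output
    where
        Self: Sized,
    {
        for item in iterator {
            if self.eat(item).is_none() {
                break;
            }
        }
        self.finish()
    }
}

struct Parallel<'a, C, D> {
    by_move: C,
    by_ref: D,
    phantom: PhantomData<&'a ()>,
}

impl<'a, C, D> Parallel<'a, C, D> {
    // Hypothetical constructor (an assumption, not part of the answer):
    // it fills in the PhantomData so callers never mention it.
    fn new(by_move: C, by_ref: D) -> Self {
        Parallel { by_move, by_ref, phantom: PhantomData }
    }
}

// Same impl as in the answer: 'a names the lifetime used in `Output`.
impl<'a, T: 'a, C: Consumer<T>, D: for<'b> Consumer<&'b T>> Consumer<T> for Parallel<'a, C, D> {
    type Output = (C::Output, <D as Consumer<&'a T>>::Output);
    fn eat(&mut self, item: T) -> Option<()> {
        self.by_ref.eat(&item)?;
        self.by_move.eat(item)
    }
    fn finish(self) -> Self::Output {
        (self.by_move.finish(), self.by_ref.finish())
    }
}

impl<T: Hash + Eq> Consumer<T> for HashSet<T> {
    type Output = Self;
    fn eat(&mut self, item: T) -> Option<()> {
        self.insert(item);
        Some(())
    }
    fn finish(self) -> Self::Output {
        self
    }
}

impl<T> Consumer<T> for usize {
    type Output = Self;
    fn eat(&mut self, _item: T) -> Option<()> {
        *self += 1;
        Some(())
    }
    fn finish(self) -> Self::Output {
        self
    }
}

// Hypothetical helper: the items are `&str` borrowed from `words`,
// so `T` here is a non-'static reference type.
fn distinct_and_count(words: &[String]) -> (HashSet<&str>, usize) {
    Parallel::new(HashSet::new(), 0_usize).pair_with_iterator(words.iter().map(|s| s.as_str()))
}

fn main() {
    let words: Vec<String> = ["a", "b", "a", "c"].iter().map(|s| s.to_string()).collect();
    let (set, count) = distinct_and_count(&words);
    // Sort before printing, since HashSet iteration order is unspecified.
    let mut distinct: Vec<&str> = set.into_iter().collect();
    distinct.sort();
    println!("saw {count} items; distinct: {distinct:?}");
}
```

The lifetime `'a` is left for inference at the call site; the only requirement the impl places on it is `T: 'a`.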