I'm trying to parallelize the following function:
    pub fn encode(&self, s: &String) -> String {
        s.chars()
            .par_iter() // error here
            .map(|c| Character::try_from(c))
            .enumerate()
            .map(|(n, c)| match c {
                Ok(plain) => self.encode_at(plain, n).into(),
                Err(e) => match e {
                    ParsingError::Charset(non_alphabetic) => non_alphabetic,
                    _ => unreachable!(),
                },
            })
            .collect()
    }
I get the following error when trying to go from the Chars iterator into a parallel iterator:
    the method `par_iter` exists for struct `std::str::Chars<'_>`, but its trait bounds were not satisfied
    the following trait bounds were not satisfied:
    `&std::str::Chars<'_>: IntoParallelIterator`
    which is required by `std::str::Chars<'_>: rayon::iter::IntoParallelRefIterator`
    rustc(E0599)
I would expect converting an iterator into a parallel iterator to be fairly trivial, but apparently it is not.
The problem is that characters in UTF-8 have variable size: ASCII characters take one byte, but all other characters take two to four bytes. This makes splitting a string for parallel processing awkward, since the middle byte of the underlying byte array may not be the middle character of the string, and may even fall inside a multi-byte character.
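To make the variable width concrete, here is a small std-only sketch. The helper name char_layout and the sample string are just illustrations, not anything from your code. It lists where each character starts and how many bytes it takes, then checks whether the middle byte is a valid place to cut.

```rust
/// Returns (byte offset, char, encoded width in bytes) for every char in `s`.
/// `char_layout` is a hypothetical helper name used only for this illustration.
fn char_layout(s: &str) -> Vec<(usize, char, usize)> {
    s.char_indices().map(|(i, c)| (i, c, c.len_utf8())).collect()
}

fn main() {
    // 'a' (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes), U+1F600 (4 bytes)
    let s = "a\u{e9}\u{20ac}\u{1f600}";
    for (offset, c, width) in char_layout(s) {
        println!("{c:?}: starts at byte {offset}, takes {width} byte(s)");
    }
    // 10 bytes but only 4 characters.
    println!("{} bytes, {} chars", s.len(), s.chars().count());
    // Byte 5 lies inside the 3-byte character that occupies bytes 3..6,
    // so the string cannot be sliced there.
    let mid = s.len() / 2;
    println!("byte {mid} is a char boundary: {}", s.is_char_boundary(mid));
}
```

Slicing with &s[..mid] at such a position would panic, which is why a parallel splitter cannot simply halve the byte length.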
That said, this does not make parallel processing impossible. The string does not need to be split evenly among workers, and from any position in a UTF-8 sequence you can find the start or end of the enclosing character, because continuation bytes always have the form 10xxxxxx and are therefore distinguishable from the first byte of a character.
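A rough sketch of that idea using only the standard library: pick tentative cut points by byte count, nudge each one forward to the next character boundary with str::is_char_boundary, and hand the pieces to scoped threads. The function names split_at_char_boundaries and parallel_uppercase are made up for this example, and uppercasing is only a placeholder for real per-character work.

```rust
use std::thread;

/// Splits `s` into at most `parts` pieces, moving every tentative cut
/// forward until it lands on a char boundary. Hypothetical helper.
fn split_at_char_boundaries(s: &str, parts: usize) -> Vec<&str> {
    let parts = parts.max(1);
    // Ceiling division so that we never produce more than `parts` pieces.
    let step = ((s.len() + parts - 1) / parts).max(1);
    let mut pieces = Vec::new();
    let mut start = 0;
    while start < s.len() {
        let mut end = (start + step).min(s.len());
        // A char is at most 4 bytes long, so this advances at most 3 times.
        while !s.is_char_boundary(end) {
            end += 1;
        }
        pieces.push(&s[start..end]);
        start = end;
    }
    pieces
}

/// Uppercases each piece on its own scoped thread and joins the results
/// in their original order.
fn parallel_uppercase(s: &str, parts: usize) -> String {
    let pieces = split_at_char_boundaries(s, parts);
    thread::scope(|scope| {
        let handles: Vec<_> = pieces
            .iter()
            .map(|&piece| scope.spawn(move || piece.to_uppercase()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .collect::<String>()
    })
}

fn main() {
    let s = "a\u{e9}\u{20ac}\u{1f600}";
    println!("{:?}", split_at_char_boundaries(s, 2));
    println!("{}", parallel_uppercase("h\u{e9}llo w\u{f6}rld", 3));
}
```

Note that the pieces are uneven in both bytes and characters, which is fine for independent per-character work. It does not by itself give each worker the character index of its first character, which your enumerate() step relies on.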
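So iterating over a string in parallel is possible, and rayon does provide it: the ParallelString trait (part of rayon::prelude) adds par_chars() directly to str, so you would call s.par_chars() rather than s.chars().par_iter(). The catch for your function is enumerate(): par_chars() is not an indexed parallel iterator, because a worker that starts in the middle of the string cannot know the index of its first character without counting everything before it. If you need the index, collect the characters into a Vec<char> first and call par_iter().enumerate() on that.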
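For a function like yours that needs each character's position, one workable pattern is to pay for a sequential pass that collects the characters into a Vec<char>, after which every chunk knows the index of its first element. Below is a minimal std-only sketch of that pattern. The encode_at here is a hypothetical stand-in (a position-dependent Caesar shift), since the real Character type and encode_at are not shown in the question, and the workers parameter is also an assumption.

```rust
use std::thread;

/// Hypothetical stand-in for the question's `encode_at`: shifts an ASCII
/// letter by its position `n`; anything else passes through unchanged.
fn encode_at(c: char, n: usize) -> char {
    let base = match c {
        'a'..='z' => b'a',
        'A'..='Z' => b'A',
        _ => return c,
    };
    let shifted = (c as u8 - base) as usize + n % 26;
    (base + (shifted % 26) as u8) as char
}

/// Encodes `s` using up to `workers` scoped threads.
fn encode(s: &str, workers: usize) -> String {
    // Sequential pass: after this, character indices are plain Vec indices.
    let chars: Vec<char> = s.chars().collect();
    let workers = workers.max(1);
    let chunk_len = ((chars.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        let handles: Vec<_> = chars
            .chunks(chunk_len)
            .enumerate()
            .map(|(chunk_no, chunk)| {
                // Character index of the first element of this chunk.
                let first = chunk_no * chunk_len;
                scope.spawn(move || {
                    chunk
                        .iter()
                        .enumerate()
                        .map(|(i, &c)| encode_at(c, first + i))
                        .collect::<String>()
                })
            })
            .collect();
        // Joining in spawn order keeps the output in input order.
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .collect::<String>()
    })
}

fn main() {
    println!("{}", encode("abc", 2));
}
```

Whether this beats the sequential version depends on how expensive the per-character work is. For something as cheap as a shift, the collect and the thread startup will likely dominate, so it is worth measuring before committing to it.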