rustrayon

How to satisfy Rayon's trait bounds in matrix multiplication algorithm?


I implemented matrix multiplication using functional approach.

struct Matrix {
  vals: Vec<i32>,
  rows: usize,
  cols: usize,
}

fn mul(a: &Matrix, b: &Matrix) -> Matrix {
  assert_eq!(a.cols, b.rows);
  let vals = a
    .vals
    .par_chunks(a.cols)
    .map(|row| {
      (0..b.cols).map(|i| {
        row
          .iter()
          .zip(b.vals.iter().skip(i).step_by(b.cols))
          .map(|(a, b)| a * b)
          .sum()
      })
    })
    .flatten()
    .collect();
  Matrix {
    vals,
    rows: a.rows,
    cols: b.cols,
  }
}

I tried to parallelize this algorithm by using par_chunks from Rayon 1.8.1 instead of chunks, so each row is constructed in a separate thread. With chunks the algorithm works, but with par_chunks it results in two errors:

error[E0277]: the trait bound `std::iter::Map<std::ops::Range<usize>, {closure@src\main.rs:145:23: 145:26}>: rayon::iter::ParallelIterator` is not satisfied
   --> src\main.rs:153:6
    |
153 |     .flatten()
    |      ^^^^^^^ the trait `rayon::iter::ParallelIterator` is not implemented for `std::iter::Map<std::ops::Range<usize>, {closure@src\main.rs:145:23: 145:26}>`

and

error[E0599]: the method `collect` exists for struct `Flatten<Map<Chunks<'_, i32>, {closure@main.rs:144:10}>>`, but its 
trait bounds were not satisfied
   --> src\main.rs:154:6
    |
141 |     let vals = a
    |  ______________-
142 | |     .vals
143 | |     .par_chunks(a.cols)
144 | |     .map(|row| {
...   |
153 | |     .flatten()
154 | |     .collect();
    | |     -^^^^^^^ method cannot be called due to unsatisfied trait bounds
    | |_____|
    | 
    |
   ::: C:\rust\.rustup\toolchains\stable-x86_64-pc-windows-gnu\lib/rustlib/src/rust\library\core\src\iter\adapters\map.rs:62:1
    |
62  |   pub struct Map<I, F> {
    |   -------------------- doesn't satisfy `_: IntoParallelIterator`
    |
   ::: C:\rust\.cargo\registry\src\index.crates.io-6f17d22bba15001f\rayon-1.8.1\src\iter\flatten.rs:11:1
    |
11  |   pub struct Flatten<I: ParallelIterator> {
    |   ---------------------------------------
    |   |
    |   doesn't satisfy `_: Iterator`
    |   doesn't satisfy `_: ParallelIterator`

I have no idea what is going on. I suspect that it might be somehow related to sharing second matrix between threads but it is read-only and is guaranteed to stay in the scope until threads have finished their jobs. Error messages don't help at all. What am I doing wrong? How do I even read this error message?

UPD. Rayon's prelude is included.


Solution

  • The error messages are not great here (the content is mostly correct but the location is misleading), but luckily there's only a few things that can go wrong when Rayon's trait bounds are not satisfied.

    Here, flatten expects to find items that implement ParallelIterator. Instead, your map produces items which are a regular Iterator type. You can fix this by inserting an into_par_iter:

    .map(|row| {
        (0..b.cols).into_par_iter().map(|i| {
            row.iter()
                .zip(b.vals.iter().skip(i).step_by(b.cols))
                .map(|(a, b)| a * b)
                .sum()
        })
    })
    

    Rayon also has the flatten_iter method that is similar to flatten but works on regular Iterator types. This works well if the outer iterator already has enough items to occupy all your cores.

    .map(|row| {
        (0..b.cols).map(|i| {
            row.iter()
                .zip(b.vals.iter().skip(i).step_by(b.cols))
                .map(|(a, b)| a * b)
                .sum()
        })
    })
    .flatten_iter()
    

    And finally, there's flat_map_iter, which just combines map and flatten_iter into one function.

    .flat_map_iter(|row| {
        (0..b.cols).map(|i| {
            row.iter()
                .zip(b.vals.iter().skip(i).step_by(b.cols))
                .map(|(a, b)| a * b)
                .sum()
        })
    })
    // no .flatten()