rustrayon

Using rayon's parallel iterators for a tree object


I have a Vec<ValidRelationshipTree> and I would like to use rayon's parallel iteration to process each tree in parallel. I cannot do so as my ValidRelationshipTree doesn't satisfy std::marker::Sized.

Definition of ValidRelationshipTree, whose size is clearly known at compile time:

// size = 16 (0x10), align = 0x8
struct ValidRelationshipTree {
    root: RefCell<Rc<ValidRelationshipNode>>,
}

Compilation error:

the method `into_par_iter` exists for struct `Vec<ValidRelationshipTree>`, but its trait bounds were not satisfied
the following trait bounds were not satisfied:
`[metadata::relationships::ValidRelationshipTree]: std::marker::Sized`
which is required by `[metadata::relationships::ValidRelationshipTree]: rayon::iter::IntoParallelIterator`
`[metadata::relationships::ValidRelationshipTree]: rayon::iter::ParallelIterator`
which is required by `[metadata::relationships::ValidRelationshipTree]: rayon::iter::IntoParallelIterator`

Since my structure's size is known at compile time, I don't understand how std::marker::Sized isn't implemented. I've tried to read through this trait's documentation, researching similar rayon issues, but I just don't understand. I even tried creating a wrapper struct around my Vec<ValidRelationshipTree> but no dice.

If anyone could help fill in my gap of knowledge that would be greatly appreciated <3.

In case it's relevant, here is the definition of ValidRelationshipNode:

struct ValidRelationshipNode {
    payload: String,
    extended: String,                             
    parent: RefCell<Weak<ValidRelationshipNode>>, // Use Weak<T> to avoid reference cycles
    children: RefCell<Vec<Rc<ValidRelationshipNode>>>,
}

Concrete code example

The following sequential block compiles just fine:

let forest: Vec<ValidRelationshipTree> = self.generate_relationship_trees(n_tables, true);

let mut extended: Vec<String> = forest
    .into_iter()
    .flat_map(|tree: ValidRelationShipTree| {
        tree.nodes_with_length(n_tables)
            .into_iter()
            .map(|node: Rc<ValidRelationshipNode>| node.extended().clone())
    })
    .collect();

And a trivial test case that errors:

let forest: Vec<ValidRelationshipTree> = self.generate_relationship_trees(n_tables, true);

let extended_par: Vec<String> = forest
    .into_par_iter() // ERROR: the following trait bounds were not satisfied:
                     // [metadata::relationships::ValidRelationshipTree]: std::marker::Sized`
    .map(|node| node.extended().clone())
    .collect();

Solution

  • If you instead call into_par_iter as an associated function on Vec, you get a more helpful error:

    Vec::into_par_iter(forest)
    
    error[E0277]: `Rc<ValidRelationshipNode>` cannot be sent between threads safely
      --> src/lib.rs:44:5
       |
    44 |     Vec::into_par_iter(forest)
       |     ^^^ `Rc<ValidRelationshipNode>` cannot be sent between threads safely
       |
       = help: the trait `Send` is not implemented for `Rc<ValidRelationshipNode>`
       = help: the following other types implement trait `rayon::iter::IntoParallelIterator`:
                 Vec<T>
                 &'data Vec<T>
                 &'data mut Vec<T>
       = note: required for `RefCell<Rc<ValidRelationshipNode>>` to implement `Send`
    note: required because it appears within the type `ValidRelationshipTree`
      --> src/lib.rs:5:12
       |
    5  | pub struct ValidRelationshipTree {
       |            ^^^^^^^^^^^^^^^^^^^^^
       = note: required for `Vec<ValidRelationshipTree>` to implement `rayon::iter::IntoParallelIterator`
    

    The simplest way to get around this is to use Arc instead of Rc and Mutex or RwLock instead of RefCell.

    use std::sync::Mutex;
    use std::sync::{Arc, Weak};
    
    struct ValidRelationshipTree {
        root: Mutex<Arc<ValidRelationshipNode>>,
    }
    
    struct ValidRelationshipNode {
        payload: String,
        extended: String,
        parent: Mutex<Weak<ValidRelationshipNode>>,
        children: Mutex<Vec<Arc<ValidRelationshipNode>>>,
    }
    

    Then this code probably works, depending on the other functions involved:

    forest
        .into_par_iter()
        .flat_map(|tree| {
            tree.nodes_with_length(n_tables)
                .into_par_iter()
                .map(|node| node.extended().clone())
        })
        .collect()