I have some urls, then i divide it with chunks and iterate them concurrently, saving results in vec using cloning:
pub fn collect_parm_list(url_list: Vec<String>) -> Result<Vec<String>, Box<dyn std::error::Error>> {
let shared_id_list = Arc::new(Mutex::new(vec![]));
for chunk in url_list.chunks(20) {
let mut handles = vec![];
for url in chunk {
let id_list = Arc::clone(&shared_id_list);
let handle = thread::spawn(move || {
// total: optionSale -> result -> total
let mut id_list_v = id_list.lock().unwrap();
id_list_v.push(url.to_string())
});
handles.push(handle);
}
// Wait for all threads to complete
for handle in handles {
handle.join().unwrap();
}
}
let x = shared_id_list.lock().unwrap();
Ok(x.to_vec())
}
I have error with borrowing data:
url_list
does not live long enough borrowed value does not live long enough
at: for chunk in url_list.chunks(20) {
Your problem is that .chunks()
does not return an iterator of owned things, it returns a slice. What that means in this case is that while you have a Vec<String>
, .chunks()
gives you an iterator of &[String]
, which means that url
is a &String
. You then attempt to give this to a closure you pass to thread::spawn
, but closures given to that function are required to be 'static
meaning they do not borrow from any locals. This is the source of the error: the closure you pass to thread::spawn
indirectly borrows from url_list
.
(Aside: The compiler admittedly doesn't do a good job of directing you to the thread::spawn
invocation as the source of the problem. argument requires that "url_list" is borrowed for "'static"
indicates that it does know the problem is the argument to thread::spawn
but I don't know why it doesn't explicitly highlight that invocation. This seems like a bug in the compiler diagnostics to me.)
Note that this is problematic even though you join all of the thread handles. The borrow checker doesn't know that this happens nor what it means; all it knows is that the closure given to thread::spawn
must be 'static
and it's not. (And even if it could know that, the thread running collect_parm_list
could panic when unwrapping the result of .join()
, which would cause it to drop url_list
and then the still-running threads have a use-after-free bug!)
A simple fix would be to add let url = url.to_owned();
before you spawn the thread, which will create an independent copy of url
and then the closure will capture that instead.
Alternatively, if you had some way to inform the borrow checker that the threads cannot outlive url_list
even if the current thread panics then it would be safe. This is exactly what scoped threads accomplish! Using scoped threads you can avoid making a copy of url
, but it also lets you eliminate the Arc
layer from your output list. (You still need the Mutex
to synchronize access.) This also allows you to use .into_inner()
on the Mutex
when the threads are done since you know at that point no threads still have access, so you can consume it without waiting for a lock.
The code for this approach winds up being much simpler, too:
pub fn collect_parm_list(url_list: Vec<String>) -> Vec<String> {
let id_list = Mutex::new(vec![]);
thread::scope(|s| {
for chunk in url_list.chunks(20) {
for url in chunk {
s.spawn(|| {
// total: optionSale -> result -> total
let mut id_list_v = id_list.lock().unwrap();
id_list_v.push(url.to_string())
});
}
}
});
id_list.into_inner().unwrap()
}