I have an array
myArray = array(url1,url2,...,url90)
I want to execute this command 3 times in parallel:
scrapy crawl mySpider -a links=url
each time with a single URL:
scrapy crawl mySpider -a links=url1
scrapy crawl mySpider -a links=url2
scrapy crawl mySpider -a links=url3
and when the first one finishes its job, it should pick up the next URL, like:
scrapy crawl mySpider -a links=url4
I read this question and this one, and I tried this:
import subprocess
from threading import Thread

def func1(url):
    # run one crawl for a single URL and wait for it to exit
    subprocess.call(['scrapy', 'crawl', 'mySpider', '-a', 'links=' + url])

if __name__ == '__main__':
    myArray = array(url1,url2,...,url90)
    for url in myArray:
        Thread(target = func1(url)).start()
When you write target = func1(url) you are actually calling func1 right there and passing its return value to Thread (not a reference to the function). This means the functions run one after another in the loop, not in separate threads.
You need to rewrite it like this:
if __name__ == '__main__':
    myArray = array(url1,url2,...,url90)
    for url in myArray:
        # pass the function itself and its arguments separately
        Thread(target=func1, args=(url,)).start()
Then you are telling Thread to run func1 with the arguments (url,).
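To make the difference visible, here is a small sketch that records which thread the function actually runs in. The helper `record_thread`, the `seen` dict and the thread name "worker" are made up for this illustration and are not part of the original code.

```python
import threading

def record_thread(seen, label):
    # Remember the name of the thread this call actually runs in.
    seen[label] = threading.current_thread().name

seen = {}

# Wrong: record_thread(...) is called right here, in the calling thread.
# Thread only receives its return value (None) as target.
t1 = threading.Thread(target=record_thread(seen, "called"), name="worker")
t1.start()
t1.join()

# Right: Thread receives the function and its arguments and calls it itself.
t2 = threading.Thread(target=record_thread, args=(seen, "passed"), name="worker")
t2.start()
t2.join()

print(seen)
```

Only the "passed" entry ends up with the name of the worker thread; the "called" entry holds the name of whatever thread executed the line that created t1.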
Also, you should wait for the threads to finish after the loop (keep the Thread objects and call join() on each of them), otherwise the code after the loop carries on while the threads are still running.
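Waiting for the threads could look roughly like this. It is only a sketch: `work` is a hypothetical stand-in for func1 that sleeps briefly instead of starting a crawl.

```python
import time
from threading import Thread

def work(results, i):
    # Hypothetical stand-in for func1: pretend to do some work.
    time.sleep(0.01)
    results.append(i)

results = []

# Keep a reference to every thread so it can be joined later.
threads = [Thread(target=work, args=(results, i)) for i in range(5)]
for t in threads:
    t.start()

# Block until every thread has finished.
for t in threads:
    t.join()

print(sorted(results))
```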
EDIT: if you want only 3 threads running at the same time, you may want to use a ThreadPool:
if __name__ == '__main__':
    from multiprocessing.pool import ThreadPool
    # at most 3 calls to func1 run at any moment
    pool = ThreadPool(processes=3)
    # map() blocks until every url has been processed
    pool.map(func1, myArray)
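Putting it together, a minimal sketch of the whole thing might look like the following. It assumes func1 is meant to start the scrapy command from the question as a child process. The names `build_command`, `run_crawl` and `crawl_all`, and the `make_command` parameter, are invented here for illustration.

```python
import subprocess
from multiprocessing.pool import ThreadPool

def build_command(url):
    # The scrapy command from the question, one URL per run.
    return ["scrapy", "crawl", "mySpider", "-a", "links=" + url]

def run_crawl(url, make_command=build_command):
    # Blocks until the child process exits and returns its exit code.
    return subprocess.call(make_command(url))

def crawl_all(urls, workers=3, make_command=build_command):
    # At most `workers` commands run at once; when one finishes,
    # its thread takes the next URL from the list.
    pool = ThreadPool(processes=workers)
    try:
        return pool.map(lambda u: run_crawl(u, make_command), urls)
    finally:
        pool.close()
        pool.join()
```

Calling crawl_all(myArray) would then return one exit code per URL, in the same order as the input. Because each thread spends its time waiting on a child process, running threads rather than separate Python processes should be enough here.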