I am trying to create an iron.io worker using scrapy.
According to iron.io, all the dependencies for the code have to be packaged with the worker itself.
I have created a folder called module to hold all the third-party modules,
and installed Scrapy into it via pip:
pip install scrapy -t module/
When I try to run Scrapy via python module/scrapy/__init__.py,
I get:
Traceback (most recent call last):
  File "module/scrapy/__init__.py", line 10, in <module>
    __version__ = pkgutil.get_data(__package__, 'VERSION').decode('ascii').strip()
  File "/usr/lib/python2.7/pkgutil.py", line 578, in get_data
    loader = get_loader(package)
  File "/usr/lib/python2.7/pkgutil.py", line 464, in get_loader
    return find_loader(fullname)
  File "/usr/lib/python2.7/pkgutil.py", line 474, in find_loader
    for importer in iter_importers(fullname):
  File "/usr/lib/python2.7/pkgutil.py", line 424, in iter_importers
    if fullname.startswith('.'):
AttributeError: 'NoneType' object has no attribute 'startswith'
You'd probably be better off importing Scrapy from your IronWorker code rather than running it from the command line, as shown on the front page of http://scrapy.org/ or in the tutorial: http://doc.scrapy.org/en/0.24/intro/tutorial.html
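As for the traceback itself, the likely cause is that running a package's __init__.py directly as a script leaves __package__ set to None, so pkgutil.get_data(__package__, 'VERSION') has no package name to work with. Here is a minimal sketch that reproduces the difference with a throwaway package. The name demo_pkg and both helper functions are made up for illustration and are not part of Scrapy:

```python
import os
import subprocess
import sys
import tempfile


def package_seen_by(args, cwd):
    """Run a child Python interpreter with `args` and return what it printed."""
    out = subprocess.check_output([sys.executable] + args, cwd=cwd)
    return out.decode("ascii").strip()


def demo():
    """Return (value when run as a script, value when imported) of __package__."""
    root = tempfile.mkdtemp()
    pkg_dir = os.path.join(root, "demo_pkg")  # hypothetical package name
    os.mkdir(pkg_dir)
    init_path = os.path.join(pkg_dir, "__init__.py")
    with open(init_path, "w") as f:
        # Same shape as scrapy/__init__.py: an import, then a use of __package__.
        f.write("import pkgutil\nprint(repr(__package__))\n")

    # Case 1: execute the file directly, as in the question.
    as_script = package_seen_by([init_path], root)

    # Case 2: import the package normally.
    code = "import sys; sys.path.insert(0, %r); import demo_pkg" % root
    as_import = package_seen_by(["-c", code], root)
    return as_script, as_import


if __name__ == "__main__":
    print(demo())
```

The first value is what a directly executed __init__.py sees, and the second is what it sees when the package is imported. That difference is why importing scrapy is the safer route.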
To use this in IronWorker, after you've done the pip install, be sure to add:
pip 'scrapy'
to your .worker file. Then in your worker script, you'd import it:
import scrapy
Then use it as described in the tutorial linked above.
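If you would rather keep the copy you installed into the module/ folder from the question, one option is to put that folder on sys.path before the import. This is only a sketch: the helper name add_vendor_dir is hypothetical, and it assumes module/ sits in the worker's working directory.

```python
import os
import sys


def add_vendor_dir(dirname="module", base=None):
    """Prepend the vendored-packages directory to sys.path and return its path."""
    base = base or os.getcwd()  # assumption: module/ lives next to the worker script
    vendor = os.path.abspath(os.path.join(base, dirname))
    if vendor not in sys.path:
        sys.path.insert(0, vendor)
    return vendor


if __name__ == "__main__":
    add_vendor_dir()
    try:
        # Resolved from module/ if it was installed there with `pip install -t`.
        import scrapy
        print("scrapy imported from %s" % os.path.dirname(scrapy.__file__))
    except ImportError as exc:
        print("scrapy not found: %s" % exc)
```

Because the directory goes to the front of sys.path, the vendored copy takes precedence over any system-wide install of the same package.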