I set the default user agent in settings.py, but I still have to add the -s option with the corresponding value to set the user agent every time I use scrapy shell.
I know I can use an alias such as alias scrapys="scrapy shell -s USER_AGENT='xxxxx'" to do it, but is there a better way?
Setting USER_AGENT in settings.py should be enough. If that doesn't work for you, please provide more information (for example, print your project structure with the tree command).
To make sure settings.py is read by the scrapy shell ... command, check that:
1. You're running the command in the project root, where you can see a scrapy.cfg file.
2. The module path of settings.py is defined in scrapy.cfg:
[settings]
default = project_name.settings
Here project_name.settings is the module path to settings.py.
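If it is unclear which settings module the shell would pick up, a small standard-library check along these lines may help. This is only a sketch: find_settings_module is a hypothetical helper, not part of Scrapy, and walking up through parent directories is an assumption made here for convenience. It reads the [settings] section of the nearest scrapy.cfg and reports its default entry.

```python
import configparser
from pathlib import Path


def find_settings_module(start_dir="."):
    """Return (cfg_path, settings_module) for the nearest scrapy.cfg.

    Hypothetical helper: searches start_dir and then each parent directory.
    Returns (None, None) when no scrapy.cfg is found at all.
    """
    start = Path(start_dir).resolve()
    for directory in [start, *start.parents]:
        cfg_path = directory / "scrapy.cfg"
        if cfg_path.is_file():
            parser = configparser.ConfigParser()
            parser.read(cfg_path)
            # "default" under [settings] holds the dotted path to settings.py
            module = parser.get("settings", "default", fallback=None)
            return cfg_path, module
    return None, None


if __name__ == "__main__":
    cfg_path, module = find_settings_module()
    if cfg_path is None:
        print("No scrapy.cfg found here or in any parent directory.")
    else:
        print(f"{cfg_path} -> settings module: {module}")
```

Running it from the directory where you start scrapy shell shows whether a scrapy.cfg is visible from there. Running scrapy settings --get USER_AGENT from the same directory should then print the value Scrapy actually resolves.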
Use the spider class attribute Spider.custom_settings:
import scrapy


class MySpider(scrapy.Spider):
    name = 'myspider'
    # Per-spider settings; these take precedence over settings.py
    custom_settings = {
        'USER_AGENT': 'some value',
    }
This spider-specific settings dict, custom_settings, overrides the values in the global settings.py.