Issue
I set the default user-agent in settings.py, but I still had to go to the trouble of adding the -s option and the corresponding value to set the user_agent every time I used the scrapy shell.
I know I can use commands like alias scrapys="scrapy shell -s USER_AGENT='xxxxx'" to do it, but is there any better way to implement it?
Solution
Solution 1
Setting USER_AGENT in settings.py should be sufficient for your needs. If this doesn't work for you, please provide more info (for example, print your project structure with the tree command).
To make sure settings.py is read by the scrapy shell ... command, check that:
- You're running the command in the project root, where you can see a scrapy.cfg file.
- The module path of settings.py is defined in scrapy.cfg:

    [settings]
    default = project_name.settings

  project_name.settings is the module path to settings.py.
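To check which settings module a given directory would resolve to, the two conditions above can be sketched with the standard library alone. This is only an approximation of the lookup, not Scrapy's own code: the helper names find_scrapy_cfg and settings_module are hypothetical, and the sketch assumes scrapy.cfg is searched for in the start directory and its parents while ignoring other sources such as the SCRAPY_SETTINGS_MODULE environment variable.

```python
import configparser
from pathlib import Path
from typing import Optional


def find_scrapy_cfg(start: Path) -> Optional[Path]:
    """Return the nearest scrapy.cfg in `start` or one of its parents (hypothetical helper)."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / "scrapy.cfg"
        if candidate.is_file():
            return candidate
    return None


def settings_module(start: Path = Path(".")) -> Optional[str]:
    """Read the `default` key of the [settings] section, e.g. 'project_name.settings'."""
    cfg_path = find_scrapy_cfg(start)
    if cfg_path is None:
        # No scrapy.cfg found: the directory is not inside a project.
        return None
    parser = configparser.ConfigParser()
    parser.read(cfg_path)
    # fallback=None covers a missing section or a missing key.
    return parser.get("settings", "default", fallback=None)
```

Calling settings_module() from the directory where you start scrapy shell should return the module path from scrapy.cfg. If it returns None, either no scrapy.cfg was found or the file has no [settings] default entry, and that would explain why USER_AGENT from settings.py is not being applied.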
Solution 2
Use the spider class attribute Spider.custom_settings.
    import scrapy


    class MySpider(scrapy.Spider):
        name = 'myspider'
        # Per-spider settings; these take precedence over settings.py
        custom_settings = {
            'USER_AGENT': 'some value',
        }
This spider-specific settings dict, custom_settings, overrides the values in the global settings.py.
Answered By - Simba