
How can I proxy Scrapy requests through SOCKS5?


Question:

How can I proxy Scrapy requests through a SOCKS5 proxy?

I know I can use Polipo to convert a SOCKS proxy into an HTTP proxy.

But I would rather set up a middleware or change something in scrapy.Request:

import scrapy

class BaseSpider(scrapy.Spider):
    """A base class that implements the major functionality of the crawling application."""
    start_urls = ['https://google.com']  # must be a list; a bare string would be iterated character by character

    def start_requests(self):

        proxies = {
            'http': 'socks5://127.0.0.1:1080',
            'https': 'socks5://127.0.0.1:1080'
        }

        for url in self.start_urls:
            yield scrapy.Request(
                url=url,
                callback=self.parse,
                meta={'proxy': proxies}  # problem: 'proxy' must be a single URL string, not a dict
            )

    def parse(self, response):
        # do ...
        pass

What should I assign to the proxies variable?


Solution

  • This is currently not possible: Scrapy does not support SOCKS5 proxies natively. There is a feature request for it.
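
In the meantime, the Polipo route mentioned in the question can at least be wired in through a middleware instead of being repeated in every scrapy.Request. The following is only a minimal sketch, not an official Scrapy recipe. It assumes an HTTP proxy (such as Polipo) is already running locally and forwarding to the SOCKS5 server at 127.0.0.1:1080. The class name HttpBridgeProxyMiddleware and the address 127.0.0.1:8123 (Polipo's default port) are assumptions for illustration. Note that meta['proxy'] receives one URL string, not a dict.

```python
# Hypothetical downloader middleware: sends every request through a local
# HTTP proxy that forwards the traffic to the SOCKS5 server.
# Assumption: the HTTP proxy (e.g. Polipo) listens on 127.0.0.1:8123.


class HttpBridgeProxyMiddleware:
    """Set request.meta['proxy'] to a single HTTP proxy URL string."""

    # Assumed address of the local HTTP-to-SOCKS5 bridge.
    proxy_url = 'http://127.0.0.1:8123'

    def process_request(self, request, spider):
        # Keep any proxy that was already set on the individual request.
        request.meta.setdefault('proxy', self.proxy_url)
        # Returning None lets Scrapy continue processing the request.
        return None
```

To try it, the class would be registered in the project's DOWNLOADER_MIDDLEWARES setting with an order number below 750, so that it runs before Scrapy's built-in HttpProxyMiddleware. The module path used there depends on your own project layout.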