When using the Python crawling framework Scrapy, I only want to check the HTTP status codes of a few thousand domains and nothing else, as a fast and efficient initial crawl.
How can I send only HEAD requests instead of the default GET requests?
You can set the method argument of scrapy.Request:
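    def start_requests(self):
        # self.start_urls is the spider's list of domains to check
        for url in self.start_urls:
            yield scrapy.Request(
                url,
                method="HEAD",  # send HEAD instead of the default GET
            )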