Is there any way to compress logs? I need to store them for some time for later debugging, and it would be nice to reduce their size. If there is no such method, how can I organize the compression process more efficiently?
You can compress the logs after the spider has finished running by putting the compression code in the spider's closed() method. See the sample below, where I compress the log file and then delete the original log file after compression. You can improve the code by adding some error handling; a sketch of that is shown after the sample.
import gzip
import os

import scrapy


class TestSpider(scrapy.Spider):
    name = 'test'
    allowed_domains = ['toscrape.com']
    start_urls = ['https://books.toscrape.com']
    custom_settings = {
        # Write the log to a file so we have something to compress later
        'LOG_FILE': 'scrapy.log'
    }

    def parse(self, response):
        yield {
            'url': response.url
        }

    def closed(self, reason):
        # Called when the spider finishes: gzip the log file,
        # then remove the uncompressed original.
        with open('scrapy.log', 'rb') as f_in, gzip.open('scrapy.log.gz', 'wb') as f_out:
            f_out.writelines(f_in)
        os.remove('scrapy.log')
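For the error handling mentioned above, here is a minimal sketch. The helper name compress_log and its arguments are made up for this example, and the except clause plus shutil.copyfileobj are assumptions rather than part of the original sample; the file name matches the spider above.

import gzip
import os
import shutil


def compress_log(path='scrapy.log', logger=None):
    # Hypothetical helper: gzip the log and delete the original
    # only if the compressed copy was written successfully.
    try:
        with open(path, 'rb') as f_in, gzip.open(path + '.gz', 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)
        os.remove(path)
    except OSError as exc:
        # e.g. the log file is missing or still locked by another process
        if logger is not None:
            logger.error('Could not compress %s: %s', path, exc)

The spider's closed() method would then just call compress_log(path='scrapy.log', logger=self.logger) instead of doing the compression inline.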