python, csv, scrapy

Modifying CSV export in Scrapy


I seem to be missing something very simple. All I want to do is use ; as the delimiter in the CSV exporter instead of ,.

I know the CSV exporter passes its kwargs on to the csv writer, but I can't figure out how to pass the delimiter to the exporter in the first place.

I am calling my spider like so:

scrapy crawl spidername --set FEED_URI=output.csv --set FEED_FORMAT=csv 

Solution

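  • In contrib/feedexport.py, the exporter is instantiated with only the output file, so no extra keyword arguments (such as a delimiter) ever reach it: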

    class FeedExporter(object):
    
        ...
    
        def open_spider(self, spider):
            file = TemporaryFile(prefix='feed-')
            exp = self._get_exporter(file)  # <-- this is where the exporter is instantiated
            exp.start_exporting()
            self.slots[spider] = SpiderSlot(file, exp)
    
        def _get_exporter(self, *a, **kw):
            return self.exporters[self.format](*a, **kw)  # <-- not passed in :(
    

    You will need to write your own exporter. Here is an example:

    from scrapy.conf import settings
    from scrapy.contrib.exporter import CsvItemExporter
    
    
    class CsvOptionRespectingItemExporter(CsvItemExporter):
    
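        def __init__(self, *args, **kwargs):
            # Read the delimiter from the Scrapy settings, defaulting to a comma.
            delimiter = settings.get('CSV_DELIMITER', ',')
            # CsvItemExporter forwards unknown kwargs to csv.writer.
            kwargs['delimiter'] = delimiter
            super(CsvOptionRespectingItemExporter, self).__init__(*args, **kwargs)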
    

    In the settings.py file of your Scrapy project, register it as the exporter for the csv format:

    FEED_EXPORTERS = {
        'csv': 'importable.path.to.CsvOptionRespectingItemExporter',
    }
    

    Now, you can execute your spider as follows:

    scrapy crawl spidername --set FEED_URI=output.csv --set FEED_FORMAT=csv --set CSV_DELIMITER=';'
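
To see what the forwarded keyword argument does once it reaches the csv writer, here is a minimal standard-library sketch. It does not use Scrapy at all; the helper name write_rows and the sample rows are made up for illustration, and the helper only mimics the "pass kwargs through to csv.writer" behaviour described in the question.

```python
import csv
import io


def write_rows(rows, **csv_kwargs):
    """Write rows with csv.writer, forwarding keyword arguments unchanged.

    Hypothetical helper: it stands in for an exporter that hands its
    extra kwargs straight to csv.writer.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, **csv_kwargs)
    writer.writerows(rows)
    return buffer.getvalue()


# Sample data (made up); the second row contains a comma on purpose.
rows = [["name", "price"], ["widget", "1,50"]]

default_output = write_rows(rows)                    # comma-delimited
semicolon_output = write_rows(rows, delimiter=";")   # semicolon-delimited

print(default_output)
print(semicolon_output)
```

With the default delimiter, the field containing a comma has to be quoted; with ; as the delimiter it is written as-is, which is often the reason for switching delimiters in the first place.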