python python-3.x csv

Read csv with unquoted carriage return


I'm creating a CSV file in Python using csv.writer. I then try to read the file back, but because one of the values contains a carriage return, csv.reader doesn't parse the rows correctly: it treats the \r as a row separator. Example code:

import csv

value = 'a string with \r in it'

with open('test_file', 'wt', newline='', encoding='utf-8-sig') as f:
    csv_writer = csv.writer(f, lineterminator='\n', escapechar='"')
    csv_writer.writerow([value])


with open('test_file', 'rt', newline='', encoding='utf-8-sig') as f:
    csv_reader = csv.reader(f, lineterminator='\n', escapechar='"')
    value_from_csv_file = next(csv_reader)[0].strip()

print(f'value: {repr(value)}')
print(f'value_from_csv_file: {repr(value_from_csv_file)}')
assert value_from_csv_file == value # this fails

Is there a way to make this work without using QUOTE_ALL? Or is there a way to configure the reader so that it handles carriage returns the way it would if the file had been written with QUOTE_ALL? I don't understand why Python creates a file that it can't read back.

The file created looks like this:

bash-4.2# hexdump -c test_file
0000000 357 273 277   a       s   t   r   i   n   g       w   i   t   h
0000010      \r       i   n       i   t  \n
0000019

One line is written, containing a single value that includes a carriage return. The reader then interprets this as two lines, even though the carriage return should be treated as part of the value.

NOTE: I'm using Python 3.10.


Solution

  • csv.reader ignores lineterminator

    https://docs.python.org/3/library/csv.html#csv.Dialect.lineterminator

    Note: The reader is hard-coded to recognise either '\r' or '\n' as end-of-line, and ignores lineterminator. This behavior may change in the future.
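
    One possible workaround, as a minimal sketch rather than a definitive fix: since the reader always splits on a bare '\r', the value has to be quoted when it is written. The sketch below assumes you can leave the writer's lineterminator at its default of '\r\n' (instead of '\n' as in the question) and uses the default dialect without the custom escapechar. With that default, '\r' counts as a special character, so QUOTE_MINIMAL quotes only the fields that contain it. io.StringIO stands in for the file here purely for illustration.

```python
import csv
import io

value = 'a string with \r in it'

# The reader treats a bare, unquoted '\r' as end-of-line, whatever
# lineterminator is passed to it.
split_rows = list(csv.reader(io.StringIO('a\rb\n', newline='')))
print(split_rows)

# Write with the default lineterminator ('\r\n'). '\r' is then a special
# character for the writer, so QUOTE_MINIMAL quotes just this one field.
buffer = io.StringIO(newline='')
csv.writer(buffer).writerow([value, 'plain'])
written = buffer.getvalue()
print(repr(written))

# Read it back. newline='' leaves the embedded '\r' untranslated, and the
# reader keeps it because it sits inside a quoted field.
row = next(csv.reader(io.StringIO(written, newline='')))
print(row)
```

    Only the field containing the carriage return ends up quoted, so this avoids QUOTE_ALL. If the file must use '\n' as its line terminator, this sketch does not apply as written, and quoting the affected fields some other way (for example QUOTE_ALL) would still be needed.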