I'm creating a CSV file in Python using csv.writer. When I then read the file back, the csv reader doesn't parse the rows correctly because one of the values has a carriage return in it: the reader sees the \r as a row separator. Example code:
import csv

value = 'a string with \r in it'

with open('test_file', 'wt', newline='', encoding='utf-8-sig') as f:
    csv_writer = csv.writer(f, lineterminator='\n', escapechar='"')
    csv_writer.writerow([value])

with open('test_file', 'rt', newline='', encoding='utf-8-sig') as f:
    csv_reader = csv.reader(f, lineterminator='\n', escapechar='"')
    value_from_csv_file = next(csv_reader)[0].strip()

print(f'value: {repr(value)}')
print(f'value_from_csv_file: {repr(value_from_csv_file)}')
assert value_from_csv_file == value  # this fails
Is there a way to make this work without using QUOTE_ALL? Or is there a way to tell the reader to treat carriage returns as if they had been quoted with QUOTE_ALL? I don't understand why Python creates a file that it doesn't know how to read.
The file created looks like this:
bash-4.2# hexdump -c test_file
0000000 357 273 277   a       s   t   r   i   n   g       w   i   t   h
0000010      \r       i   n       i   t  \n
0000019
One line is created, containing one value that has a carriage return in it. The reader later interprets this as two lines, even though the carriage return should be treated as part of the value.
NOTE: I'm using Python 3.10.
csv.reader ignores lineterminator. From https://docs.python.org/3/library/csv.html#csv.Dialect.lineterminator:
Note: The reader is hard-coded to recognise either '\r' or '\n' as end-of-line, and ignores lineterminator. This behavior may change in the future.
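One possible workaround, as a minimal sketch rather than the only fix: leave the writer's lineterminator at its default of '\r\n'. Under the default QUOTE_MINIMAL, the writer quotes fields containing any character from lineterminator, so the value with '\r' ends up quoted and the reader keeps it inside one field. The helper name round_trip and the temporary file path are hypothetical, introduced only for this illustration; the escapechar argument from the question is dropped here as an assumption that it isn't needed.

```python
import csv
import os
import tempfile


def round_trip(value):
    """Write one single-field row to a CSV file and read it back.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, 'test_file')
        # Default lineterminator is '\r\n', so with QUOTE_MINIMAL the writer
        # quotes any field that contains '\r' or '\n'.
        with open(path, 'wt', newline='', encoding='utf-8-sig') as f:
            csv.writer(f).writerow([value])
        # newline='' passes the embedded '\r' through to the reader untouched.
        with open(path, 'rt', newline='', encoding='utf-8-sig') as f:
            return next(csv.reader(f))[0]


value = 'a string with \r in it'
assert round_trip(value) == value
```

This keeps QUOTE_MINIMAL, so only fields that actually contain a delimiter, a quote character, or a line-ending character get quoted. The trade-off is that rows in the file end with '\r\n' instead of '\n'.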