In a python script, I need to detect the endline terminator of different csv files. These endline terminators could be: '\r' (mac), '\r\n' (windows), '\n' (unix).
I tried with:
dialecto = csv.Sniffer().sniff(csvfile.read(2048), delimiters=",;")
dialecto.lineterminator
But it doesn't work.
How I could do that?
EDIT:
Based on abarnert response:
def getLineterminator(file):
with open(file, 'rU') as csvfile:
csvfile.next()
return csvfile.newlines
You can't use the csv
module to auto-detect line terminators this way. The Sniffer
that you're using is designed to guess between CSV dialects for use by csv.Reader
. But, as the docs say, csv.Reader
actually ignores lineterminator
and handles line endings interchangeably, so Sniffer
doesn't have any reason to set it.
But really, a CSV file with a XXX line terminators is just a text file with XXX line terminators. The fact that it's CSV is irrelevant. Just open
the file in text mode, read a line out of it, and check its newlines
property:
next(file)
file.newlines
In Python 3, as long as you opened the file in text mode (don't use a 'b'
in the mode), this will work. In Python 2.x, you may need to specify universal newlines mode (don't use a 'b'
, and also do use a 'U'
). If you're writing code for both versions, you can use universal newlines mode, and it'll just be ignored in 3.x—but don't do that unless you need it, since it's deprecated as of 3.6 and may become an error one day.