I have a script reading in a CSV file with very large fields:
# example from http://docs.python.org/3.3/library/csv.html?highlight=csv%20dictreader#examples
import csv
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
However, this throws the following error on some csv files:
_csv.Error: field larger than field limit (131072)
How can I analyze CSV files with huge fields? Skipping the lines with huge fields is not an option, as the data needs to be analyzed in subsequent steps.
The CSV file might contain very large fields; therefore, increase the field_size_limit:
import sys
import csv

# raise the field size limit to the largest value the interpreter can represent
csv.field_size_limit(sys.maxsize)
sys.maxsize works for Python 2.x and 3.x. sys.maxint would only work with Python 2.x (SO: what-is-sys-maxint-in-python-3).
As Geoff pointed out, the code above might result in the following error: OverflowError: Python int too large to convert to C long.
To circumvent this, you could use the following quick and dirty code (which should work on every system with Python 2 and Python 3):
import sys
import csv

maxInt = sys.maxsize

while True:
    # decrease the maxInt value by a factor of 10
    # as long as the OverflowError occurs.
    try:
        csv.field_size_limit(maxInt)
        break
    except OverflowError:
        maxInt = int(maxInt / 10)
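Putting it together, here is a minimal sketch of how the limit-raising loop and the actual reading can be combined; the helper name set_max_field_size_limit is just an illustrative wrapper, not part of the csv module:

import csv
import sys

def set_max_field_size_limit():
    """Raise csv.field_size_limit() as high as the platform allows."""
    max_int = sys.maxsize
    while True:
        try:
            csv.field_size_limit(max_int)
            return max_int
        except OverflowError:
            # sys.maxsize can exceed the C long that the csv module
            # expects, so shrink the value until the call succeeds.
            max_int = int(max_int / 10)

set_max_field_size_limit()

with open('some.csv', newline='') as f:
    for row in csv.reader(f):
        print(row)  # fields larger than the default 131072 limit now parse

Wrapping the loop in a small function keeps the call site readable and lets you reuse it across scripts that read the same kind of files.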