The goals of this script are simple:
- read in a .csv file
- strip out instances of the escaped entity &amp; and replace each one with &
- eliminate all rows that don't satisfy the following criteria:
  - the line has exactly the expected number of columns, no more and no fewer
  - no column is blank, null, empty, or whitespace-only; if any column is, eliminate that row
The code looks like this:
import csv
num_headers = 9
starts = 1
def url_escaper(data):
    for line in data:
        yield line.replace('&amp;', '&')
with open("adzuna_input.csv", 'r') as file_in, open("adzuna_output.csv", 'w') as file_out:
    csv_in = csv.reader(url_escaper(file_in))
    csv_out = csv.writer(file_out)
    for i, row in enumerate(csv_in, starts):
        counter = 1
        if len(row) == num_headers:
            for element in row:
                if element.strip():
                    counter += 1
        if counter == num_headers:
        csv_out.writerow(row)
        else:
            print "line %d is malformed" % i
Earlier I had it working, but the last condition (eliminate any row that has a blank, null, empty, or whitespace-only column) is giving me trouble, and I don't know what to do about it.
My solution was:
for i, row in enumerate(csv_in, starts):
    counter = 1
    if len(row) == num_headers:
        for element in row:
            if element.strip():
                counter += 1
that is, going through each row, looking at its values, and stripping them as a way of assessing whether each field has some useful information in it (a string, an int, some text).
However, this is not working.
The exact error message I'm getting is about the indentation of csv_out.writerow(row), but I suspect that is not the real problem.
Exact message:
  File "validator.py", line 23
    csv_out.writerow(row)
          ^
IndentationError: expected an indented block
I would like to know why the above program does not execute.
You actually need to format your code properly:
for element in row:
    if element.strip():
        counter += 1
if counter == num_headers:
    csv_out.writerow(row)
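The line with csv_out.writerow is indented with 8 spaces, so either the if is underindented or csv_out.writerow is overindented. Python expects the body of an if to be indented more deeply than the if line itself, which is why it reports "expected an indented block".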
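Once the indentation is fixed, the counter is worth a second look: it starts at 1, so a row with 9 non-blank fields ends at 10 and never equals num_headers. Below is a minimal sketch of one way to express the two row criteria without a counter. The names is_valid_row and filter_rows are hypothetical helpers, not part of the original script, and the sketch is written for Python 3 (the original uses the Python 2 print statement), so treat it as an illustration under those assumptions rather than a drop-in replacement.

```python
import csv
import io

NUM_HEADERS = 9  # same value as num_headers in the question


def is_valid_row(row, expected_columns=NUM_HEADERS):
    """Hypothetical helper: True if the row has exactly the expected
    number of columns and every field contains non-whitespace text."""
    return len(row) == expected_columns and all(field.strip() for field in row)


def filter_rows(lines, expected_columns=NUM_HEADERS):
    """Hypothetical helper: yield (line_number, row, is_valid) per CSV row.

    The '&amp;' entity is replaced with '&' before parsing, as in the
    question's url_escaper generator.
    """
    cleaned = (line.replace('&amp;', '&') for line in lines)
    for i, row in enumerate(csv.reader(cleaned), 1):
        yield i, row, is_valid_row(row, expected_columns)


if __name__ == "__main__":
    # Small in-memory sample with 3 columns instead of 9, for illustration only.
    sample = io.StringIO("a,b,c\nd,,f\ng,h\nx &amp; y,z,w\n")
    for i, row, ok in filter_rows(sample, expected_columns=3):
        if ok:
            print(row)
        else:
            print("line %d is malformed" % i)
```

In the real script, the valid rows would go to csv_out.writerow(row) instead of being printed, and the input would be the opened file rather than the in-memory sample.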