python csv with-statement readlines writelines

Function failing to update spacing after comma


I have a csv file that has inconsistent spacing after the comma, like this:

534323, 93495443,34234234, 3523423423, 2342342,236555, 6564354344

I have written a function that tries to read in the file and makes the spacing consistent, but it doesn't appear to update anything. After opening the new file created, there is no difference from the original. The function I've written is:

def ensure_consistent_spacing_in_csv(dirpath, original_name, new_name):
    with open(dirpath + original_name, "r") as f:
        data = f.readlines()
    for item in data:
        if "," in data:
            comma_index = item.index(",")
            if item[comma_index + 1] != " ":
                item = item.replace(",", ", ")
    with open(dirpath + new_name, "w") as f:
        f.writelines(data)

Where am I going wrong?

I have looked at the answer to the question here, but I cannot use that method because I need the delimiter to be ", ", which is two characters and hence not allowed. I also tried the sed answer to the question here, run through a subprocess call, but that failed too. I don't know bash well, so I'm hesitant to go that route and would prefer a pure Python method.

Thank you!


Solution

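  • The original code has a few bugs:

    1. if "," in data tests whether the list data contains an element equal to ",". Every element is a whole line, so the test is always False and the body of the if never runs. The check should be on item.
    2. Even when the body does run, item = item.replace(",", ", ") only rebinds the loop variable. Strings are immutable and nothing is written back into data, so writelines(data) writes the original lines.
    3. item.replace(",", ", ") adds a space after every comma, including commas that already have one, so the spacing would still be inconsistent.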
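
    Two of these bugs can be reproduced in isolation. The sketch below uses two shortened sample lines in place of the file contents. Every variable name other than data and item is made up for illustration.

```python
# Two lines as f.readlines() would return them (shortened sample values).
data = ["534323, 93495443,34234234\n", "2342342,236555, 6564354344\n"]

# Bug 1: `"," in data` asks whether the list holds an element equal to ",".
comma_is_an_element = "," in data      # False, so the if-body never runs
comma_in_first_line = "," in data[0]   # True: the test that was intended

# Bug 2: assigning to the loop variable rebinds a name; the list is untouched.
for item in data:
    item = item.replace(",", ", ")
unchanged = data == ["534323, 93495443,34234234\n", "2342342,236555, 6564354344\n"]

print(comma_is_an_element, comma_in_first_line, unchanged)
```

    Because the loop never changes data, the second with block writes out exactly what was read.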

    A simple solution that needs no regular expressions, no sed, and no character-by-character indexing is:

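    # Open both files once; "w" avoids appending duplicates if this is re-run.
    with open(dirpath + orig_filename, "r") as f, \
            open(dirpath + cleaned_filename, "w") as cleaned_data:
        for line in f:
            # Drop every space, then put exactly one space after each comma.
            new_line = line.replace(" ", "").replace(",", ", ")
            cleaned_data.write(new_line)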
    

    What this does:

    1. for line in f reads the file one line at a time.
    2. line.replace(" ", "").replace(",", ", ") first removes every space from the line (thanks to @megakarg for the suggestion), then puts a single space after each comma to meet the spec. Note that this also removes any spaces inside a field, which is fine for purely numeric data like the example.
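
    If some fields may contain spaces of their own, a variant is to split on commas and strip each field instead of deleting every space. The following is a sketch of that variant, not the code from the answer above. It assumes no quoted fields with embedded commas, and it joins paths with os.path.join rather than string concatenation. The helper normalize_line and the file names in.csv and out.csv are hypothetical.

```python
import os
import tempfile


def normalize_line(line):
    """Return the line with exactly one space after each comma."""
    body = line.rstrip("\n")
    # Split on commas and strip each field, so spaces inside a field survive.
    fields = [field.strip() for field in body.split(",")]
    newline = "\n" if line.endswith("\n") else ""
    return ", ".join(fields) + newline


def ensure_consistent_spacing_in_csv(dirpath, original_name, new_name):
    # os.path.join avoids depending on a trailing slash in dirpath.
    with open(os.path.join(dirpath, original_name), "r") as src, \
            open(os.path.join(dirpath, new_name), "w") as dst:
        for line in src:
            dst.write(normalize_line(line))


# Usage with a throwaway directory and the sample row from the question.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "in.csv"), "w") as f:
        f.write("534323, 93495443,34234234, 3523423423, 2342342,236555, 6564354344\n")
    ensure_consistent_spacing_in_csv(tmp, "in.csv", "out.csv")
    with open(os.path.join(tmp, "out.csv")) as f:
        result = f.read()

print(result)
```

    Writing to a new file opened once in "w" mode means re-running the function overwrites the previous output instead of appending to it.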