How to cleanup some content from the text file

I have the following data in a CSV.

"ID","OTHER_FIELDS_2"
"87","25 R160  22  13  E"
"87","25 R165  22  08  E"
"77",""
"18","20 BA-10  12  06  2  30  S"
"18","20 BA-20  12  06  2  30  S"
"88","20 TH-42  02  02  5  30  MT"
"66","20 AD-38  12  06  B"
"66","20 AD-38  30  07  B"
"70","20 OL-45  19  11  B"
"70","20 EM-45  19  08  B"

After running my Python code, I got the following output:

18,"
","20 BA-10  12  06  2  30  S
20 BA-20  12  06  2  30  S","
",**********
66,"
","20 AD-38  12  06  B
20 AD-38  30  07  B","
",**********
70,"
","20 OL-45  19  11  B
20 EM-45  19  08  B","
",**********
77,"
",,"
",**********
87,"
","25 R160  22  13  E
25 R165  22  08  E","
",**********
88,"
",20 TH-42  02  02  5  30  MT,"
",**********

But I have to generate the following output in txt format:

18
20 BA-10  12  06  2  30  S
20 BA-20  12  06  2  30  S
**********
66
20 AD-38  12  06  B
20 AD-38  30  07  B
**********
70
20 OL-45  19  11  B
20 EM-45  19  08  B
**********
77
**********
87
25 R160  22  13  E
25 R165  22  08  E
**********
88
20 TH-42  02  02  5  30  MT
**********

Here is my code:

import pandas as pd
import csv

df = pd.read_csv('idDetails.csv')
data_rows = [] 

group_column = 'ID'
selected_columns = ['OTHER_FIELDS_2'] 
grouped_data = df.groupby(group_column)[selected_columns]

for group_name, group_df in grouped_data:
    #print(f"{group_name}")
    other_data = group_df.to_string(header=False,index=False)
    other_data_a = group_df.fillna('').dropna(axis = 0, how = 'all') 
    other_data_b = other_data_a.to_string(header=False,index=False)
    #print(other_data_b) 
    other_data_c = '**********'
    #print(other_data_c)
    data_rows.append([group_name, '\n', other_data_b, '\n', other_data_c])  
dfo = pd.DataFrame(data_rows)
dfo.to_csv('idDetailsoutput.txt', header=False, index=False)

A possible solution to get the desired output is appreciated.

Solution

The problem with your current code is that you're putting each group into a row of a DataFrame and then saving it with to_csv(). That adds extra quotes and weird line breaks in your output. Instead, you should just write everything directly to a text file using normal file writing — no need to use pandas.to_csv for this.


import pandas as pd

# Read the CSV
df = pd.read_csv('idDetails.csv')

# Group by ID
grouped = df.groupby('ID')['OTHER_FIELDS_2']

# Open the file in write mode
with open('idDetailsoutput.txt', 'w') as f:
    for group_id, items in grouped:
        f.write(f"{group_id}\n")  # Write the ID
        for item in items:
            if pd.notna(item) and item.strip() != "":
                f.write(f"{item.strip()}\n")  # Write each non-empty value
        f.write("**********\n")  # Separator line

Output:

18
20 BA-10  12  06  2  30  S
20 BA-20  12  06  2  30  S
**********
66
20 AD-38  12  06  B
20 AD-38  30  07  B
**********
70
20 OL-45  19  11  B
20 EM-45  19  08  B
**********
77
**********
87
25 R160  22  13  E
25 R165  22  08  E
**********
88
20 TH-42  02  02  5  30  MT
**********