python, text, python-3.11

How to add a new line based on a condition in a text file using Python?


I have some text in an input file.

I need to start a new line in the text file every time I find the string 'NODATACODE' and write it to an Output file.

[Image: input file contents]

The desired output is below, with a new line added every time 'NODATACODE' is found:

[Image: desired output]

I tried the following code to perform the task:

    with open('InputFile.txt', 'r') as file:
        data = file.read()

    nodata_index = data.find('NODATACODE')

    if nodata_index != -1:
        data_to_write = data[nodata_index:]
        # Code to add a new line
        data_to_write = str(data_to_write.split('\n'))

        with open('Output.txt', 'w') as file:
            file.write(data_to_write)
    else:
        print("'NODATACODE' not found in the file.")

I don't get any error messages, but the output is wrong. The incorrect output is below.

[Image: incorrect output]

Please let me know what I need to amend in my code.

Thanks a lot in advance.


Solution

  • The problem lies in how data_to_write is handled. data_to_write.split('\n') returns a list of strings, split only at newlines that already exist, so it never breaks the text at 'NODATACODE'. str(...) then converts that list to its printed representation, complete with square brackets, quotes, and commas, instead of joining the pieces back into plain text.
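
    To see why the output contained brackets and commas, here is a minimal sketch that uses a hypothetical in-memory string in place of the file contents (the sample text is an assumption, not the data from the question). It contrasts str() on a list with '\n'.join():

```python
# Hypothetical sample text standing in for the file contents (an assumption).
sample = "NODATACODE 1 2 3\nNODATACODE 4 5 6"

# split('\n') returns a list of the pieces between existing newlines.
parts = sample.split("\n")

# str() on a list gives its printed representation: brackets, quotes, commas.
as_repr = str(parts)

# '\n'.join() glues the pieces back together with real newlines.
as_text = "\n".join(parts)

print(as_repr)
print(as_text)
```

    The first print shows the list representation that ended up in Output.txt; the second shows ordinary text with one piece per line.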

    The idea is to find every occurrence of 'NODATACODE' that is not at the start of the text and replace it with a newline followed by 'NODATACODE'. A regular expression with a negative lookbehind, (?<!^), does this: it matches 'NODATACODE' everywhere except at the very start of the string. Here's how you can modify your code:

    import re
    
    # Read the file content first
    with open('InputFile.txt', 'r') as file:
        data = file.read()
    
    # Find the index of the first occurrence of 'NODATACODE'
    nodata_index = data.find('NODATACODE')
    
    # Check if 'NODATACODE' is found
    if nodata_index != -1:
        # Keep the text from the first 'NODATACODE' to the end
        data_to_write = data[nodata_index:]
        # Insert a newline before every 'NODATACODE' except the one at the start
        data_to_write = re.sub(r'(?<!^)NODATACODE', r'\nNODATACODE', data_to_write)
        # Write the modified data to the output file
        with open('Output.txt', 'w') as file:
            file.write(data_to_write)
    else:
        print("'NODATACODE' not found in the file.")
    

    In this script: