I have a text file with blank lines between the lines of text:
title
header
topic one two three
hello harry
I want to remove only the single blank lines between the lines of text, to get:
title
header
topic one two three
hello harry
How can I do this using Python?
data = open('data.txt').read().replace('\n', '')
The above removes all the newlines.
Use a regular expression to match all instances of exactly \n\n and replace them with a single \n. You must match \n\n because each line in your example input file ends in \n (so a blank line between paragraphs is \n\n).
import re
data = open('data.txt').read()
pattern = r'(?<!\n)\n\n(?!\n)'
# re.sub returns a new string, so keep the result
data = re.sub(pattern, '\n', data)
The first part, (?<!\n), checks that the preceding character is not a newline.
The middle, \n\n, checks for a double newline.
The end part, (?!\n), checks that the following character is not a newline.
So this regex solution is general: it matches every instance of exactly \n\n without touching a lone \n or longer runs such as \n\n\n.
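To see the lookarounds at work, here is a small sketch run on an in-memory string instead of data.txt. The sample text is made up for illustration: it has single blank lines between most lines, plus one run of two blank lines that the pattern should leave alone.

```python
import re

# Hypothetical sample (not the asker's actual file): single blank lines
# between most lines, and two blank lines before "hello harry".
sample = "title\n\nheader\n\ntopic one two three\n\n\nhello harry\n"

# Exactly two newlines, with no newline immediately before or after.
pattern = r'(?<!\n)\n\n(?!\n)'

result = re.sub(pattern, '\n', sample)
print(repr(result))
# 'title\nheader\ntopic one two three\n\n\nhello harry\n'
```

The single blank lines collapse, while the \n\n\n run is untouched: the lookahead rejects its first two newlines and the lookbehind rejects its last two.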