javascript python regex gedit

Python script to match actual start of line ignoring tabs and spaces


I think my question is fairly self-explanatory, but here is an example for clarity.

I have the following fully working script that comments/uncomments lines in a JavaScript file opened in the gedit editor.

#! /usr/bin/env python
import sys
import StringIO
block = sys.stdin.read()
block = StringIO.StringIO(block)
msg = ''
for line in block:
    if "//~" in line:
        line = line.replace('//~','')
        msg = "All lines in selection uncommented"
    else:
        line = "//~" + line
        msg = "All lines in selection commented"
    sys.stdout.write(line)
exit(msg)

Now I want to put //~ in front of the actual start of the line, i.e. before the first non-whitespace character rather than before the leading spaces or tabs.

If I do this with the re module as below, it adds //~ twice: once at the very start of the line and once at the actual start of the line.

#! /usr/bin/env python
import sys
import StringIO
import re
block = sys.stdin.read()
block = StringIO.StringIO(block)
msg = ''
for line in block:
    if "//~" in line:
        line = re.sub(r"(\s*)(\S.*)", r"\1//~\2", line)
        line = line.replace('//~','')
        msg = "All lines in selection uncommented"
    else:
        line = re.sub(r"(\s*)(\S.*)", r"\1//~\2", line)
        line = "//~" + line
        msg = "All lines in selection commented"
    sys.stdout.write(line)
exit(msg)

How can I do that in Python, with or without regex?


Solution

  • You can use a regex replacement to do this. For example, this line of code should do what you want:

    line = re.sub(r"^(\s*)(\S.*)", r"\1//~\2", line)
    

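    This regex matches zero or more whitespace characters at the start of the line [(\s*)], then the rest of the line beginning at the first non-whitespace character [(\S.*)]. The replacement writes back the first capturing group [\1] (the indentation), then the comment marker [//~], then the rest of the line [\2]. In your script, remove the separate line = "//~" + line statement, since that is what adds the second marker at the very start of the line. The re.sub call in the uncomment branch is not needed either; the replace('//~','') there is enough.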
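
As a rough sketch of how the pieces might fit together, the per-line logic can be pulled into a small function. The names MARKER, toggle_comment and toggle_comment_plain are made up for this illustration and are not part of the original script. The second function shows one possible way to do the same thing without regex, using str.lstrip.

```python
import re

MARKER = "//~"  # comment marker taken from the question


def toggle_comment(line):
    """Regex version: uncomment if the marker is present, else comment after the indentation."""
    if MARKER in line:
        # Same uncomment step as in the question's script.
        return line.replace(MARKER, "")
    # \1 is the leading whitespace, \2 is the line from its first non-blank character.
    # Blank lines do not match the pattern and are returned unchanged.
    return re.sub(r"^(\s*)(\S.*)", r"\1" + MARKER + r"\2", line)


def toggle_comment_plain(line):
    """Non-regex version (hypothetical alternative) built on str.lstrip."""
    if MARKER in line:
        return line.replace(MARKER, "")
    stripped = line.lstrip(" \t")
    if not stripped.strip():
        # Nothing but whitespace on this line, so leave it alone.
        return line
    indent = line[:len(line) - len(stripped)]
    return indent + MARKER + stripped
```

In the gedit script, the body of the for loop would then presumably reduce to sys.stdout.write(toggle_comment(line)), with msg set according to which branch was taken.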