python regex console-application vt100

Regex to replace a console code with whitespace


I'm writing some Python tests for a console application that uses console codes, and I'm having trouble gracefully handling the ESC [ H (cursor position) sequence.

I have the input string s = r'\x1b[12;5H\nSomething', and I'd like to turn it into Something indented by four spaces. I'm trying to use the following regex:

    re.sub(r'\x1b\[([0-9,A-Z]{1,2};([0-9]{1,2})H)', r'\2', s)

This of course produces 5Something.

What I want is something to the effect of

    re.sub(r'\x1b\[([0-9,A-Z]{1,2};([0-9]{1,2})H)', ' '*(int(r'\2')-1), s)

That is, I want to insert one fewer space than the number in the second capture group.

I'd also be very happy if there were a way to simply render into a string what I see when I call print(s):

    Something

I'm using Python 3.

Thanks a lot!!


Solution

  • Use a function as the replacement argument of re.sub:

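    import re
    # Raw string: s holds the literal characters \x1b and \n, not a real ESC byte or newline
    s = r'\x1b[12;5H\nSomething'
    # Match the literal text \x1b[<row>;<col>H\n and capture the column number
    pattern = r'\\x1b\[[0-9A-Z]{1,2};([0-9]{1,2})H\\n'
    # The replacement function returns (column - 1) spaces for each match
    print(re.sub(pattern, lambda x: ' '*(int(x.group(1))-1), s))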
    

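The attempt in the question fails because ' '*(int(r'\2')-1) is evaluated before re.sub ever runs, so int() receives the two-character text \2 and raises ValueError. Passing a function instead defers the arithmetic until each match is found. A minimal sketch of the same idea with a named function in place of the lambda (pad_to_column is a name chosen here for illustration, not part of the re module):

```python
import re

# Same pattern as above: it matches the literal text \x1b[<row>;<col>H\n
# as it appears in a raw string (backslash characters, not a real ESC byte).
PATTERN = re.compile(r'\\x1b\[[0-9A-Z]{1,2};([0-9]{1,2})H\\n')


def pad_to_column(match):
    """Return the spaces that stand in for one matched cursor-position code."""
    column = int(match.group(1))  # group(1) is the column text, e.g. '5'
    return ' ' * (column - 1)     # column 5 means four spaces before the text


# The question's attempt computes the replacement up front,
# so int() sees the text \2 rather than a number:
try:
    int(r'\2')
except ValueError:
    print('int() cannot parse a backreference')

s = r'\x1b[12;5H\nSomething'
result = PATTERN.sub(pad_to_column, s)  # pad_to_column is called once per match
print(repr(result))
```

re.sub calls the function with a match object for every non-overlapping match and inserts whatever string it returns, which is what makes the per-match arithmetic possible.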

    EXPLANATION

    --------------------------------------------------------------------------------
      \\                       '\'
    --------------------------------------------------------------------------------
      x1b                      'x1b'
    --------------------------------------------------------------------------------
      \[                       '['
    --------------------------------------------------------------------------------
      [0-9A-Z]{1,2}            any character of: '0' to '9', 'A' to 'Z'
                               (between 1 and 2 times (matching the most
                               amount possible))
    --------------------------------------------------------------------------------
      ;                        ';'
    --------------------------------------------------------------------------------
      (                        group and capture to \1:
    --------------------------------------------------------------------------------
        [0-9]{1,2}               any character of: '0' to '9' (between 1
                                 and 2 times (matching the most amount
                                 possible))
    --------------------------------------------------------------------------------
      )                        end of \1
    --------------------------------------------------------------------------------
      H                        'H'
    --------------------------------------------------------------------------------
      \\                       '\'
    --------------------------------------------------------------------------------
      n                        'n'
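
For the second part of the question (getting into a string what print(s) shows), one option is to replay the output onto an in-memory grid. The following is only a rough sketch under several assumptions: the input contains a real ESC character (a normal string literal, not the raw string used above), only the ESC [ row ; col H code is interpreted, a newline moves down one row without resetting the column unless newline_resets_column is set (real terminals differ depending on their mode), and there is no scrolling. The names render, CURSOR_POS and newline_resets_column are made up for this example.

```python
import re

# Assumption: only the "ESC [ row ; col H" cursor-position code is handled.
# Colours, erases and every other sequence are out of scope for this sketch.
CURSOR_POS = re.compile(r'\x1b\[(\d+);(\d+)H')


def render(text, rows=24, cols=80, newline_resets_column=False):
    """Replay text onto a blank rows x cols grid and return its lines."""
    grid = [[' '] * cols for _ in range(rows)]
    row = col = pos = 0
    while pos < len(text):
        m = CURSOR_POS.match(text, pos)
        if m:
            # Terminal coordinates are 1-based; clamp them to the grid.
            row = min(max(int(m.group(1)), 1), rows) - 1
            col = min(max(int(m.group(2)), 1), cols) - 1
            pos = m.end()
            continue
        ch = text[pos]
        pos += 1
        if ch == '\n':
            row = min(row + 1, rows - 1)  # stay on the last row, no scrolling
            if newline_resets_column:
                col = 0
        elif ch == '\r':
            col = 0
        elif col < cols:
            grid[row][col] = ch           # characters past the right edge are dropped
            col += 1
    # Trailing blanks are stripped so each line compares easily in a test.
    return [''.join(line).rstrip() for line in grid]


screen = render('\x1b[12;5H\nSomething')  # note: not a raw string
print(repr(screen[12]))                   # row 13 on screen, index 12 in the list
```

With the defaults, the text ends up on the row below the cursor position, starting at column 5, so screen[12] is Something preceded by four spaces. For anything beyond this single code, a dedicated terminal-emulation library is likely a better fit than extending the sketch.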