pythonexcelexcel-2016cell-formatting

How to remove numbers from text strings in excel cells


I have an Excel workbook with header cells such as the following:

  1. birthday 645 (before 1995)
  2. birthday (after 1995) 99

...

...

I want to remove every number in those cells except for the ones that are needed to be there. In the example above, except for 1995, any other numbers in those cells should be removed.

Is there a way that I can do such a thing? such as defining a function in Python?


Solution

  • Per the comments above, assuming well defined parentheses, you can use this function to remove all numbers that are not nested in parentheses.

    def remove_nums(string):
        depth = 0
        res = []
        for char in string:
            if char.isdigit() and depth == 0:
                continue
            elif char == '(':
                depth += 1
            elif char == ')':
                depth -= 1
            res.append(char)
            
        string = ''.join(res)
            
        # remove double spaces
        while '  ' in string:
            string = string.replace('  ', ' ')
            
        return string
    
    if __name__ == "__main__":
        strings = ["birthday 645 (before 1995)", "birthday (after 1995) 99"]
        for string in strings:
            print(string+':',remove_nums(string))
    
    In:  "birthday 645 (before 1995)"
    Out: "birthday (before 1995)"
    In:  "birthday (after 1995) 99"
    Out: "birthday (after 1995)"