python tableofcontents

How to apply hierarchical numbering to indented titles?


I have a table of contents that uses indentation (leading hyphens) to track the hierarchy, like:

- title1
-- title1-1
-- title1-2
--- title1-2-1
--- title1-2-2
- title2
-- title2-1
-- title2-2
- title3
- title4

I want to convert them to a numbered format like:

1 title1
1.1 title1-1
1.2 title1-2
1.2.1 title1-2-1
1.2.2 title1-2-2
2 title2
2.1 title2-1
2.2 title2-2
3 title3
4 title4

This is just an example: each "title-*" string could be any heading text, and the indentation depth could be greater than shown here.

This comes from my real work, where I collect headings (some of them typed by hand) in a Word document and reformat these candidate headings from beginning to end, aiming to correct any wrong order and indentation.

I have tried this myself; most headings were transformed into the desired format, but some were not. How should this be done?


Solution

  • You could pass a replacer callback to re.sub to implement the logic. In that callback, use a stack (maintained across the successive replacements) to track the chapter numbers of the enclosing levels.

    Code:

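    import re

    def add_numbers(s):
        # stack[i] holds the current chapter number at depth i
        stack = [0]

        def replacer(match):
            # Depth is the number of leading hyphens minus one
            indent = len(match.group(0)) - 1
            # Moving back up discards the counters of deeper levels
            del stack[indent+1:]
            # First heading at a new depth (assumes depth grows by at most one level per line)
            if indent >= len(stack):
                stack.append(0)
            stack[indent] += 1
            return ".".join(map(str, stack))

        # re.M makes ^ match at the start of every line
        return re.sub(r"^-+", replacer, s, flags=re.M)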
    

    Here is how you would call it on your example:

    message_string = """- title1
    -- title1-1
    -- title1-2
    --- title1-2-1
    --- title1-2-2
    - title2
    -- title2-1
    -- title2-2
    - title3
    - title4"""
    
    res = add_numbers(message_string)
    print(res)
    

    This prints:

    1 title1
    1.1 title1-1
    1.2 title1-2
    1.2.1 title1-2-1
    1.2.2 title1-2-2
    2 title2
    2.1 title2-1
    2.2 title2-2
    3 title3
    4 title4
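
Since the question mentions hand-typed headings with possibly wrong indentation, here is a minimal sketch of a more lenient variant. It is not part of the answer above: the name `add_numbers_lenient` is hypothetical, and it assumes that a heading which jumps more than one level deeper than its predecessor (for example `-` followed directly by `---`) should be treated as exactly one level deeper. Whether that is the right correction depends on your documents.

```python
import re

def add_numbers_lenient(text):
    """Number '-'-prefixed headings, tolerating skipped levels."""
    counters = []  # counters[i] is the current number at depth i

    def replacer(match):
        # Requested depth, clamped to at most one level deeper than the
        # previous heading (assumption: a bigger jump is a typing mistake).
        depth = min(len(match.group(0)) - 1, len(counters))
        del counters[depth + 1:]   # moving back up resets deeper levels
        if depth == len(counters):
            counters.append(0)     # first heading at this depth
        counters[depth] += 1
        return ".".join(map(str, counters))

    # re.M makes ^ match at the start of every line
    return re.sub(r"^-+", replacer, text, flags=re.M)

print(add_numbers_lenient("- a\n--- b\n-- c\n- d"))
```

For well-formed input this should give the same numbering as `add_numbers`; the difference only shows when a level is skipped, where the unclamped version would index past the end of its stack.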