regex python-3.x spintax

Python 3 - Spinning text


I'm stuck trying to do something like this.

from this

{Hi|Hello} I am - {Me|You|Us}

to this

#Possible results
'Hi I am - You'
'Hello I am - Me'
'Hi I am - Us'
'Hello I am - You'

So basically, the code should search for groups of words enclosed in curly braces {}. Each group contains several words separated by |, which should be split apart. Each pair of curly braces should then output only one word, chosen at random.

Do I need regex for this? I searched for ready-made libraries, but I only found an outdated one. Can anyone help, please?


Solution

  • If your input is relatively simple, meaning the only occurrences of { and } are there to delimit alternative text fragments as shown in the question, you could use a regex like the following:

    import re
    
    # Raw string avoids invalid-escape warnings; the capturing group makes
    # re.split keep each {...} group in the result.
    p = re.compile(r'(\{[^{}]+\})')
    

    Then you'd split the text into fragments like so:

    frags = p.split("{Foo|Bar} baz {quux|wibble}.")
    # ['', '{Foo|Bar}', ' baz ', '{quux|wibble}', '.']
    

    For each string in this list, you can generate a list of possible values (only one for the strings not starting with {):

    def options(s):
        # A {a|b|c} fragment yields its alternatives; plain text yields itself.
        if s.startswith('{'):
            return s[1:-1].split('|')
        return [s]
    
    options("foo")
    # ["foo"]
    
    options("{foo|bar}")
    # ["foo", "bar"]
    

    Then build a list of lists of options:

    opt_lists = [options(frag) for frag in frags]
    

    Then build the Cartesian product and join:

    import itertools
    
    for spec in itertools.product(*opt_lists):
        print(''.join(spec))
    

    Here's the output for the "{Foo|Bar} baz {quux|wibble}." example:

    Foo baz quux.
    Foo baz wibble.
    Bar baz quux.
    Bar baz wibble.
    

    If there are additional complexities in your inputs, you might need to use more complex regular expressions or a parser for the actual input format, but the general idea of producing a list of lists of options as an intermediate result remains valid.
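
Since the question asks for one randomly chosen result rather than every combination, here is a minimal sketch of both, assuming the simple syntax described above. The names `GROUP`, `spin`, `all_variants`, and the `rng` parameter are made up for illustration. `spin` swaps the split-and-product approach for `re.sub` with a callback and `random.choice`, and it loops so that nested groups are resolved from the inside out.

```python
import itertools
import random
import re

# Matches one innermost {a|b|c} group (no braces allowed inside it).
GROUP = re.compile(r'\{([^{}]*)\}')


def spin(text, rng=random):
    """Return one random variant of `text` (hypothetical helper)."""
    def pick(match):
        # Choose one alternative from the matched group.
        return rng.choice(match.group(1).split('|'))

    # Repeat until no groups remain, so nested input such as
    # "{a|{b|c}}" is resolved innermost-first.
    while GROUP.search(text):
        text = GROUP.sub(pick, text)
    return text


def all_variants(text):
    """Yield every variant of flat (non-nested) input, in product order."""
    frags = re.split(r'(\{[^{}]+\})', text)
    opt_lists = [f[1:-1].split('|') if f.startswith('{') else [f]
                 for f in frags]
    for combo in itertools.product(*opt_lists):
        yield ''.join(combo)


print(spin("{Hi|Hello} I am - {Me|You|Us}"))  # one random variant
print(list(all_variants("{Foo|Bar} baz {quux|wibble}.")))
```

Passing a seeded `random.Random` instance as `rng` makes the output reproducible, which can help in tests. With nested groups, the choices are not uniformly distributed over all possible final strings, because each inner group is resolved before its enclosing group is considered.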