textx

How to parse keywords and strings from a line of text


I have a file keywords.tx with:

Commands:
    keywords = 'this' & 'way'
;
StartWords:
    keywords = 'bag'
;

Then I have a file mygram.tx with:

import keywords

MyModel:
    keyword*=StartWords[' ']
    name+=Word[' ']
;
Word:
    text=STRING
;


My data file has one line: "bag hello soda this way". I would like the result to have the attributes keyword='bag', name='hello soda' and command='this way'.

I am not sure how to get the grammar to handle the pattern "keywords words keywords" while making sure that the second keywords are not included in the words. Another way to express this is "startwords words commands".


Solution

  • If I understood your goal correctly, you can do something like this:

    from textx import metamodel_from_str
    
    mm = metamodel_from_str('''
    File:
        lines+=Line;
    
    Line:
        start=StartWord
        words+=Word
        command=Command;
    
    StartWord:
        'bag' | 'something';
    
    Command:
        'this way' | 'that way';
    
    Word:
        !Command ID;
    ''')
    
    # Sample input, one record per line.
    text = '''
    bag hello soda this way
    bag hello soda that way
    something hello this foo this way
    '''
    
    model = mm.model_from_str(text)
    
    assert len(model.lines) == 3
    # Check the second line: start word, plain words, trailing command.
    line = model.lines[1]
    assert line.start == 'bag'
    assert line.words == ['hello', 'soda']
    assert line.command == 'that way'
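
The Word rule above relies on a negative lookahead (!Command) so that a word is never taken from the start of a command. As a rough analogy only, and not how textX works internally, the same idea can be sketched with Python's re module. The names START_WORDS, COMMANDS and parse_line are hypothetical and exist only for this illustration:

```python
import re

# Hypothetical stand-ins for the StartWord and Command rules of the grammar.
START_WORDS = ("bag", "something")
COMMANDS = ("this way", "that way")

start_alt = "|".join(re.escape(s) for s in START_WORDS)
command_alt = "|".join(re.escape(c) for c in COMMANDS)

# A word is an identifier that does not begin a command,
# mirroring `!Command ID` with a regex negative lookahead.
word = rf"(?!(?:{command_alt}))[^\W\d]\w*"

line_re = re.compile(
    rf"\s*(?P<start>{start_alt})"       # start word
    rf"(?P<words>(?:\s+{word})+)"       # one or more plain words
    rf"\s+(?P<command>{command_alt})\s*$"  # trailing command
)


def parse_line(line):
    """Split one line into (start, words, command)."""
    m = line_re.match(line)
    if m is None:
        raise ValueError(f"cannot parse: {line!r}")
    return m["start"], m["words"].split(), m["command"]
```

Here "this" in "this foo" is accepted as a word because the lookahead only rejects positions where a full command starts, which is the same reason the third input line parses in the textX version.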
    
    

    There are several things to note. For example, the Command rule can also be split into separate matches:

    Command:
        'this ' 'way' | 'that ' 'way';

    This will match a single space as part of 'this ' and then an arbitrary amount of whitespace before 'way', which will be thrown away.
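
To make the difference between the two Command variants concrete, here is a small sketch using plain regular expressions as an analogy. It assumes textX's default behaviour of skipping whitespace before each match; the names exact, split and accepts are hypothetical and only serve this illustration:

```python
import re

# 'this way' as one literal: exactly one space between the two words.
exact = re.compile(r"this way")

# 'this ' 'way' as two matches: the space after "this" is required,
# and any further whitespace before "way" is skipped.
split = re.compile(r"this \s*way")


def accepts(pattern, text):
    """Return True if the whole text is matched by the pattern."""
    return pattern.fullmatch(text) is not None
```

With these definitions, accepts(exact, "this   way") is False while accepts(split, "this   way") is True, and neither pattern accepts "thisway".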

    There is comprehensive documentation with examples on the textX site, so I suggest taking a look and going through some of the provided examples.