lexical-analysis gold-parser

Defining length of a String/input in Gold Parser Builder


I'm new to the GOLD Parser Engine and am looking for a way to limit the length of a defined string, but I can't find any way to do this. Please help me do that. Here is my code:

    ! Welcome to GOLD Parser Builder
    "Case Sensitive" = 'false'
    "Start Symbol" = <start>
    {String Char} = {Printable}
    <start> ::= <Value>
    !<Value> ::= <> | a <Value> | b <Value>
    <Value> ::= <type> name <equal> number <symbol> | <type> name <symbol>
    <type> ::= int | float | char | double | boolean | String
    name = {Letter}{alphanumeric}+
    <symbol> ::= ';'
    <equal> ::= '='
    number = {Digit}+[.]{Digit}+ | {Digit}+ | {Letter}

Is there any way I could specify a maximum length for a string? Thanks.


Solution

  • It sounds like the parser wasn't designed to constrain lexeme sizes easily with regular expressions. Instead, you should check the String size in the skeleton program generated from your grammar.

    To illustrate, I tried this very trivial grammar example from the official website:

    "Name"         = 'String Terminal Example'
    "Author"       = 'Devin Cook'
    "About"        = 'This is a simple example which defines a basic string'
    
    "Start Symbol" = <Value>
    
    ! The following defines a set of characters for the string. It contains 
    ! all printable characters with the exception of the double-quotes used 
    ! for delimiters. The {Printable} set does not contain newlines.
    
    {String Ch} = {Printable} - ["]
    
    ! This statement defines the string symbol
    
    String     = '"' {String Ch}* '"'
    
    <Value>   ::= String
    

    String appears both as a terminal token (String = '"' {String Ch}* '"') and in a rule (<Value> ::= String). You can check the token size at the terminal level.

    I generated a C# skeleton using the "Calitha Engine - Custom Parser class" template, which gave me a parser. Below is the part where you should check your String terminal token:

    // [...]
    private Object CreateObjectFromTerminal(TerminalToken token)
    {
        switch (token.Symbol.Id)
        {
            // [...]
    
            case (int)SymbolConstants.SYMBOL_STRING :
                //String
                //todo: Create a new object that corresponds to the symbol
                return null;
    
            // [...]
        }
        throw new SymbolException("Unknown symbol");
    }
    

    According to the Calitha Parser Engine documentation, the token's text can be retrieved through TerminalToken.Text. So why not proceed as below:

    case (int)SymbolConstants.SYMBOL_STRING :
        // Check size (MAX_LENGTH could be a constant you defined)
        if (token.Text.Length > MAX_LENGTH)
        {
            // handle error here
            throw new SymbolException("String too long");
        }
        return token.Text;
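
One detail worth checking: since the String terminal is defined as '"' {String Ch}* '"', the token text most likely still includes the two surrounding double quotes, so comparing token.Text.Length directly against the limit would count them too. Below is a minimal, self-contained sketch of the same check that strips the delimiters first. It does not use the Calitha types. CheckStringLexeme and MaxLength are hypothetical names, FormatException stands in for the engine's SymbolException, and the value 10 is an arbitrary limit chosen for illustration.

```csharp
using System;

// Hypothetical limit; the answer above leaves MAX_LENGTH up to you.
const int MaxLength = 10;

// Returns the string content without its delimiters, or throws if it is too long.
// Assumption: the lexeme still carries the surrounding double quotes, as the
// terminal definition '"' {String Ch}* '"' suggests.
static string CheckStringLexeme(string lexeme, int maxLength)
{
    bool quoted = lexeme.Length >= 2
                  && lexeme[0] == '"'
                  && lexeme[lexeme.Length - 1] == '"';
    if (!quoted)
    {
        throw new FormatException("Not a quoted string: " + lexeme);
    }

    // Drop the opening and closing quote before measuring.
    string content = lexeme.Substring(1, lexeme.Length - 2);
    if (content.Length > maxLength)
    {
        throw new FormatException(
            "String too long (" + content.Length + " > " + maxLength + ")");
    }
    return content;
}

Console.WriteLine(CheckStringLexeme("\"hello\"", MaxLength)); // prints: hello
```

In the generated parser, the body of the SYMBOL_STRING case could then call such a helper with token.Text, assuming the text does include the quotes. If it turns out that it does not, the plain token.Text.Length comparison shown above is enough.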