parsingcontext-free-grammarebnfcontext-free-languageebnf-syntactic-exception

Can extended Backus Naur Form (EBNF) describe an unordered set of values?


I'd like to define an unordered set of values using an Extended Backus-Naur Form (EBNF) context free grammar. It's easy to define an unordered list of values in EBNF, for example:

value = 'A' | 'B' | 'C';
list = value, {',', value};

However, I'm having doubts it can be done for an unordered set.

Below are examples of valid unordered sets of values:

A, B, C, D
A, B, D, C
A, D, C, B
...
D, C, B, A

Whilst invalid lists would be:

A, A, C, D
B, C, C, B
A, A, A, A
...

or lists of arbitrary length.

A, A, B, C, D, A
A, B, C, D, A, B, C, D
...

Solution

  • You can define unordered sets in EBNF, but only by listing all possible enumerations. That's impractical for sets larger than about two elements.

    The standard -- to the extent that EBNF is a standardized notation -- allows you to use English (or any other language you feel comfortable with) to describe a sequence which is not otherwise describable. For example, the EBNF for EBNF includes this production:

    syntactic exception
    = ? a syntactic-factor that could be replaced 
        by a syntactic-factor containing no
        meta-identifiers
      ? ;
    

    Similarly, you could write something like:

    value = 'A' | 'B' | 'C';
    list = value, {',', value};
    set = ? a "list" where all "value"s are unique ? ;
    

    or perhaps

    set = ? a "list" with exactly three values
            where all "value"s are unique
          ? ;
    

    That's not much use for building a parser generator, but it might be helpful for a human reader who understands the same language as you do.

    For real parser generators, a common strategy is to allow any list, which might include repeated elements, and then reject invalid lists during semantic analysis. (Or even earlier. It's not a difficult analysis.)