I'd like to define an unordered set of values using an Extended Backus-Naur Form (EBNF) context free grammar. It's easy to define an unordered list of values in EBNF, for example:
value = 'A' | 'B' | 'C';
list = value, {',', value};
However, I'm having doubts it can be done for an unordered set.
Below are examples of valid unordered sets of values:
A, B, C, D
A, B, D, C
A, D, C, B
...
D, C, B, A
Whilst invalid lists would be:
A, A, C, D
B, C, C, B
A, A, A, A
...
or lists of arbitrary length.
A, A, B, C, D, A
A, B, C, D, A, B, C, D
...
You can define unordered sets in EBNF, but only by listing all possible enumerations. That's impractical for sets larger than about two elements.
The standard -- to the extent that EBNF is a standardized notation -- allows you to use English (or any other language you feel comfortable with) to describe a sequence which is not otherwise describable. For example, the EBNF for EBNF includes this production:
syntactic exception
= ? a syntactic-factor that could be replaced
by a syntactic-factor containing no
meta-identifiers
? ;
Similarly, you could write something like:
value = 'A' | 'B' | 'C';
list = value, {',', value};
set = ? a "list" where all "value"s are unique ? ;
or perhaps
set = ? a "list" with exactly three values
where all "value"s are unique
? ;
That's not much use for building a parser generator, but it might be helpful for a human reader who understands the same language as you do.
For real parser generators, a common strategy is to allow any list, which might include repeated elements, and then reject invalid lists during semantic analysis. (Or even earlier. It's not a difficult analysis.)