Is the alternative operator (/
) in Augmented Backus-Naur Form commutative?
For example, is s = a / b
the same as s = b / a
?
I haven't found any primary sources on BNF or ABNF which explicitly specify /
semantics when both sides would yield valid matches. They don't allude to context-free grammars and their allowance for non-determinism either. If anyone knows of clarifying references please share.
EDIT: Tony's answer points out RFC 3501 from 2003 specifies the semantics of ABNF alternation, at least as it's used in that document.
The introduction contrasts BNF and ABNF (with emphasis added here):
Over the years, a modified version of Backus-Naur Form (BNF), called Augmented BNF (ABNF), has been popular among many Internet specifications. It balances compactness and simplicity with reasonable representational power. In the early days of the Arpanet, each specification contained its own definition of ABNF. This included the email specifications, RFC733 and then RFC822 , which came to be the common citations for defining ABNF. The current document separates those definitions to permit selective reference.
The differences between standard BNF and ABNF involve naming rules, repetition, alternatives, order-independence, and value ranges.
"Selective reference" and "order-independence" may relate to alternation ordering semantics, but it's unclear.
Unless I'm missing something, the cited RFCs don't specify /
semantics either. Section 2.2 evades the problem.
2.2. RULE1 / RULE2: ALTERNATIVES
Elements separated by slash ("/") are alternatives. There- fore "foo / bar" will accept foo or bar.
Various rule definitions show they recognize the practical importance of avoiding ambiguity. For example, here's how RFC 822 defines optional-field
and its dependencies:
optional-field =
/ "Message-ID" ":" msg-id
/ "Resent-Message-ID" ":" msg-id
/ "In-Reply-To" ":" *(phrase / msg-id)
/ "References" ":" *(phrase / msg-id)
/ "Keywords" ":" #phrase
/ "Subject" ":" *text
/ "Comments" ":" *text
/ "Encrypted" ":" 1#2word
/ extension-field ; To be defined
/ user-defined-field ; May be pre-empted
extension-field =
<Any field which is defined in a document
published as a formal extension to this
specification; none will have names beginning
with the string "X-">
user-defined-field =
<Any field which has not been defined
in this specification or published as an
extension to this specification; names for
such fields must be unique and may be
pre-empted by published extensions>
BNF comes from IAL notation. The paper introduces ̅o̅r "metalinguistic connective", which is intuitively related to /
. However, it also dodges the ambiguous choice problem and presumably just uses it carefully.
Due to unspecified semantics my suggestion is to treat every possible match in an alternation
rule as valid. When the grammar isn't carefully designed to avoid ambiguity this interpretation can result in multiple valid parse trees for the same input. Addressing ambiguous parses as they occur is safer than forging ahead with an unintentionally valid parse tree.
Alternatively, if you have influence over how the grammar is specified you could consider a notation with clearer semantics. For example, Parsing Expression Grammar: A Recognition-Based Syntactic Foundation (Ford 2004) gives alternatives deterministic prioritized choice semantics (left-most match wins).