parsingprologdsldcgebnf

Difficulties implementing DSL in Prolog from EBNF using DCG


I'm working on implementation of the Google Protobuf compiler for proto files in Prolog for generating Prolog programs. Prolog is SWI-Prolog.

I'm translating EBNF definitions into DCG and ran across a few problems:

  1. I have to deal with [ ... ] and { ... } EBNF construct - meaning optional ( executable zero or one times ) and repeatative( executable any number of times );

  2. I have to insert the callbacks into DCG code to implement the part of compiler functionality (syntax switching/importing/ etc.) using DCG's construct { ... }, which allows goals in Prolog syntax inside DCG rules.

I'm applying for optional and repeatative the meta-predicates: $$rep/1, $$opt/1:

EBNF
decimals  = decimalDigit { decimalDigit }
exponent  = ( "e" | "E" ) [ "+" | "-" ] decimals 

DCG
decimals  --> decimalDigit, '$$rep'( decimalDigit ).
exponent  --> ( "e"; "E" ), '$$opt'( "+"; "-" ), decimals.

'$$rep'( Goal ) :- repeat, call(Goal); !, fail.

'$$opt'( Goal ) :- once(Goal) ; \+ Goal.

"Callback:"
import --> "import", opt(( "weak" ; "public", { record(public)} )), strLit,
{
     import(public, strlit )
}, ";".

Looking awkward (if not to say ugly) for me...

Questions:

What's wrong with my solutions?

Should I manually translate EBNG into DCG without using meta-predicates?

What is the alternative for the awkward penetration into a DCG rule?


Solution

  • From a quick first glance, the main issue is that you are uncleanly intermingling DCGs with regular Prolog predicates.

    Stay within DCGs to define all nonterminals. For example:

    optional(NT) --> [] | NT.
    
    once_or_more(NT) --> NT, or_more(NT).
    
    or_more(NT) --> [] | NT, or_more(NT).
    

    With the following example definition:

    a --> [a].
    

    We can post:

    ?- phrase(optional(a), Ls).
    Ls = [] ;
    Ls = [a].
    
    ?- phrase(once_or_more(a), Ls).
    Ls = [a] ;
    Ls = [a, a] ;
    Ls = [a, a, a] ;
    Ls = [a, a, a, a] ;
    Ls = [a, a, a, a, a] .
    

    This seems to work as you need it.

    For the callback, you can simply pass around the predicate that you need to call, with the general outline:

    parse_with_callback(Goal) -->
            ...,
            { Goal },
            ...
    

    This seems quite OK.

    If such patterns arise frequently, you can always consider generating such DCGs from a different representation that lets you represent the task more cleanly.