parsingprolog

Prefix Parsing using Prolog


So I have this grammar:

  expr(op(T,B,E)) => [term(T), binop(B), expr(E)].
  expr(T) => [term(T)].

  term(N) => [num(N)].
  term(L) => [lvalue(L)].
  term(pre(O,L)) => [incrop(O), lvalue(L)].
  term(post(L,O)) => [lvalue(L), incrop(O)].
  term(E) => ['(', expr(E), ')'].

  lvalue($(E)) => [$, expr(E)].

  incrop(++) => [++].
  incrop(--) => [--].

  binop(+) => [+].
  binop(-) => [-].

  num(0) => [0].
  num(1) => [1].
  num(2) => [2].
  num(3) => [3].
  num(4) => [4].
  num(5) => [5].
  num(6) => [6].
  num(7) => [7].
  num(8) => [8].
  num(9) => [9].

and the goal is to parse the input according to the rules, and separate the remaining suffix. For example,

| ?- parse_prefix(expr(E), [$,1,+,2], S).

E = op($(1),+,2)
S = [] ? ;

E = $(op(1,+,2))
S = [] ? ;

E = $(1)
S = [+,2] ? ;

no

and

| ?- parse_prefix(expr(E), [9], S).

E = 9
S = [] ? ;

no


| ?- parse_prefix(expr(E), [9,+,$,1,+], S).

E = op(9,+,$(1))
S = [+] ? ;

E = 9
S = [+,$,1,+] ? ;

I have written the following predicates:

%Base Case: No Token, No Suffix
parse_prefix(_,[],[]).

%Direct Matching: ex) parse_prefix(num(9),[9],S)
parse_prefix(NT,[Head|Tail],S):-
    NT =>[Head],
    write('two '),
    parse_prefix(expr(E),Tail,S).
%General Matching: ex) parse_prefix(expr(E),_,S)
parse_prefix(NT,[Head|Tail],S):-
    NT => [Head1|Tail1],
    %write(Head1),
    %write('one\n'),
    parse_prefix(Head1,[Head|Tail],S).

and I'm having a lot of confusion with recursion and backtracking..

I will permanently love anyone who can help me this one.

Thank you in advance.


Solution

  • You are already close to a solution. It is good to define your own operator =>/2 to represent your own gammar rules and not get into conflict with -->/2. But I am having problems with the representation of grammar rule bodies. I don't see that you distinguish terminals and non-terminals in the grammar rule bodies.

    One suggestion would be to vote for (A1,...,An) to represent a conjunction in the body, instead of [A1,..,An]. And then use [T] for terminals and NT for non-terminals. So the following rule,

    term(E) => ['(', expr(E), ')'].
    

    would then read:

    term(E) => ['('], expr(E), [')'].
    

    You can then adapt your rules and define a parse_prefix/3 as follows. I show you the terminal and the conjunction and the non-terminal case:

    parse_prefix([T],I,O) :- !, I=[T|O].
    parse_prefix((A,B),I,O) :- !, parse_prefix(A,I,H), parse_prefix(B,H,O).
    parse_prefix(NT,I,O) :- (NT => Body), parse_prefix(Body,I,O).
    

    You can add additional cases for the empty production ([]) and auxiliary conditions ({}), or make it more flexible to be able to work with terminal lists ([T1,..,Tn]). Also further control constructs are possible, but when you try to do a cut (!) things get a little bit nasty when following the meta-interpreter approach.

    Instead of writing a meta-interpreter parse_prefix/3 you could also cook your own term rewriting to finally arrive at a method that would convert the given gammar rules first into ordinary Prolog and then execute them from there.