perl marpa

Marpa parser can't seem to cope with optional first symbol?


I've been getting to grips with the Marpa parser and encountered a problem when the first symbol is optional. Here's an example:

use strict;
use warnings;
use 5.10.0;

use Marpa::R2;
use Data::Dump;

my $grammar = Marpa::R2::Scanless::G->new({source  => \<<'END_OF_GRAMMAR'});
:start ::= Rule
Rule ::= <optional a> 'X'
<optional a> ~ a *
a ~ 'a'
END_OF_GRAMMAR

my $recce = Marpa::R2::Scanless::R->new({grammar => $grammar});
dd $recce->read(\"X");

When I run this, I get the following error:

Error in SLIF parse: No lexemes accepted at line 1, column 1
* String before error:
* The error was at line 1, column 1, and at character 0x0058 'X', ...
* here: X
Marpa::R2 exception at small.pl line 20
 at /usr/local/lib/perl/5.14.2/Marpa/R2.pm line 126
        Marpa::R2::exception('Error in SLIF parse: No lexemes accepted at line 1, column 1\x{a}...') called at /usr/local/lib/perl/5.14.2/Marpa/R2/Scanless.pm line 1545
        Marpa::R2::Scanless::R::read_problem('Marpa::R2::Scanless::R=ARRAY(0x95cbfd0)', 'no lexemes accepted') called at /usr/local/lib/perl/5.14.2/Marpa/R2/Scanless.pm line 1345
        Marpa::R2::Scanless::R::resume('Marpa::R2::Scanless::R=ARRAY(0x95cbfd0)', 0, -1) called at /usr/local/lib/perl/5.14.2/Marpa/R2/Scanless.pm line 926
        Marpa::R2::Scanless::R::read('Marpa::R2::Scanless::R=ARRAY(0x95cbfd0)', 'SCALAR(0x95aeb1c)') called at small.pl line 20

Perl version 5.14.2 (debian wheezy)
Marpa version 2.068000

(I see there's a brand new Marpa 2.069 that I haven't tried yet)

Is this something I'm doing wrong in my grammar?


Solution

  • In Marpa Scanless, your grammar has two levels: the main, high-level grammar (rules written with ::=), where you can attach actions and such, and the low-level lexing grammar (rules written with ~). The two are executed independently, which is expected if you have used a traditional parser/lexer pair but is very confusing when you come to Marpa from regexes.

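    Now, on the low-level grammar, Marpa recognizes your input as a single X, not “zero a's and then an X”: the lexer never produces an empty <optional a> token. However, the high-level grammar requires an <optional a> token to be present before the 'X', so no lexeme is accepted at column 1.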

    The best way around that is to make the a optional in the high-level grammar:

    <optional a> ::= <many a>
    <optional a> ::=  # empty
    
    <many a> ~ a*  # would work the same here with "a+"
    a ~ 'a'
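
    For completeness, here is a minimal sketch of the question's script with those rules dropped in. The helper reads_ok and the sample inputs are my own additions for illustration. It only checks that read() does not throw, and does not evaluate the parse.

```perl
use strict;
use warnings;
use 5.010;

use Marpa::R2;

# Same grammar as in the question, except that the optionality now
# lives in the high-level (::=) rules instead of the lexer (~) rules.
my $grammar = Marpa::R2::Scanless::G->new({ source => \<<'END_OF_GRAMMAR' });
:start ::= Rule
Rule ::= <optional a> 'X'
<optional a> ::= <many a>
<optional a> ::=  # empty
<many a> ~ a*
a ~ 'a'
END_OF_GRAMMAR

# Hypothetical helper: true if read() consumes the input without
# throwing. A fresh recognizer is needed for every input string.
sub reads_ok {
    my ($input) = @_;
    my $recce = Marpa::R2::Scanless::R->new({ grammar => $grammar });
    return defined eval { $recce->read(\$input) };
}

for my $input ('X', 'aaX') {
    say "'$input': ", ( reads_ok($input) ? 'read ok' : 'rejected' );
}
```

    The bare "X" that failed in the question should now be read without the "No lexemes accepted" error, because the empty <optional a> alternative is resolved by the parser rather than expected from the lexer.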