I'm trying to get what seems like a very basic Marpa grammar working. The code I use is below:
use strict;
use warnings;
use Marpa::R2;
use Data::Dumper;
my $grammar = Marpa::R2::Scanless::G->new(
{
source => \(<<'END_OF_SOURCE'),
:start ::= ExprSingle
ExprSingle ~ Expr AndExpr
Expr ~ word
AndExpr ~ word*
word ~ [\w]+
:discard ~ ws
ws ~ [\s]+
END_OF_SOURCE
}
);
my $reader = Marpa::R2::Scanless::R->new(
{
grammar => $grammar,
}
);
my $input = 'foo';
$reader->read(\$input);
my $value = $reader->value;
print Dumper $value;
This prints $VAR1 = \'foo'; so it recognizes one word just fine. But I want it to recognize a string of words:
my $input = 'foo bar';
Now the script prints:
Error in SLIF G1 read: Parse exhausted, but lexemes remain, at position 4
I think this is because ExprSingle uses the ~ (match) operator, which makes it part of the tokenizing level, G0, instead of the structural level, G1; the :discard rule allows space between G1 rules, not G0 ones. So I change the grammar like so:
ExprSingle ::= Expr AndExpr
Now no error is printed, but the resulting value is undef instead of something containing 'foo' and 'bar'. I'm honestly not sure what that means, since the failed parse earlier threw an actual error.
I tried changing the grammar to separate what I think are G0 and G1 rules further, but still no luck:
:start ::= ExprSingle
ExprSingle ::= Expr AndExpr
Expr ::= token
AndExpr ::= token*
token ~ word
word ~ [\w]+
:discard ~ ws
ws ~ [\s]+
The final value is still undef. trace_terminals shows both 'foo' and 'bar' being accepted as tokens. What do I need to do to fix this grammar (by which I mean get a value containing the strings 'foo' and 'bar' instead of just undef)?
Rules by default return a value of undef, so in your case a return of \undef from $reader->value() means your parse succeeded. That is, a return of undef means failure, while a return of \undef means success where the parse evaluated to undef.
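To make the undef versus \undef distinction concrete, here is a minimal sketch that runs the last grammar from the question unchanged and checks the return of value() in two steps. The variable names $value_ref and $status are my own, chosen for illustration.

```perl
use strict;
use warnings;
use Marpa::R2;

# The question's last grammar, unchanged: no action is specified anywhere.
my $grammar = Marpa::R2::Scanless::G->new(
    {   source => \(<<'END_OF_SOURCE'),
:start ::= ExprSingle
ExprSingle ::= Expr AndExpr
Expr ::= token
AndExpr ::= token*
token ~ word
word ~ [\w]+
:discard ~ ws
ws ~ [\s]+
END_OF_SOURCE
    }
);

my $reader = Marpa::R2::Scanless::R->new( { grammar => $grammar } );
my $input = 'foo bar';
$reader->read( \$input );

# value() returns a reference on success and a plain undef on failure,
# so test the reference first and only then look at what it points to.
my $value_ref = $reader->value;
my $status =
      !defined $value_ref    ? 'no parse'
    : !defined ${$value_ref} ? 'parsed, but the value is undef'
    :                          'parsed, with a defined value';
print "$status\n";
```

Dumper prints the successful case as a reference (\undef), which is easy to misread as a plain undef at a glance.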
A good, fast way to start with a more helpful semantics is to add the following line:
:default ::= action => ::array
This causes the parse to generate an AST.
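For reference, here is a sketch of the whole script with that line in place. I put the :default statement first so that it precedes every rule it is meant to cover; that placement is a cautious assumption on my part. The $dump variable is only there for illustration.

```perl
use strict;
use warnings;
use Marpa::R2;
use Data::Dumper;

my $grammar = Marpa::R2::Scanless::G->new(
    {   source => \(<<'END_OF_SOURCE'),
:default ::= action => ::array
:start ::= ExprSingle
ExprSingle ::= Expr AndExpr
Expr ::= token
AndExpr ::= token*
token ~ word
word ~ [\w]+
:discard ~ ws
ws ~ [\s]+
END_OF_SOURCE
    }
);

my $reader = Marpa::R2::Scanless::R->new( { grammar => $grammar } );
my $input = 'foo bar';
$reader->read( \$input );

# A plain undef here would mean the parse itself failed.
my $value_ref = $reader->value;
die "No parse\n" if not defined $value_ref;

# With ::array, each G1 rule evaluates to an array ref of its children,
# so the dump should show nested array refs holding the matched words.
my $dump = Dumper($value_ref);
print $dump;
```

Once the tree shape is clear, the ::array default can be replaced rule by rule with your own action adverbs.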