ruby parslet

How do I get my parser atom to terminate inside a rule including optional spaces?


I can get the atoms to parse individually, but when I chain them using >> the parser doesn’t seem to want to leave the :integer rule.

I get this error:

Extra input after last repetition at line 1 char 2.
`- Expected one of [VALUE, BOOL_OPERATION] at line 1 char 2.
   |- Expected at least 1 of [0-9] at line 1 char 2.
   |  `- Failed to match [0-9] at line 1 char 2.
   `- Failed to match sequence (VALUE BOOL_COMPARISON VALUE) at line 1 char 2.
      `- Expected at least 1 of [0-9] at line 1 char 2.
         `- Failed to match [0-9] at line 1 char 2.

When running the following code:

require 'minitest/autorun'
require 'parslet'
require 'parslet/convenience'

class ExpressionParser < Parslet::Parser
  # Single chars
  rule(:space) { match('\s').repeat(1) }
  rule(:space?) { space.maybe }

  # Values
  rule(:integer) { match('[0-9]').repeat(1).as(:integer) } 
  rule(:value) { integer }

  # Operators
  rule(:equals) { str('=').repeat(1,2).as(:equals) }     
  rule(:bool_comparison) { space? >> equals >> space?}

  # Grammar  
  rule(:bool_operation) { value >> bool_comparison >> value }      
  rule(:subexpression) {(value | bool_operation).repeat(1)}

  root(:subexpression)
end

class TestExpressions < Minitest::Unit::TestCase
  def setup
    @parser = ExpressionParser.new
  end

  def test_equals
    assert @parser.value.parse("1")
    assert @parser.bool_comparison.parse("==")
    assert @parser.parse_with_debug("1 == 1")
  end
end

Solution

  • This is like writing if (consume_value || consume_expression): consuming the value succeeds, so the code never tries to consume the expression.

    Parslet tries the alternatives in the order they are listed. If an alternative consumes some of the input without failing, that counts as a successful match. Because value matched, Parslet has no reason to try bool_operation.

    Your example expression 1 == 1 starts with a valid value, and (value | bool_operation) tells Parslet to try value first, so it tries and succeeds. The repetition then tries again at the space, where neither alternative matches, so it stops after one match. The error generated (Extra input) means "I matched input successfully, but there seems to be stuff left over."

    When the simple case is a prefix of the complex case, you need to list the complex case first. That way the complex one can fail, and you fall back to the simple case.

    Change the rule to rule(:subexpression) {(bool_operation | value).repeat(1)}.
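
    To see why the order matters without involving Parslet at all, here is a minimal sketch in plain Ruby using StringScanner from the standard library. The names consume_value, consume_bool_operation and leftover are hypothetical stand-ins made up for this illustration, and the regexes only approximate the rules in the question. It is not how Parslet is implemented, only the same "first alternative that matches wins" idea.

```ruby
require 'strscan'

# Hypothetical stand-ins for the :value and :bool_operation rules.
# Each returns true and advances the scanner on a match,
# or returns false and leaves the scanner position unchanged.
def consume_value(scanner)
  !scanner.scan(/[0-9]+/).nil?
end

def consume_bool_operation(scanner)
  !scanner.scan(/[0-9]+\s*={1,2}\s*[0-9]+/).nil?
end

# Repeat an ordered choice until no alternative matches,
# then return whatever input was left unconsumed.
def leftover(input, order)
  scanner = StringScanner.new(input)
  loop do
    # any? stops at the first alternative that succeeds, like ||
    break unless order.any? { |name| send(name, scanner) }
  end
  scanner.rest
end

# Simple case first: the value matches "1" and the rest is stranded.
p leftover("1 == 1", [:consume_value, :consume_bool_operation])  # prints " == 1"

# Complex case first: the whole comparison is consumed.
p leftover("1 == 1", [:consume_bool_operation, :consume_value])  # prints ""
```

    The first call leaves " == 1" behind, which corresponds to the "Extra input" error at line 1 char 2. The second call consumes everything, which is what the reordered rule achieves.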