ruby, parsing, parslet

Why does Parslet (in Ruby) return an empty array when parsing an empty string literal?


I'm playing with Parslet. This is a trivial parser that shows some non-obvious behavior.

require 'parslet'

class Parser < Parslet::Parser
  rule(:quote) { str('"') }
  rule(:escape_char) { str('\\') }
  def quoted(term)
    quote >> term >> quote
  end
  rule(:string) {
    quoted( (escape_char >> any | quote.absent? >> any).repeat.as(:string) )
  }
end

It should parse a double-quoted string, and it does. But the following result seems strange to me.

Parser.new.string.parse '""'

This code returns {:string=>[]}. Why is there an empty array rather than an empty string? What am I missing?

I'm using Ruby 2.1.1 and Parslet 1.6.1.


Solution

  • TL;DR: As a rule, Parslet's #as applied to a #repeat captures an array of matches, except in the special case where all the matches are raw strings, in which case it joins them and returns the resulting string.

    In your code, the repeat can't know what types it would capture because there are no matches, so it returns the empty array.

    In the following example, where each character is tagged with as(:char), the empty array seems like the right choice:

    require 'parslet'
    
    class Parser < Parslet::Parser
      rule(:quote) { str('"') }
      rule(:escape_char) { str('\\') }
      def quoted(term)
        quote >> term >> quote
      end
      rule(:string) {
        quoted( (escape_char >> any | quote.absent? >> any).as(:char).repeat.as(:string) )
      }
    end
    
    puts Parser.new.string.parse('""').inspect # => {:string=>[]}
    puts Parser.new.string.parse('"test"').inspect 
        # =>  {:string=>[{:char=>"t"@1}, {:char=>"e"@2}, {:char=>"s"@3}, {:char=>"t"@4}]}
    

    When the child nodes are just strings, Parslet concatenates them into a single string. When there are no elements in the collection, it defaults to an empty collection instead of an empty string.

    #maybe behaves differently.

    From http://kschiess.github.io/parslet/parser.html, section "Repetition and its Special Cases":

    These all map to Parslet::Atoms::Repetition. Please note this little twist to #maybe:

    str('foo').maybe.as(:f).parse('') # => {:f=>nil}
    str('foo').repeat(0,1).as(:f).parse('') # => {:f=>[]}

    The ‘nil’-value of #maybe is nil. This is catering to the intuition that foo.maybe either gives me foo or nothing at all, not an empty array. But have it your way!
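
If you would rather get an empty string for '""', one option is to normalize the parse result afterwards. The sketch below is plain Ruby and does not touch Parslet itself: normalize_string is a hypothetical helper name, and the hashes are stand-ins for what the parser returns (real results hold Parslet::Slice values rather than plain strings).

```ruby
# Hypothetical helper (not part of Parslet): swap the empty array that
# repeat.as(:string) yields for an empty match with an empty string.
def normalize_string(tree)
  value = tree[:string]
  # Only the empty-array case is rewritten; any other value is returned unchanged.
  if value.is_a?(Array) && value.empty?
    tree.merge(string: '')
  else
    tree
  end
end

# Plain hashes standing in for parse results (assumption for illustration).
empty  = normalize_string({ string: [] })
filled = normalize_string({ string: 'test' })

puts empty.inspect
puts filled.inspect
```

This keeps the grammar unchanged and handles the special case in one place. A Parslet::Transform rule might be a more idiomatic home for the same logic, but that is not shown here.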