Tags: ruby, lines, parslet

Ruby parslet: parsing multiple lines


I'm looking for a way to match multiple lines with Parslet. The code looks like this:

rule(:line) { (match('$').absent? >> any).repeat >> match('$') }
rule(:lines) { line.repeat }

However, lines always ends up in an infinite loop, because match('$') keeps matching the end of the string over and over without consuming any input.

Is it possible to match multiple lines that can be empty?

irb(main)> lines.parse($stdin.read)
This
is

a
multiline

string^D

This should match successfully. Am I missing something? I also tried (match('$').absent? >> any.maybe).repeat(1) >> match('$'), but that doesn't match empty lines.

Regards,
Danyel.


Solution

  • I think you have two related problems with your matching:

    It is much safer to use \n as the end-of-line character. I got the following to work (I am somewhat of a beginner with Parslet myself, so apologies if it could be clearer):

    require 'parslet'
    
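    class Lines < Parslet::Parser
      # Any single character except a newline.
      rule(:text) { match("[^\n]") }
      # Either a (possibly empty) line ending in "\n", or a non-empty final line.
      rule(:line) { (text.repeat(0) >> match("\n")) | text.repeat(1) }
      # Tag each line so the individual matches show up in the result.
      rule(:lines) { line.as(:line).repeat }
      root :lines
    end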
    
    s = "This
    is
    
    a
    multiline
    string"
    
    p Lines.new.parse( s )
    

    The rule for the line is complex because of the need to match empty lines and a possible final line without a \n.
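
    To see why both alternatives in the :line rule are needed, here is a plain-Ruby sketch of the same logic using StringScanner from the standard library. It is only an analogue, not Parslet code, and split_lines is a hypothetical helper name. Each alternative consumes at least one character, so the loop always advances. A zero-width anchor such as $ succeeds without moving the scan position, which is why repeating it is risky.

```ruby
require 'strscan'

# Plain-Ruby analogue of the :line rule (an illustration, not Parslet itself).
# split_lines is a hypothetical helper name.
def split_lines(input)
  scanner = StringScanner.new(input)
  lines = []
  until scanner.eos?
    # First alternative: zero or more non-newline characters, then "\n".
    # Second alternative: one or more non-newline characters (a final line
    # that has no trailing newline).
    line = scanner.scan(/[^\n]*\n/) || scanner.scan(/[^\n]+/)
    lines << line
  end
  lines
end

# A zero-width anchor matches without advancing the scan position.
anchor = StringScanner.new("\nrest")
anchor.scan(/$/)        # succeeds with an empty match before the "\n"
anchor_pos = anchor.pos # still at the start of the input

p split_lines("This\nis\n\na\nmultiline\nstring")
p anchor_pos
```

    The empty line comes back as a bare "\n", and the last line is returned without a newline because only the second alternative can match it.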

    You don't have to use the .as(:line) syntax - I just added it to show clearly that the :line rule is matching each line individually, and not simply consuming the whole input.