Given the input:
alpha beta gamma one two three
How could I parse this into the below?
[["alpha"; "beta"; "gamma"]; ["one"; "two"; "three"]]
I can write this when there is a better separator (e.g.__), as then
sepBy (sepBy word (pchar ' ')) (pstring "__")
works, but in the case of double space, the pchar in the first sepBy consumes the first space and then the parser fails.
The FParsec manual says that in sepBy p sep
, if sep
succeds and the subsequent p
fails (without changing the state), the entire sepBy
fails, too. Hence, your goal is:
sepBy
loop closed happily and passed control to the "outer" sepBy
loop.Here's how to do the both:
// this is your word parser; it can be different of course,
// I just made it as simple as possible;
let pWord = many1Satisfy isAsciiLetter
// this is the Inner separator to separate individual words
let pSepInner =
pchar ' '
.>> notFollowedBy (pchar ' ') // guard rule to prevent 2nd space
|> attempt // a wrapper that fails NON-fatally
// this is the Outer separator
let pSepOuter =
pchar ' '
|> many1 // loop over 1+ spaces
// this is the parser that would return String list list
let pMain =
pWord
|> sepBy <| pSepInner // the Inner loop
|> sepBy <| pSepOuter // the Outer loop
Use:
run pMain "alpha beta gamma one two three"
Success: [["alpha"; "beta"; "gamma"]; ["one"; "two"; "three"]]