Using the parser combinator library nom, how can I write a parser that distinguishes the two roles of the minus sign in the expressions 1-2 and 1*-2?
In the first example, I expect the tokens 1, - and 2. In the second, the minus sign marks the number as negative, so the expected tokens are 1, * and -2, not 1, *, - and 2.
How can I make nom stateful, with user-defined state such as expect_literal: bool?
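The best solution I have found so far is to use nom_locate with a span type that carries the state in its extra field:

    use nom::{
        branch::alt,
        bytes::complete::{tag, take_while},
        combinator::{map, opt},
        error::ErrorKind,
        IResult,
    };
    use nom_locate::LocatedSpanEx;

    // User-defined lexer state, carried along inside every span.
    #[derive(Clone, PartialEq, Debug)]
    struct LexState {
        pub accept_literal: bool,
    }

    type Span<'a> = LocatedSpanEx<&'a str, LexState>;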
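The state can then be updated with a helper that wraps a parser result:

    fn set_accept_literal(
        value: bool,
        code: IResult<Span, TokenPayload>,
    ) -> IResult<Span, TokenPayload> {
        match code {
            // On success, update the flag on the remaining input.
            Ok((mut rest, payload)) => {
                rest.extra.accept_literal = value;
                Ok((rest, payload))
            }
            // Errors pass through unchanged.
            err => err,
        }
    }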
where TokenPayload is an enum representing my token content.
Now you can write the operator parser:
    fn mathematical_operators(code: Span) -> IResult<Span, TokenPayload> {
        // After any operator, a (possibly signed) literal may follow.
        set_accept_literal(
            true,
            alt((
                map(tag("*"), |_| TokenPayload::Multiply),
                map(tag("/"), |_| TokenPayload::Divide),
                map(tag("+"), |_| TokenPayload::Add),
                map(tag("-"), |_| TokenPayload::Subtract),
                map(tag("%"), |_| TokenPayload::Remainder),
            ))(code),
        )
    }
And the integer parser as:
    fn parse_integer(code: Span) -> IResult<Span, TokenPayload> {
        let chars = "1234567890";
        // Optional leading minus sign.
        let (code, sign) = opt(tag("-"))(code)?;
        let sign = sign.is_some();
        // A sign is only allowed where a literal is acceptable;
        // otherwise fail so the "-" can be lexed as an operator.
        if sign && !code.extra.accept_literal {
            return Err(nom::Err::Error((code, ErrorKind::IsNot)));
        }
        let (code, slice) = take_while(move |c| chars.contains(c))(code)?;
        match slice.fragment.parse::<i32>() {
            // Directly after a number, a "-" must be an operator.
            Ok(value) => set_accept_literal(
                false,
                Ok((code, TokenPayload::Int32(if sign { -value } else { value }))),
            ),
            Err(_) => Err(nom::Err::Error((code, ErrorKind::Tag))),
        }
    }
This might not win a beauty contest, but it works. The remaining pieces should be trivial.
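To make the effect of the accept_literal flag easier to see in isolation, here is a minimal sketch of the same state machine written as a hand-rolled lexer using only the standard library. It is not the nom-based code above: the Token enum and the lex function are hypothetical names invented for this illustration, only three operators are covered, and skipping spaces is an added assumption.

```rust
// Hypothetical stand-in for TokenPayload, reduced to what the example needs.
#[derive(Debug, PartialEq)]
enum Token {
    Int32(i32),
    Add,
    Subtract,
    Multiply,
}

// Hand-rolled lexer (no nom) that mirrors the accept_literal idea:
// a '-' is a sign only where a literal is acceptable.
fn lex(input: &str) -> Result<Vec<Token>, String> {
    let bytes = input.as_bytes();
    let mut tokens = Vec::new();
    let mut pos = 0;
    // At the start of the input a (possibly signed) literal is acceptable.
    let mut accept_literal = true;

    while pos < bytes.len() {
        let c = bytes[pos];
        if c == b' ' {
            // Assumption for this sketch: spaces are skipped.
            pos += 1;
            continue;
        }
        // Treat '-' as a sign only if a literal is acceptable here
        // and a digit follows immediately.
        let signed = c == b'-'
            && accept_literal
            && bytes.get(pos + 1).map_or(false, |b| b.is_ascii_digit());
        if c.is_ascii_digit() || signed {
            let start = pos;
            pos += 1;
            while pos < bytes.len() && bytes[pos].is_ascii_digit() {
                pos += 1;
            }
            let value = input[start..pos]
                .parse::<i32>()
                .map_err(|e| format!("bad integer at byte {}: {}", start, e))?;
            tokens.push(Token::Int32(value));
            // Directly after a number, '-' must be the binary operator.
            accept_literal = false;
        } else {
            tokens.push(match c {
                b'+' => Token::Add,
                b'-' => Token::Subtract,
                b'*' => Token::Multiply,
                other => {
                    return Err(format!("unexpected {:?} at byte {}", other as char, pos))
                }
            });
            pos += 1;
            // After an operator, a signed literal is acceptable again.
            accept_literal = true;
        }
    }
    Ok(tokens)
}
```

Calling lex("1-2") should give Int32(1), Subtract, Int32(2), while lex("1*-2") should give Int32(1), Multiply, Int32(-2). In the nom version the same decision is made by parse_integer failing when the flag is false, so that the operator parser gets to consume the minus sign instead.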