ruststack-overflowrecursive-datastructuresnom

Proper way of parsing recursive enum with Nom


I want to parse recursive enum representing arithmetic expressions using Nom. I have the following code.

#[derive(Debug, PartialEq, Eq)]
enum ArithExp {
    Const(i32),
    Add(Rc<ArithExp>, Rc<ArithExp>),
}

#[allow(dead_code)]
fn parse_ae_const(input: &str) -> IResult<&str, ArithExp> {
    let (input, (numb, _)) = pair(digit1, multispace0)(input)?;
    println!("ok: {}", numb);
    Ok((input, ArithExp::Const(numb.parse::<i32>().unwrap())))
}

#[allow(dead_code)]
fn parse_ae_add(input: &str) -> IResult<&str, ArithExp> {
    let (input, (exp1, exp2)) = terminated(
        separated_pair(parse_arithexp, tag(" + "), parse_arithexp),
        multispace0,
    )(input)?;
    Ok((input, ArithExp::Add(Rc::new(exp1), Rc::new(exp2))))
}

#[allow(dead_code)]
fn parse_arithexp(input: &str) -> IResult<&str, ArithExp> {
    alt((
        parse_ae_add,
        parse_ae_const,
    ))(input)
}

#[test]
fn arithexp1() {
    assert_eq!(parse_ae_const("3"), Ok(("", ArithExp::Const(3))));
}

#[test]
fn arithexp2() {
    assert_eq!(parse_arithexp("3"), Ok(("", ArithExp::Const(3))));
}

arithexp1 passes just fine, arithexp2 causes stack overflow on the other hand:

thread 'parser::tests::arithexp2' has overflowed its stack
fatal runtime error: stack overflow

It looks that parse_ae_add causes a problem what may indicate that I'm parsing recursive enum incorrectly but I have literally no idea how to fix it.


Solution

  • As kmdreko's comment outlines, this is because parse_arithexp calls parse_ae_add, which immediately calls parse_arithexp on the same input. This is called left-recursion, and is something that cannot be directly evaluated by recursive descent parsers (like this one). Instead, you must add a 'base case': instead of trying to parse ae_add -> ae_add " + " ae_add directly, you can parse ae_add -> ae_const " + " ae_add and achieve the same thing:

    #[allow(dead_code)]
    fn parse_ae_add(input: &str) -> IResult<&str, ArithExp> {
        let (input, (exp1, exp2)) = terminated(
            separated_pair(parse_ae_const, tag("+ "), parse_arithexp),
            multispace0,
        )(input)?;
        Ok((input, ArithExp::Add(Rc::new(exp1), Rc::new(exp2))))
    }
    

    (Note: tag(" + ") was changed to tag("+ "), because the parse_ae_const parser consumes any whitespace after it.)