
Describing C with an EBNF grammar


This is a grammar in EBNF to describe C:

stmt->  (CASE CONST ‘:’)* expression ‘;’
      | (CASE CONST ‘:’)* IF ‘(’ expression ‘)’ stmt [ELSE stmt]
      | (CASE CONST ‘:’)* WHILE ‘(’ expression ‘)’ stmt
      | (CASE CONST ‘:’)* SWITCH ‘(’ expression ‘)’ stmt
      | (CASE CONST ‘:’)* RETURN [expression] ‘;’
      | (CASE CONST ‘:’)* BREAK ‘;’
      | (CASE CONST ‘:’)* CONTINUE ‘;’
      | ‘{’ (stmt)* ‘}’

I want to modify the above with these restrictions:

  1. The CASE label can only be used on commands that are directly contained in a SWITCH command, not inside IF or WHILE commands nested within it.
  2. The BREAK command can only appear inside a WHILE command or a SWITCH command, including commands nested within them.
  3. The CONTINUE command can only appear inside a WHILE command, including commands nested within it.

My answer:

stmt-> ( CONST ‘:’)* expression ‘;’
     | ( CONST ‘:’)* IF ‘(’ expression ‘)’ stmt [ELSE stmt]
     | (CASE CONST ‘:’)* SWITCH ‘(’ expression ‘)’ stmt
     | ( CONST ‘:’)* WHILE ‘(’ expression ‘)’ stmt
     | (CASE CONST ‘:’)* BREAK ‘;’
     | (CASE CONST ‘:’)* CONTINUE ‘;’
     | ( CONST ‘:’)* RETURN [expression] ‘;’
     | ‘{’ (stmt)* ‘}’

Is this right?


Solution

  • The task you face is quite complicated. As an approach (I don't think there's enough room here to explain it fully), you can start by writing two similar stmt nonterminals: one that accepts case labels and one that doesn't. Both are derived from the original stmt you posted.

    stmt_w_case ->  (CASE CONST ‘:’)* expression ‘;’
          | (CASE CONST ‘:’)* IF ‘(’ expression ‘)’ stmt [ELSE stmt]
          | (CASE CONST ‘:’)* WHILE ‘(’ expression ‘)’ stmt
          | (CASE CONST ‘:’)* SWITCH ‘(’ expression ‘)’ stmt
          | (CASE CONST ‘:’)* RETURN [expression] ‘;’
          | (CASE CONST ‘:’)* BREAK ‘;’
          | (CASE CONST ‘:’)* CONTINUE ‘;’
          | ‘{’ (stmt)* ‘}’
    
    stmt_wo_case ->  expression ‘;’
          | IF ‘(’ expression ‘)’ stmt [ELSE stmt]
          | WHILE ‘(’ expression ‘)’ stmt
          | SWITCH ‘(’ expression ‘)’ stmt
          | RETURN [expression] ‘;’
          | BREAK ‘;’
          | CONTINUE ‘;’
          | ‘{’ (stmt)* ‘}’
    

    Now, you said that you wanted stmt_w_case inside switch statements only, so the body of the switch statement should be changed into stmt_w_case, while all the other nested statements must be changed into stmt_wo_case:

    stmt_w_case ->  (CASE CONST ‘:’)* expression ‘;’
          | (CASE CONST ‘:’)* IF ‘(’ expression ‘)’ stmt_wo_case [ELSE stmt_wo_case]
          | (CASE CONST ‘:’)* WHILE ‘(’ expression ‘)’ stmt_wo_case
          | (CASE CONST ‘:’)* SWITCH ‘(’ expression ‘)’ stmt_w_case
          | (CASE CONST ‘:’)* RETURN [expression] ‘;’
          | (CASE CONST ‘:’)* BREAK ‘;’
          | (CASE CONST ‘:’)* CONTINUE ‘;’
          | ‘{’ (stmt_w_case)* ‘}’
    
    stmt_wo_case ->  expression ‘;’
          | IF ‘(’ expression ‘)’ stmt_wo_case [ELSE stmt_wo_case]
          | WHILE ‘(’ expression ‘)’ stmt_wo_case
          | SWITCH ‘(’ expression ‘)’ stmt_w_case
          | RETURN [expression] ‘;’
          | BREAK ‘;’
          | CONTINUE ‘;’
          | ‘{’ (stmt_wo_case)* ‘}’
    

    (See how stmt_wo_case propagates its condition to the statements embedded between the braces { and }, and similarly for stmt_w_case.)

    Then you can say:

    stmt -> stmt_wo_case
    

    and your grammar is ready (but you'll probably run into trouble later, see below).
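As a way to check the two nonterminals above against sample inputs, here is a minimal recursive-descent recognizer in Python. It is only a sketch under some assumptions that are not part of the question: the input is already tokenized into space-separated names, a whole expression is reduced to a single hypothetical `EXPR` token, and the names `Parser`, `ParseError` and `accepts` are made up for illustration. The boolean `allow_case` stands in for the choice between stmt_w_case (`True`) and stmt_wo_case (`False`).

```python
class ParseError(Exception):
    """Raised when the token stream does not match the grammar."""


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise ParseError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def condition(self):
        # '(' expression ')' with the expression simplified to one EXPR token.
        self.expect("(")
        self.expect("EXPR")
        self.expect(")")

    def stmt(self, allow_case):
        # allow_case=True behaves like stmt_w_case, False like stmt_wo_case.
        if self.peek() == "{":
            # The block rule takes no labels and passes its own mode down.
            self.pos += 1
            while self.peek() != "}":
                self.stmt(allow_case)
            self.pos += 1
            return
        while self.peek() == "CASE":
            if not allow_case:
                raise ParseError("CASE label outside the direct body of a SWITCH")
            self.pos += 1
            self.expect("CONST")
            self.expect(":")
        token = self.peek()
        if token == "IF":
            self.pos += 1
            self.condition()
            self.stmt(False)          # IF branches are stmt_wo_case
            if self.peek() == "ELSE":
                self.pos += 1
                self.stmt(False)
        elif token == "WHILE":
            self.pos += 1
            self.condition()
            self.stmt(False)          # WHILE body is stmt_wo_case
        elif token == "SWITCH":
            self.pos += 1
            self.condition()
            self.stmt(True)           # only the SWITCH body is stmt_w_case
        elif token == "RETURN":
            self.pos += 1
            if self.peek() == "EXPR":
                self.pos += 1
            self.expect(";")
        elif token in ("BREAK", "CONTINUE"):
            self.pos += 1
            self.expect(";")
        else:
            self.expect("EXPR")
            self.expect(";")


def accepts(source):
    """Return True if the whole input is one stmt (stmt -> stmt_wo_case)."""
    parser = Parser(source.split())
    try:
        parser.stmt(allow_case=False)
    except ParseError:
        return False
    return parser.pos == len(parser.tokens)


print(accepts("SWITCH ( EXPR ) { CASE CONST : EXPR ; CASE CONST : BREAK ; }"))  # True
print(accepts("SWITCH ( EXPR ) { IF ( EXPR ) CASE CONST : EXPR ; }"))           # False
```

Carrying the mode as a parameter is the same idea as the two nonterminals: each call site picks which variant of stmt it expands, exactly as each right-hand side above picks stmt_w_case or stmt_wo_case.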

    In the case of the break statement, you should do the same with the new grammar, but be careful: here a break can be nested arbitrarily deep inside the statements of an if statement or similar, so the permission has to be propagated through every nested statement. Each of the rules we have just forked in two now needs a stmt_w_case_no_break and a stmt_w_case_w_break variant (and the same for stmt_wo_case...). Do you see where this leads? Every time we want a rule both with and without some feature, we double the number of rules, so the grammar grows exponentially with the number of decisions of this type you make.
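To make the doubling concrete, the following small Python sketch lists the stmt variants you would have to write for a given set of with/without decisions. The helper `variant_names` and the `w`/`wo` naming scheme are assumptions chosen for illustration; they only extend the stmt_w_case / stmt_wo_case convention used above.

```python
from itertools import product


def variant_names(features):
    """List one nonterminal name per combination of with/without choices."""
    names = []
    for choice in product(("w", "wo"), repeat=len(features)):
        parts = [f"{flag}_{feature}" for flag, feature in zip(choice, features)]
        names.append("stmt_" + "_".join(parts))
    return names


# Two decisions (case, break) already give four nonterminals.
for name in variant_names(["case", "break"]):
    print(name)

# Adding continue as a third decision doubles the count again.
print(len(variant_names(["case", "break", "continue"])))  # 8
```

Each of those nonterminals repeats all eight alternatives of the original stmt rule, so three restrictions already mean eight copies of the rule to keep in sync.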