c++ parsing bnf bnfc

BNF grammar for a simple C++ program example


I am trying to write a grammar for a simple C++ program.

This is what the grammar looks like right now:

PDefs. Program ::= [Def] ;
terminator Def "" ;
comment "//" ;
comment "/*" "*/" ;
comment "#" ;
DFun. Def ::= Type Id "(" [Arg] ")" "{" [Stm] "}" ;
separator Arg "," ;
terminator Stm "" ;
ADecl. Arg ::= Type Id ;
SExp. Stm ::= Exp ";" ;
SDecl. Stm ::= Type Id ";" ;
SDecls. Stm ::= Type Id "," [Id] ";" ;
SInit. Stm ::= Type Id "=" Exp ";" ;
SReturn. Stm ::= "return" Exp ";" ;
SWhile. Stm ::= "while" "(" Exp ")" Stm ;
SBlock. Stm ::= "{" [Stm] "}" ;
SIfElse. Stm ::= "if" "(" Exp ")" Stm "else" Stm ;


EInt. Exp15 ::= Integer ;
EDouble. Exp15 ::= Double ;
ETrue. Exp15 ::= "true" ;
EFalse. Exp15 ::= "false" ;
EId. Exp15 ::= Id ;
EApp. Exp15 ::= Id "(" [Exp] ")" ;
EPIncr. Exp14 ::= Exp15 "++" ;
EPDecr. Exp14 ::= Exp15 "--" ;
EIncr. Exp13 ::= "++" Exp14 ;
EDecr. Exp13 ::= "--" Exp14 ;
ETimes. Exp12 ::= Exp12 "*" Exp13 ;
EDiv. Exp12 ::= Exp12 "/" Exp13 ;
EPlus. Exp11 ::= Exp11 "+" Exp12 ;
EMinus. Exp11 ::= Exp11 "-" Exp12 ;
ELt. Exp9 ::= Exp9 "<" Exp10 ;
EGt. Exp9 ::= Exp9 ">" Exp10 ;
ELtEq. Exp9 ::= Exp9 "<=" Exp10 ;
EGtEq. Exp9 ::= Exp9 ">=" Exp10 ;
EEq. Exp8 ::= Exp8 "==" Exp9 ;
ENEq. Exp8 ::= Exp8 "!=" Exp9 ;
EAnd. Exp4 ::= Exp4 "&&" Exp5 ;
EOr. Exp3 ::= Exp3 "||" Exp4 ;
EAss. Exp2 ::= Exp3 "=" Exp2 ;

coercions Exp 15 ;
separator Exp "," ;
separator Id "," ;

Tbool. Type ::= "bool" ;
Tdouble. Type ::= "double" ;
Tint. Type ::= "int" ;
Tvoid. Type ::= "void" ;

token Id (letter (letter | digit | '_')*) ;

And this is the simple C++ program that needs to be parsed:

// a small C++ program
#include <iostream>

int main()
{
    std::cout << "Hello, world!" << std::endl;
    return 0;
}

When I try to parse it, I get an error on line 6, that is, the std::cout line. Since I am new to BNF, I do not know how to "think" about solving this. An example of how you would approach a situation like this would be great!

Thank you!


Solution

  • The line on which it fails cannot be parsed because you are missing some rules:

    1. You need a rule for parsing qualified ids.
      A qualified id is a special kind of identifier and can (for your purposes) be used wherever an (unqualified) identifier can.
      std::cout and std::endl are qualified ids, and a (simplified) rule for them could look something like this:

      <qualified_id> ::= <nested_name_specifier> <unqualified_id>
      <nested_name_specifier> ::= <namespace_name> "::" <nested_name_specifier>?
      

      in which (for your purposes) <unqualified_id> and <namespace_name> can be treated as identifiers.

    2. You need a rule for parsing an expression with the << operator.
      A (simplified) rule for this additional type of expression could look something like this:

      <shift_left_expression> ::= <other_expression>
      <shift_left_expression> ::= <shift_left_expression> "<<" <other_expression>
      

      in which (for your purposes) <other_expression> stands for any other type of expression.

    3. You need a rule for parsing string literals.
      A string literal is a kind of literal, and (for your purposes) it can be used as part of an expression, just like an identifier.
      "Hello, world!" is a string literal, and a (simplified) rule for them could look something like this:

      <string_literal> ::= "\"" <s_char_sequence>? "\""
      <s_char_sequence> ::= <s_char>
      <s_char_sequence> ::= <s_char_sequence> <s_char>
      

      in which <s_char> is any character that you want to allow inside a string literal (to keep it simple, don't allow the " character in there, for example).
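
Translated into the LBNF notation that your grammar already uses, the three rules above might be sketched as follows. This is only a sketch under some assumptions: the category name Qual and the labels QName, QNest, EQual, EShl and EStr are made up for illustration, String is BNFC's predefined string-literal token (used here instead of spelling out <s_char>), and "<<" is placed at Exp10 because C++ gives it a precedence between "+" and "-" (your Exp11) and the relational operators (your Exp9), a level that your grammar currently reaches only through coercions.

```
-- 1. Qualified ids such as std::cout. At least one "::" is required,
--    so a plain Id is still handled by the existing EId rule.
QName. Qual  ::= Id "::" Id ;
QNest. Qual  ::= Id "::" Qual ;
EQual. Exp15 ::= Qual ;

-- 2. The << operator, left-recursive like the other binary operators.
EShl.  Exp10 ::= Exp10 "<<" Exp11 ;

-- 3. String literals, using the predefined String token (assumption:
--    its default lexical rules are acceptable for your purposes).
EStr.  Exp15 ::= String ;
```

With these additions, the statement std::cout << "Hello, world!" << std::endl; should fall under your existing SExp rule, with the expression parsed as a left-nested chain of EShl nodes. After adding the rules, it is worth checking BNFC's output for any reported conflicts.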