I have written a lex/bison parser in which lex returns named tokens with rules like a {return Tok_A;}, and yacc declares the token with %token Tok_A, followed by the grammar. Everything works fine: if the input string is valid, the parser accepts it. Now I am trying to make a more general parser that uses the alphabet directly in lex. For some reason yacc reports an invalid token when I send it the "a" character:
//parser.l
%{
#include "parser4.tab.h"
%}
%%
[a-h] {return *yytext;}
\n {return 0;} /* EOF */
%%
//parser.y
%{
#include <stdio.h>
extern void yyerror(char *);
extern int yylex(void);
#define YYDEBUG 1
%}
%token a
%%
S : a {printf("S->a");}
%%
int main(void)
{
#if YYDEBUG
yydebug = 1;
#endif
if(!yyparse())
printf("End of input reached\n");
return 0;
}
void yyerror (char *s)
{
/* fprintf (stderr, "%s\n", s); */
printf("Incorrect derivation!\n");
}
When I compile and run the program and give it the input a, the output is:
Starting parse
Entering state 0
Stack now 0
Reading a token
a
Next token is token "invalid token" ()
Incorrect derivation!
Cleanup: discarding lookahead token "invalid token" ()
Stack now 0
I think the problem lies in the lex rule that returns *yytext. If I understand correctly, yacc and lex communicate through parser.tab.h, which defines the mapping from integer codes to token names. Named tokens are numbered from 257 upward, while 0-255 are used for plain characters. So should I somehow translate the token in lex to ASCII? I thought that when lex sends the "a" character directly, bison/yacc would understand it.
When you declare %token a it defines a as a name for a token, which you could return from lex. But that is not the same as the character 'a'. If you want to use the character 'a' as a token in the grammar, you DON'T need to declare it, but you DO need single quotes around it, as 'a' and not a.
In your case, change the yacc grammar rule to
S : 'a' {printf("S->a");}
and it will work.