I want to write a Lex rule that flags invalid identifiers. I tried the rule below, but it is not working. An identifier that starts with a digit must be reported as an error; there may be other invalid cases as well.
[0-9][a-zA-Z]*    { fprintf(yyout, "ERROR IDENTIFIER\n"); printf("%s: ERROR IDENTIFIER\n", yytext); }
First of all: Welcome to StackOverflow.
Your rule should be:
[0-9]+[a-zA-Z]+
because you need at least one digit and at least one letter.
Currently your rule [0-9][a-zA-Z]* matches things like 0, 7, 4Hello, ... because * means zero-or-more.
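To see the corrected rule in context, here is a minimal sketch of a complete scanner, assuming flex is the Lex implementation in use (the `%option noyywrap` line is flex-specific). The IDENTIFIER, NUMBER, whitespace, and catch-all rules are my own assumptions added for illustration; they are not from your question. Because Lex always picks the longest match, 4Hello is consumed whole by the error rule instead of being split into the number 4 and the identifier Hello.

```lex
%option noyywrap
%{
#include <stdio.h>
%}

%%
[0-9]+[a-zA-Z]+          { printf("%s: ERROR IDENTIFIER\n", yytext); }
[a-zA-Z_][a-zA-Z0-9_]*   { printf("%s: IDENTIFIER\n", yytext); /* assumed definition of a valid identifier */ }
[0-9]+                   { printf("%s: NUMBER\n", yytext);     /* assumed definition of a number */ }
[ \t\n]+                 { /* skip whitespace */ }
.                        { printf("%s: UNKNOWN CHARACTER\n", yytext); }
%%

int main(void)
{
    yylex();   /* reads from stdin until end of input */
    return 0;
}
```

If this is saved as scanner.l (a hypothetical file name), it can be built with `flex scanner.l && cc lex.yy.c -o scanner`. Feeding it the line `4Hello abc 42` should report 4Hello as ERROR IDENTIFIER, abc as IDENTIFIER, and 42 as NUMBER.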
Typically, invalid-token definitions are added for better error reporting. I'm wondering whether that is what you explicitly intend to do. Normally, when you start a new grammar (which I assume you are, since your question is about basic Lex rules), you specify only the valid tokens and let the Lex and Yacc error handling catch wrong input.
So, if you did not intend to explicitly improve error reporting, please delete this rule and only add rules for valid tokens (for now).