cparsingcompiler-constructionlexical-analysis

How to make C language context-free?


I know that C is not a context-free language, a famous example is:

int foo;
typedef int foo;
foo x;

In this case the lexer doesn't know, whether foo in the 3rd line, is an identifier, or typedef.

My question is, is this the only reason that makes C a Context-Sensitive Language?

I mean, if we get rid of typedef, would it become context-free language? Or there are other reasons (examples) that prevent it from being so?


Solution

  • The post-processor C syntax can be parsed with a classical lex + yacc combo. The lexer definition and the yacc grammar are freely available at

    http://www.quut.com/c/ANSI-C-grammar-l-2011.html and http://www.quut.com/c/ANSI-C-grammar-y-2011.html

    As you can see from the lex file, it's straightforward except for the context-sensitive check_type() (and comment(), but comment processing technically belongs to the preprocessor), which makes typedef the only source of context-sensitivity there. Since the yacc file doesn't contain any context-sensitivity introducing tricks either, a typedef-less C would be a perfectly context-free syntax.

    The subsequent typechecking of C (matching declarations with use sites) is context sensitive, so you could say that overall, C is context sensitive.