I'm currently building a multi-pass compiler for a block-structured language language. I have built a scope stack in semantic analysis phase. When entering a new scope, create a new table, push it to the stack and make it current scope table, then all symbols inside this scope are inserted into current table. When leaving a scope, the current table is recorded in the AST node, then pop it from the scope stack.
This way, in code generation phase, it does not have to build the symbol table all over again. Once it enters a new scope, it can simplely get the table from AST node and then push it to scope stack. I think this is the way that most of the compiler text books have recommended.
In most cases this works just fine, however, there is a corner case I don't know how to properly deal with. Consider the following code example:
int a = 1;
int b = 2;
void test()
{
int a = b;
int b = 3;
}
It has two scopes: the global scope and the test()'s scope. So, to do code generation for test(), we have to:
Now, when dealing with "int a = b;", it would find the local vaiable b from the scope stack, which is obviously not correct since local b is not declared yet.
Any idea how to deal with this problem? Do I have to destroy all the symbol when leaving a scope and build the symbol table all over again in code generation phase?
Thanks guys!
One solution to this problem it to let the AST node for an identifier contain a link to the specific symbol found in the symbol table at the time of its creation. Under the assumption that the program text is parsed sequentially from beginning to end, one statement at the time, this will give the correct symbol. This also removes the need for doing repeated lookup in the symbol table.