parsinglanguage-designcompiler-constructionintermediate-code

Generating intermediate code in a compiler. Is an AST or parse tree always necessary when dealing with conditionals?


I'm taking a compiler-design class where we have to implement our own compiler (using flex and bison). I have had experience in parsing (writing EBNF's and recursive-descent parsers), but this is my first time writing a compiler.

The language design is pretty open-ended (the professor has left it up to us). In class, the professor went over generating intermediate code. He said that it is not necessary for us to construct an Abstract Syntax Tree or a parse tree while parsing, and that we can generate the intermediate code as we go.

I found this confusing for two reasons:

I planned on generating an AST and then walking the tree after I create it, to resolve the addresses of functions and branch targets. Is this correct or am I missing something?


Solution

  • The general solution to both of your issues is to keep a list of addresses that need to be "patched." You generate the code and leave holes for the missing addresses or offsets. At the end of the compilation unit, you go through the list of holes and fill them in.

    In FORTH the "list" of patches is kept on the control stack and is unwound as each control structure terminates. See FORTH Dimensions

    Anecdote: an early Lisp compiler (I believe it was Lisp) generated a list of machine code instructions in symbolic format with forward references to the list of machine code for each branch of a conditional. Then it generated the binary code walking the list backwards. This way the code location for all forward branches was known when the branch instruction needed to be emitted.