compiler-constructionbootstrapping

Writing a compiler in its own language


Intuitively, it would seems that a compiler for language Foo cannot itself be written in Foo. More specifically, the first compiler for language Foo cannot be written in Foo, but any subsequent compiler could be written for Foo.

But is this actually true? I have some very vague recollection of reading about a language whose first compiler was written in "itself". Is this possible, and if so how?


Solution

  • This is called "bootstrapping". You must first build a compiler (or interpreter) for your language in some other language (usually Java or C). Once that is done, you can write a new version of the compiler in language Foo. You use the first bootstrap compiler to compile the compiler, and then use this compiled compiler to compile everything else (including future versions of itself).

    Most languages are indeed created in this fashion, partially because language designers like to use the language they are creating, and also because a non-trivial compiler often serves as a useful benchmark for how "complete" the language may be.

    An example of this would be Scala. Its first compiler was created in Pizza, an experimental language by Martin Odersky. As of version 2.0, the compiler was completely re-written in Scala. From that point on, the old Pizza compiler could be completely discarded, due to the fact that the new Scala compiler could be used to compile itself for future iterations.