
Symbol table in a dynamic type-checking compiled language


I'm trying to create a compiled language with dynamic type checking, and I'm a little confused about the following claims:

- A compiled language always has static type checking.
- The phases of any compiler must always run in the same order. For example, the symbol table must be created during the lexical analysis phase and must be connected to every other phase, as in the following diagram.

[Diagram of compiler phases connected to the symbol table; image not included]

Are the above statements true?
And the real question is: at which phase should the symbol table be created for this language?


Solution

  • No, none of these are true. Let's take each in turn:

    A compiled language always has static type checking

    For starters, I'd push back against the term "compiled language." There are some languages that are often compiled and some languages that are often interpreted, but there isn't a hard and fast line between them. For example, some JavaScript implementations work by interpreting parts of the code and compiling others. Similarly, in the 1980s and 1990s some universities used to teach C using a C interpreter that made it easier to debug and introspect on what was going on, even though C is almost always compiled.

    With that in mind, no, it's not the case that all compiled languages have static type-checking. You could write a compiler for Python that generates code that, at runtime, figures out the types of the operands of each operation and reports an error if those types aren't permitted for that operation.
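
    As a rough illustration of that idea, here is a minimal sketch of a "compiler" that checks nothing at compile time and instead emits code that checks operand types when it runs. The tuple-based AST, the `compile_expr` name, and the rule that `add` accepts two numbers or two strings are all assumptions made up for this example, not part of any real language.

```python
# Hypothetical mini-compiler: turns a tiny expression AST into a Python closure.
# Nothing is type-checked here at compile time; the generated closure performs
# the check each time it is executed.

def compile_expr(node):
    kind = node[0]
    if kind == "const":                  # ("const", value)
        value = node[1]
        return lambda env: value
    if kind == "var":                    # ("var", name)
        name = node[1]
        return lambda env: env[name]
    if kind == "add":                    # ("add", left_expr, right_expr)
        left, right = compile_expr(node[1]), compile_expr(node[2])

        def run_add(env):
            a, b = left(env), right(env)
            # Runtime type check emitted by the compiler (assumed rule:
            # two numbers or two strings are the only legal operand pairs).
            both_numbers = isinstance(a, (int, float)) and isinstance(b, (int, float))
            both_strings = isinstance(a, str) and isinstance(b, str)
            if not (both_numbers or both_strings):
                raise TypeError(
                    f"cannot add {type(a).__name__} and {type(b).__name__}"
                )
            return a + b

        return run_add
    raise ValueError(f"unknown node kind: {kind}")


# "Compile" x + 1 once, then run it with different runtime values of x.
program = compile_expr(("add", ("var", "x"), ("const", 1)))
print(program({"x": 41}))                # 42
try:
    program({"x": "hello"})
except TypeError as err:
    print("runtime type error:", err)    # cannot add str and int
```

    The same compiled `program` succeeds or fails depending only on the values it sees at runtime, which is what dynamic type checking in a compiled setting amounts to.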

    It's often the case that if a language is designed with the expectation that it will be compiled, then the language will use static type-checking. The main reason for this is that static type-checking requires doing some global analysis of the code to make sure the types check, which has an up-front cost to it. If you're already planning on spending up-front time processing code (say, if you're compiling it), that's not a huge problem. However, if you're making an interpreter, then the cost of that analysis is paid every time the program starts, which slows down startup; that is one reason static type-checking tends to go together with compilation.

    The phases of any compiler must always run in the same order. For example, the symbol table must be created during the lexical analysis phase and must be connected to every other phase, as in the following diagram.

    No, that's not the case either. Although these phases of compilation are often taught as being distinct, in many compilers they're all blended together or potentially subdivided even further. For example, a compiler might start doing optimizations during parsing if it recognizes that certain operations can be eliminated or folded away, or might first translate the code into another language before doing any semantic analysis.
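
    To make the "optimizing during parsing" case concrete, here is a small sketch of a recursive-descent parser that folds constant subexpressions while it builds the tree. The grammar (integers, identifiers, `+`, `*`, parentheses) and the names `FoldingParser` and `make_binop` are assumptions chosen for illustration.

```python
import re


def tokenize(src):
    # Integers, identifiers, and the operators/parentheses of the toy grammar.
    return re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", src)


class FoldingParser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def make_binop(self, op, left, right):
        # Optimization performed during parsing: if both operands are already
        # known integers, compute the result instead of building a tree node.
        if isinstance(left, int) and isinstance(right, int):
            return left + right if op == "+" else left * right
        return (op, left, right)

    def parse_sum(self):
        node = self.parse_product()
        while self.peek() == "+":
            self.next()
            node = self.make_binop("+", node, self.parse_product())
        return node

    def parse_product(self):
        node = self.parse_atom()
        while self.peek() == "*":
            self.next()
            node = self.make_binop("*", node, self.parse_atom())
        return node

    def parse_atom(self):
        tok = self.next()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = self.parse_sum()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return node
        return int(tok) if tok.isdigit() else ("var", tok)


def parse(src):
    return FoldingParser(tokenize(src)).parse_sum()


print(parse("2 * 3 + 4"))    # 10
print(parse("x + 2 * 3"))    # ('+', ('var', 'x'), 6)
```

    Here parsing and one optimization happen in a single pass, so there is no separate "optimization phase" for this particular transformation.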

    (Also, symbol tables typically are not generated during lexical analysis - that would probably be during either syntax analysis or semantic analysis, once the global structure of the program has been figured out.)

    And the real question is when (which phase) the symbol table must be created for this language?

    That's really up to you to decide. Semantic analysis is likely a good place to do this, since at that point you'd have the whole program structure available to you, though it could conceivably be folded into parsing and generated as the AST is built up.
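
    If you go the semantic-analysis route, the pass could look roughly like the sketch below: it walks a finished AST, declares names in nested scopes, and resolves each variable use to a (scope depth, slot) pair. Note that for a dynamically typed language the table only needs names and locations, not types. The node shapes (`let`, `var`, `block`, `const`) and the `Scope` and `analyze` names are hypothetical.

```python
class Scope:
    """One level of the symbol table; scopes are chained through `parent`."""

    def __init__(self, parent=None):
        self.parent = parent
        self.symbols = {}                # name -> slot index within this scope

    def declare(self, name):
        if name in self.symbols:
            raise NameError(f"'{name}' is already declared in this scope")
        self.symbols[name] = len(self.symbols)

    def resolve(self, name):
        # Walk outward through enclosing scopes until the name is found.
        scope, depth = self, 0
        while scope is not None:
            if name in scope.symbols:
                return depth, scope.symbols[name]
            scope, depth = scope.parent, depth + 1
        raise NameError(f"'{name}' is not declared")


def analyze(node, scope, resolved):
    kind = node[0]
    if kind == "let":                    # ("let", name, value_expr)
        analyze(node[2], scope, resolved)
        scope.declare(node[1])
    elif kind == "var":                  # ("var", name)
        resolved.append((node[1], scope.resolve(node[1])))
    elif kind == "block":                # ("block", [statements])
        inner = Scope(scope)
        for stmt in node[1]:
            analyze(stmt, inner, resolved)
    elif kind == "const":                # ("const", value)
        pass
    else:
        raise ValueError(f"unknown node kind: {kind}")


ast = ("block", [
    ("let", "x", ("const", 1)),
    ("block", [
        ("let", "y", ("var", "x")),
        ("var", "y"),
    ]),
])
resolved = []
analyze(ast, Scope(), resolved)
print(resolved)    # [('x', (1, 0)), ('y', (0, 0))]
```

    A later code-generation step could use the (depth, slot) pairs directly instead of looking names up again, while value types are still checked only at runtime.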

    Depending on the language semantics and scoping rules, you may need to completely defer this to runtime. For example, if a language can create variables whose lifetime extends indefinitely and can choose names for those variables at runtime, then you'd need some sort of dynamic table of what exists at any moment in time to look things up. But not all dynamically-typed languages do this, so this might not be necessary.
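
    For the fully dynamic case, the "symbol table" is just a data structure that lives in the running program. A minimal sketch, assuming a hypothetical `RuntimeEnv` class that the compiled code would call into:

```python
class RuntimeEnv:
    """A symbol table that exists at runtime rather than inside the compiler."""

    def __init__(self):
        self.table = {}                  # name -> current value

    def define(self, name, value):
        self.table[name] = value

    def lookup(self, name):
        if name not in self.table:
            raise NameError(f"'{name}' is not defined")
        return self.table[name]


env = RuntimeEnv()
for i in range(3):
    # The variable names are computed while the program runs, so no
    # compile-time table could have listed them in advance.
    env.define(f"counter_{i}", i * 10)

print(env.lookup("counter_2"))    # 20
```

    A compiler for such a language would emit calls like `define` and `lookup` wherever it cannot resolve a name statically.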

    Hope this helps!