compilationinterpretercontext-free-grammarautomataturing-machines

How can a programming language that is specified using a context-free grammar, be capable of expressing a Turing Machine?


I've been getting into Automata theory, compilers and the fundamentals of CS, but there is something fundamental that I don't understand.

I have seen the Chomsky Hierarchy of languages where different classes of languages that have different expressive power are "associated" with an equivalently powerful automaton.

From Wikipedia :

GRAMMAR LANGUAGE AUTOMATON

I've seen that every programming language are Turing Complete and that the grammar specifications of programming languages (formalised in BNF, etc..) can be expressed as a Context-free Grammar.

Context-free grammars dont have an "associated" Turing Machine as equivalent.

During interpretation / compilation, the string of the source code of a program written in a programming language (like C, python, etc..) is parsed/translated into an Abstract Syntax Tree.

(As I understand, this is like extracting an array from a string when matching the string against a regular expression, except that the pattern here is not a regular expression, it is a context-free grammar, which is more powerful, hence the tree structure extracted which contain more information that a linear array (coming from capture groups of a regex).)

So the program written, potentially implementing a Turing Machine, is converted into an Abstract Syntax Tree, and all the information contained into the original program is now incorporated into the tree. And later, during execution, the program will accompished some computation that can be as complex as a Turing Machine.

My question is : How can a string expressed within the confines of the rules dictated by what a Context-free Grammar can be, be implementing a Turing Machine while the equivalence grammar/language/automata and the Chomsky Hierarchy say a Context-free Grammar isn't expressive enough to do so ?

Is one of my assumptions wrong ? Or is the fact that memory plays a role in this, and that there is a theorem that says something like : a Turing Machine can be implemented "using" a Tree + a Stack ?

This is really bugging me.

Anything that can enlighten me is really appreciated !

EDIT :

Here's a DUPLICATE of my question :

chomsky hierarchy and programming languages

Why I mistakenly thought that the syntax specification of a programming language defines its semantics ?

Because of what YACC does : (syntax-directed translation)

https://en.wikipedia.org/wiki/Syntax-directed_translation

which associates the rules of the context-free grammar used to parse the programming language (which is used to make the abstract syntax tree) with an action. This is the source of my confusion.

For example, here's a copy paste of an extract of the source code of the perl5 interpreter. This is the file perly.y which is used to by yacc to make the first pass of compilation.

   /* Binary operators between terms */
    termbinop:  term[lhs] ASSIGNOP term[rhs]   /* $x = $y, $x += $y */
                { $$ = newASSIGNOP(OPf_STACKED, $lhs, $ASSIGNOP, $rhs); }

        |   term[lhs] POWOP term[rhs]           /* $x ** $y */
            { $$ = newBINOP($POWOP, 0, scalar($lhs), scalar($rhs)); }

        |   term[lhs] MULOP term[rhs]           /* $x * $y, $x x $y */
                {   if ($MULOP != OP_REPEAT)
                    scalar($lhs);
                    $$ = newBINOP($MULOP, 0, $lhs, scalar($rhs));
                }

This shows a one to one correspondence between a derivation rule and its associated action (what is inside curly brackets).


Solution

  • The 'level' of grammar you use to define a language determines the automaton required to recognize (parse) that language, but it is unrelated to the "power" of that language.

    E.g., if you use a Type 2 grammar (CFG) to define a language, the Chomsky hierarchy tells you that you'll need a pushdown automaton to recognize it, but the language might be a Turing-complete programming language, or it might be a language for regular expressions, or it might be a language with no computational "power" at all.

    For a more extreme example, you can imagine using a Type 3 grammar (regular expression) to define a language for 'programming' a Turing machine.

    The power of a language (in particular, whether it's Turing-complete) depends on its semantics, not its syntax.