theorycontext-sensitive-grammarchomsky-hierarchy

Chomsky hierarchy - Type-1 context-sensitive languages


I try to unterstand the Chomsky hierarchy with the different levels.

I checked some examples and here is one I don't really understand. Maybe someone knows why this one is not a context-sensitive language:

S → aA 
A → aA | ε
B → bA

Solution

  • The grammar G that you provide is, beyond any doubt, context free (each rule has one non-terminal in the LHS and a string of terminals and non-terminals in the RHS). Therefore, the language L that it generates is context free. Therefore, as the category of context-free languages is a proper subset of that of context-sensitive languages, language L is context sensitive. (I don't know where you read that language L is not context sensitive. Either you misread this or what you read is simply wrong.)

    I will make a guess about the reason of this confusion. Let us suppose that you had claimed that grammar G (not language L) is not context sensitive. Now, perhaps curiously enough, if a language is context sensitive this does not mean that all grammars producing this language are context sensitive (and the same is true for the other categories as well, except of course the unrestricted one). If a language is context sensitive, this means that there is a context-sensitive grammar producing it. So, even if L is context sensitive, this does not necessarily mean that G is also context sensitive.

    It would however be weird if G, a context-free grammar, were not context sensitive. Whether this is or is not true depends on your exact definition of a context-sensitive grammar. As you can read, e.g., in the related Wikipedia article, there are at least two alternative definitions for context-sensitive grammars:

    1. A grammar where all rules are of the form αAβ → αγβ, where A is a non-terminal symbol and α, β, γ are strings of terminals and non-terminals.
    2. A grammar where all rules are of the form α → β, where α and β are strings of terminals and non-terminals, but the length of α is not larger than the length of β. As an exception, there can be a rule S → ε for the start symbol S (which would otherwise be forbidden).

    According to definition 1, grammar G is context sensitive (the context strings α and β being trivially empty). According to definition 2, however, grammar G is not, because of the empty production rule for non-terminal A, which is not the start symbol.

    It is however possible to provide an equivalent grammar G', which is context sensitive according to both definitions and produces the same language L. One such grammar can be constructed as follows:

    A produces strings of zero or more "a". We can replace its definition by:

    A → A' | ε
    A' → a | aA'
    

    where A' produces strings of one or more "a". Notice that the rules for A are not recursive. We can substitute the productions for A in the rules for S and B and then eliminate the non-terminal A. This gives us:

    S → aA' | a
    A' → a | aA'
    B → bA' | b
    

    This grammar (which can be further simplified by noticing that A' and S produce the same language) is context sensitive according to both definitions above.