parsingcontext-free-grammarcontext-sensitive-grammar

how to parse Context-sensitive grammar?


CSG is similar to CFG but the reduce symbol is multiple.

So, can I just use CFG parser to parse CSG with reducing production to multiple terminals or non-terminals?

Like

1. S → a bc
2. S → a S B c
3. c B → W B
4. W B → W X
5. W X → B X
6. B X → B c
7. b B → b b

When we meet W X, can we just reduce W X to W B?

When we meet W B, can we just reduce W B to c B?

So if CSG parser is based on CFG parser, it's not hard to write, is it true?

But when I checked wiki, it said to parse CSG, we should use linear bounded automaton.

What is linear bounded automaton?


Solution

  • Context sensitive grammars are non-deterministic. So you can not assume that a reduction will take place, just because the RHS happens to be visible at some point in a derivation.

    LBAs (linear-bounded automata) are also non-deterministic, so they are not really a practical algorithm. (You can simulate one with backtracking, but there is no convenient bound on the amount of time it might take to perform a parse.) The fact that they are acceptors for CSGs is interesting for parsing theory but not really for parsing practice.

    Just as with CFGs, there are different classes of CSGs. Some restricted subclasses of CSGs are easier to parse (CFGs are one subclass, for example), but I don't believe there has been much investigation into practical uses; in practice, CSGs are hard to write, and there is no obvious analog of a parse tree which can be constructed from a derivation.

    For more reading, you could start with the wikipedia entry on LBAs and continue by following its references. Good luck.