parsingcontext-free-grammarcontext-sensitive-grammarchomsky-hierarchy

Converting a language specification into production rules (not sure if it's a CFG or CSG)


I have to write a function that checks whether input strings are valid for a given language specification. I thought that this would be a standard CFG -> Chomsky Normal Form, then CYK parsing, but one of the rules in the language is preventing this from happening.

Some of the rules are straightforward, if we define terminals {a,b,c,d,e,f,P,Q,R,S}, then valid strings are

1) Any of the lowercase terminals in isolation
2) If 'x' is a valid string, then so is Sx

But the third rule is

3) If X and Y are valid input strings, then so are PXY, QXY, RXY

where {P,Q,R} are the remaining uppercase terminals and X and Y are nonterminals.

What would the production rule for this look like? I thought it would be something like

XY -> PXY (and QXY, RXY)

but there are two problems with this. The first is that this is not a CFG rule -- does that mean this language defines a CSG instead?

And the second is that this doesn't work, because

XY -> PXY -> PPXY

would not be a valid message in all cases.


Solution

  • I think that this grammar is context-free, unless I'm misinterpreting what you're saying.

    First, let's let A be the nonterminal that expands out to some valid string made just using the first two rules, we get

    A -> a | b | c | d | e | f
    

    Now, your second rule says that if you can build the string ω then you can build Sω. We could encode that as

    A -> SA
    

    Finally, you've said that if you have two strings X and Y, then you can combine them together as

    PXY
    QXY
    RXY
    

    One way to think about this would be to generate the string P, followed by any two valid strings (same for Q or R). Thus you could add the rules

    A -> PAA | QAA | RAA
    

    This gives the final grammar

    A -> a | b | c | d | e | f
    A -> SA
    A -> PAA | QAA | RAA
    

    Hope this helps!