context-free-grammarlexical-analysis

Context free grammar for languages with more number of as than bs


The question is to develop a context free grammar for language containing all strings having more number of As than Bs.

I can't think of a logical solution . Is there a way to approach such problems , what can help me approach such problems better ? Can someone suggest a logical way to analyse such grammar problems ?


Solution

  • The following grammar generates all strings over {a,b} that have more a's than b's. I denote by eps the empty string.

    S -> Aa | RS | SRA
    A -> Aa | eps
    R -> RR | aRb | bRa | eps
    

    It's obvious it always generates more a's than b's. It's less obvious it generates all possible strings over {a,b} that have more a's than b's

    The production R -> RR | aRb | bRa | eps generates all balanced strings (this is easy to see), and the production A -> Aa generates the language a* (i.e. strings with zero or more a's).

    Here's the logic behind the grammar. Notice that if w=c1,c2,c3,...,cn is a string over {a,b} with more a's than b's then we can always decompose it into a concatenation of balanced strings (i.e. equal number of a's and b's, which includes the empty string) and strings of the form a+.

    For example, ababaaaba = abab (can be generated by R),aaa (can be generated by A),ba (can be generated by R).

    Now notice that the production S -> Aa | RS | SRA generates precisely strings of this form.

    It suffices to verify that S covers the following cases (because every other case can be covered by breaking into such subcases, as you should verify):