regular-language

Regular expression for the following condition


Define a regular expression for the alphabet {a,b} where subsequences of a's that aren´t empty are always even and subsequences of b's are always odd.

I have something like this but I think it works such that the number of a's and b's in the expression are even and odd respectively without taking into account that each subsequence of a's has to be even numbered and b's has to be odd numbered:

(a|b(aa|bb)(ab|ba)) (aa|bb|(ab|ba)(aa|bb)(ab|ba))*


Solution

  • I'll take the long way to an answer and see if I get what you did...

    We can make a DFA for this language as follows:

           /---a---\       /---\
           |       |       |   | a,b
           V       |       V   |
    ----->q0--a-->q1--b-->q2---/
           |       ^       ^
           b       |       |
           |       a       |
           V       |       |
       /->q3-------/       |
       |   |               |
       |   b               |
       b   |               |
       |   V               |
       \--q4------a--------/
           
    

    In this DFA, states q0 and q3 are accepting, since we have just seen an odd string of b's or an even string of a's, and we have not messed up yet and landed in q2, nor are we still needing an extra a/b to get an even/odd streak.

    With a DFA in hand, we can write some equations for strings that lead to each state:

    (q0) = e + (q1)a
    (q1) = (q0)a + (q3)a
    (q2) = (q1)b + (q4)a + (q2)(a+b)
    (q3) = (q0)b + (q4)b
    (q4) = (q3)b
    

    We want to solve for (q0) and (q3). We can pretty much ignore (q2) since we don't care about strings not accepted and it's not helpful for getting anywhere else.

    (q0) = e + (q1)a
    (q1) = (q0)a + (q3)a
    (q3) = (q0)b + (q4)b
    (q4) = (q3)b
    
    (q0) = e + (q1)a
    (q1) = (q0)a + (q3)a
    (q3) = (q0)b + (q3)bb
         = (q0)b(bb)*
    
    (q0) = e + (q1)a
    (q1) = (q0)a + (q0)b(bb)*a
         = (q0)[a + b(bb)*a]
    
    (q0) = e + (q0)[a + b(bb)*a]a
         = e + (q0)[aa + b(bb)*aa]
         = [aa + b(bb)*aa]*
    

    So, the regular expression [aa + b(bb)aa] gives strings that lead to q0, and [aa + b(bb)*aa]b(bb) gives strings that lead to q3. The union of these is:

      [aa + b(bb)*aa]* + [aa + b(bb)*aa]*b(bb)*
    = [aa + b(bb)*aa]*[e + b(bb)*]
    

    This isn't very close to your expression, but that doesn't mean yours is wrong... but I think yours is wrong since if I'm reading it right, we have this derivation:

    (a|b(aa|bb)(ab|ba)) (aa|bb|(ab|ba)(aa|bb)(ab|ba))*     // initial
    (a|b(aa|bb)(ab|ba))                                    // take * = 0
    a                                                      // choose a
    

    And a is not a string you want.