regexcomputer-scienceautomata

What is the difference between (a+b)* and (a*b*)*?


Assuming that Σ = {a, b}, I want to find out the regular expression (RE) Σ* (that being the set of all possible strings over the alphabet Σ).

I came up with below two possibilities:

(a+b)*
(a*b*)*

However, I can't decide by myself which RE is correct, or if both are bad. So, please tell me the correct answer.


Solution

  • The + operator is typically used to indicate union (|, "or") in academic regular expressions, not "one or more" as it typically means in non-academic settings (such as most regex implementations).

    So, a+b means [ab] or a|b, thus (a+b)* means any string of length 0 or more, containing any number of as and bs in any order.

    Likewise, (a*b*)* also means any string of length 0 or more, containing any number of as and bs in any order.

    The two expressions are different ways of expressing the same language.