regex formal-languages language-theory

How to write a concise regular expression for all strings containing "a"s, "b"s, and "c"s but no more than 2 "b"s and 3 "c"s


I very recently began learning regular expressions and was trying to write one for the question above. It would not be difficult if the limits were only placed on one letter (e.g. no more than 2 "b"s).

Then the answer would be: (a|c)*(b|ε)(a|c)*(b|ε)(a|c)*

But with 2 "b"s and 3 "c"s, the number of possible orderings of the "b"s and "c"s between the "a"s is already 10 (5 choose 3) when all five are used, so writing a regular expression that covers all those possibilities would be very hefty, since we can choose any number of "b"s and "c"s as long as there are at most 2 and 3 respectively (e.g. bcbcc, cbbcc, bcbc, bcc, b, c, ...).
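To get a feel for how many cases a written-out expression would have to cover, here is a small enumeration sketch. It ignores the "a"s and only counts distinct orderings of the "b"s and "c"s; the helper name `arrangements` is made up for illustration.

```python
from itertools import permutations

def arrangements(num_b, num_c):
    """Distinct orderings of num_b b's and num_c c's (a's are ignored)."""
    return {"".join(p) for p in permutations("b" * num_b + "c" * num_c)}

# Orderings that use exactly two b's and three c's.
exact = arrangements(2, 3)

# Orderings that use at most two b's and at most three c's
# (this includes the empty ordering, i.e. no b's or c's at all).
up_to = set().union(*(arrangements(i, j) for i in range(3) for j in range(4)))

print(len(exact))   # 10
print(len(up_to))   # 34
```

Each of those orderings would need "a"s allowed between every pair of letters, which is why writing the alternatives out by hand gets long quickly.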

So is it possible to write a concise regular expression for the question, or can the writing out of the possibilities at least be simplified?


Solution

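  • I think in this case you want to negate what you're looking for, because finding more than two b's or more than three c's is easy. You can do this with a negative lookahead that rejects a third b or a fourth c, for example ^(?!.*b.*b.*b|.*c.*c.*c.*c)[abc]*$, which says: no more than 2 b's and no more than 3 c's. Note that lookahead is a feature of practical regex engines, not of regular expressions in the formal-language sense.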
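The negation idea can be checked by brute force. Read literally, "no more than 2 b's and 3 c's" means rejecting a third b or a fourth c, so this sketch assumes the pattern `(?!.*b.*b.*b|.*c.*c.*c.*c)[abc]*` and uses Python's `re.fullmatch` in place of explicit `^`/`$` anchors. The helper names `accepts` and `expected` are made up for illustration.

```python
import re
from itertools import product

# Assumed pattern: the lookahead fails if a third b or a fourth c exists
# anywhere in the string; [abc]* then restricts the alphabet.
PATTERN = re.compile(r"(?!.*b.*b.*b|.*c.*c.*c.*c)[abc]*")

def accepts(s):
    """True if the whole string matches the assumed pattern."""
    return PATTERN.fullmatch(s) is not None

def expected(s):
    """Reference check done by plain counting instead of a regex."""
    return set(s) <= set("abc") and s.count("b") <= 2 and s.count("c") <= 3

# Compare both checks on every string over {a, b, c} up to length 7.
mismatches = [
    "".join(t)
    for n in range(8)
    for t in product("abc", repeat=n)
    if accepts("".join(t)) != expected("".join(t))
]
print(mismatches)  # []
```

An empty list means the pattern and the counting check agree on every string tested, including edge cases such as "bcbcc" (accepted) and "bbb" or "cccc" (rejected).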