I need to design a regex that matches 4 parts separated by a delimiter (: in this case). The 4 parts must be in order, and the delimiter must be present between each part, but the parts are all optional.
These should all match:
A
B
C
D
A:B
A:C
A:D
B:C
B:D
C:D
A:B:C
A:B:D
A:C:D
B:C:D
A:B:C:D
Meanwhile, these should not match.
A: // cannot end with delimiter
:C // cannot start with delimiter
ABC // cannot omit delimiter
A:BC // cannot omit delimiter
A::B // cannot duplicate delimiter
... // many other possible non-matches
My problem is that I don't know how to match the delimiter. What I have considered so far:
^(A)?:?(B)?$, but then it will match AB.
^(A:)?(B)?$, but then it will match A:.
^(A)|(B)|(C)|(D)|(A:B)|(A:C)...........$ (listing every combination). In the actual regex, A and B, etc. are more complex.
^((A)|((A:)?((B)|((B:)?((C)|((C:)?(D)))))))$, but it looks like a total mess to me still. I'll go with this option if I can't find a better answer though.

You can make the colon optional but use word boundaries to require it between letters:
^\b(?::?A)?(?:\b:?B)?(?:\b:?C)?(?:\b:?D)?$
See this demo at regex101. To allow empty matches, move the first \b inside the leftmost group.
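A quick way to check the word-boundary pattern against the lists from the question is a small Python script. This is only a sketch: the single letters stand in for the real sub-patterns (the \b trick assumes each part starts and ends with a word character), and the names STRICT, ALLOW_EMPTY and accepts are made up for this illustration.

```python
import re

# Word-boundary pattern; single letters stand in for the real sub-patterns.
STRICT = re.compile(r"^\b(?::?A)?(?:\b:?B)?(?:\b:?C)?(?:\b:?D)?$")
# Variant with the first \b moved inside the leftmost group, so "" matches too.
ALLOW_EMPTY = re.compile(r"^(?:\b:?A)?(?:\b:?B)?(?:\b:?C)?(?:\b:?D)?$")

SHOULD_MATCH = [
    "A", "B", "C", "D",
    "A:B", "A:C", "A:D", "B:C", "B:D", "C:D",
    "A:B:C", "A:B:D", "A:C:D", "B:C:D",
    "A:B:C:D",
]
# "B:A" is an extra case (wrong order) that is not listed in the question.
SHOULD_FAIL = ["A:", ":C", "ABC", "A:BC", "A::B", "B:A"]


def accepts(pattern, text):
    """Return True if the anchored pattern matches the given string."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for name, pattern in [("STRICT", STRICT), ("ALLOW_EMPTY", ALLOW_EMPTY)]:
        missed = [t for t in SHOULD_MATCH if not accepts(pattern, t)]
        leaked = [t for t in SHOULD_FAIL if accepts(pattern, t)]
        print(name, "missed:", missed, "leaked:", leaked,
              "matches empty string:", accepts(pattern, ""))
```

The two patterns should differ only in how they treat the empty string.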
Or make every part and every colon optional, and use a lookahead at the start to validate the overall format:
^(?=\w(?::\w)*$)A?:?B?:?C?:?D?$
Another demo at regex101. For empty matches, change the lookahead to (?=(?:\b:?\w)*$).
Here \w is used, which is shorthand for a word character and usually matches [A-Za-z0-9_].
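To convince yourself that the lookahead version accepts exactly the intended strings, you can brute-force every string over the alphabet "ABCD:" up to the length of A:B:C:D. The following is a sketch under the assumption that each part is a single letter; LOOKAHEAD, LOOKAHEAD_OR_EMPTY, EXPECTED and accepted are names invented for this example.

```python
import re
from itertools import combinations, product

# Lookahead pattern, plus the variant that also accepts "".
LOOKAHEAD = re.compile(r"^(?=\w(?::\w)*$)A?:?B?:?C?:?D?$")
LOOKAHEAD_OR_EMPTY = re.compile(r"^(?=(?:\b:?\w)*$)A?:?B?:?C?:?D?$")

# The 15 valid strings: every non-empty, in-order selection of A..D joined by ":".
EXPECTED = {":".join(c) for r in range(1, 5) for c in combinations("ABCD", r)}


def accepted(pattern, max_len=7):
    """Collect every string over 'ABCD:' of length <= max_len that the pattern matches."""
    found = set()
    for n in range(max_len + 1):
        for chars in product("ABCD:", repeat=n):
            text = "".join(chars)
            if pattern.search(text):
                found.add(text)
    return found


if __name__ == "__main__":
    print(accepted(LOOKAHEAD) == EXPECTED)
    print(accepted(LOOKAHEAD_OR_EMPTY) == EXPECTED | {""})
```

With multi-character parts the \w in the lookahead would have to be widened (for example to \w+), which is not covered by this check.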
A more general way to accept empty matches is to simply attach |^$ to the end of the pattern (regex101 demo), or to wrap everything between the anchors in an optional group, ^(?: and )?$ (regex101 demo).
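Both ways of allowing the empty string can be tried out in Python as follows. This is a sketch that reuses the word-boundary pattern as the core; CORE, WITH_ALTERNATIVE and WITH_OPTIONAL_GROUP are illustrative names, and the single letters are again placeholders for the real parts.

```python
import re

# Core pattern without anchors; single letters stand in for the real parts.
CORE = r"\b(?::?A)?(?:\b:?B)?(?:\b:?C)?(?:\b:?D)?"

# Option 1: append an alternative that matches only the empty string.
WITH_ALTERNATIVE = re.compile(rf"^{CORE}$|^$")
# Option 2: wrap everything between the anchors in an optional group.
WITH_OPTIONAL_GROUP = re.compile(rf"^(?:{CORE})?$")

if __name__ == "__main__":
    for text in ["", "A", "B:D", "A:B:C:D", "A:", ":C", "ABC", ":"]:
        print(repr(text),
              bool(WITH_ALTERNATIVE.search(text)),
              bool(WITH_OPTIONAL_GROUP.search(text)))
```

The optional-group form keeps a single pair of anchors, which can be easier to read once the parts become long.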