I have a simple requirement. We use the Hibernate validation engine to decide whether a constraint is true or false.
A text should validate as true if all of its words start with an uppercase character. There are some difficulties:
Words could also start like this:
8-Test or even 8Test or even (Test) or even -Test or anything comparable.
Also, they are usually comma separated (or use a different separator):
Test, Test, Test
Remember, I only want to make sure that the words in the String start with an uppercase letter. Looking at my attempts, I am probably overcomplicating things.
Here are some samples.
Expected to match all (true):
- Hydroxyisohexyl 3-Cyclohexene Carboxaldehyde, Benzyl
- Test, Test, Test
- CI 15510, Methylchloroisothiazolinone, Disodium EDTA
- N/A
- NA
Expected to not match all (false):
- hydroxyisohexyl 3-Cyclohexene Carboxaldehyde, Benzyl
- Test, test, Test
- CI 15510, Methylchloroisothiazolinone, Disodium eDTA
- na
- n/a
My attempts went in these directions:
final String oldregex = "([\\W]*\\b[A-Z\\d]\\w+\\b[\\W]*)+";
final String regex = "([A-Z][\\d\\w]+( [A-Z][-\\d\\w]+)*, )*[A-Z][-\\d\\w]+( [A-Z][-\\d\\w]+)*\\.";
Actually, with the "oldregex" option I ran into a seemingly endless computation for some texts.
Use this to test regex: http://gskinner.com/RegExr/ (without double backslashes of course)
Thanks for helping!!!
See it in action:
^(?:[^A-Za-z]*[A-Z][^\s,]*)*[^A-Za-z]*$
^ # start of the string
(?: # this group matches a "word", don't capture the group
[^A-Za-z]* # skip any non-alphabet characters at start of the word
[A-Z] # force an uppercase letter as a first letter
[^\s,]* # match anything but word separators (\s and ,) after the first letter
)* # the whole line consists of such "words"
[^A-Za-z]* # skip any non-alphabet characters at the end of the string
$ # end of the string
Note: You can modify the regex if your word separator characters are something other than whitespace and comma. (For example, change [^\s,] to [^,:-] or whatever you use.)
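To check the pattern against the samples from the question, here is a minimal Java sketch. The class name CapitalizedWords and the method allWordsCapitalized are made up for illustration; only java.util.regex is used, and the backslash is doubled because the pattern sits in a Java string literal.

```java
import java.util.regex.Pattern;

public class CapitalizedWords {
    // Same pattern as above; "\s" becomes "\\s" inside a Java string literal.
    private static final Pattern ALL_CAPITALIZED =
            Pattern.compile("^(?:[^A-Za-z]*[A-Z][^\\s,]*)*[^A-Za-z]*$");

    // Hypothetical helper: true if every word starts with an uppercase letter
    // (leading digits and punctuation such as "8-" or "(" are skipped).
    static boolean allWordsCapitalized(String text) {
        return ALL_CAPITALIZED.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Hydroxyisohexyl 3-Cyclohexene Carboxaldehyde, Benzyl",
            "Test, Test, Test",
            "CI 15510, Methylchloroisothiazolinone, Disodium EDTA",
            "N/A",
            "hydroxyisohexyl 3-Cyclohexene Carboxaldehyde, Benzyl",
            "Test, test, Test",
            "CI 15510, Methylchloroisothiazolinone, Disodium eDTA",
            "n/a"
        };
        for (String sample : samples) {
            System.out.println(allWordsCapitalized(sample) + "  " + sample);
        }
    }
}
```

Note that an empty string also matches, since the pattern allows zero words. Because matches() already requires the whole input to match, the ^ and $ anchors are redundant here but harmless. The same pattern string should also work as the regexp value of a Bean Validation @Pattern constraint, which as far as I know applies the same full-match semantics, though I have not tested it in that setup.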