I want to extract dates and other temporal entities from a set of strings. Can this be done in Java without parsing each string against fixed date patterns? Most parsers handle only a limited set of input patterns, but my input is entered manually and is therefore ambiguous.
Inputs can be like:
12th Sep | mid-March | 12.September.2013
Sep 12th | 12th September | 2013
Sept 13 | 12th, September | 12th,Feb,2013
I've gone through many answers on finding dates in Java, but most of them don't deal with such a wide range of input patterns.
I've tried using the SimpleDateFormat class and calling parse() to check whether parsing fails, which would mean the string is not a date. I've tried using regex, but I'm not sure it fits this scenario. I've also used ClearNLP to annotate the dates, but it doesn't give a reliable annotation set.
The closest approach to getting these values could be a chain of responsibility, as mentioned below. Is there a library with a set of date patterns that I could use?
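A minimal sketch of the chain-of-responsibility idea, assuming java.time (Java 8+) instead of SimpleDateFormat. The class name `LenientDateParser`, the pattern list, and the `normalize` step are illustrative assumptions, not an existing library. Each formatter is one link in the chain: the input is handed from link to link until one of them can parse it.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.time.MonthDay;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.TemporalAccessor;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

class LenientDateParser {
    // Hypothetical pattern list; extend it as new input shapes show up.
    private static final String[] PATTERNS = {
        "d MMMM uuuu", "d MMM uuuu", "d MMMM", "d MMM", "MMMM d", "MMM d"
    };

    private static final List<DateTimeFormatter> CHAIN = new ArrayList<>();
    static {
        for (String pattern : PATTERNS) {
            CHAIN.add(new DateTimeFormatterBuilder()
                    .parseCaseInsensitive()
                    .appendPattern(pattern)
                    .toFormatter(Locale.ENGLISH));
        }
    }

    // Drop ordinal suffixes ("12th" -> "12") and turn dots, commas
    // and runs of whitespace into single spaces.
    static String normalize(String raw) {
        return raw.trim()
                .replaceAll("(?i)(?<=\\d)(st|nd|rd|th)\\b", "")
                .replaceAll("[.,\\s]+", " ")
                .trim();
    }

    // Returns a LocalDate when a year is present, a MonthDay otherwise,
    // or Optional.empty() when no link in the chain accepts the input.
    static Optional<TemporalAccessor> parse(String raw) {
        String text = normalize(raw);
        for (DateTimeFormatter link : CHAIN) {
            try {
                return Optional.of(link.parseBest(text, LocalDate::from, MonthDay::from));
            } catch (DateTimeException e) {
                // This link cannot handle the input; pass it to the next one.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String[] samples = {"12th Sep", "12.September.2013", "12th,Feb,2013", "mid-March"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + parse(sample));
        }
    }
}
```

Inputs such as "mid-March" fall through every link, which is exactly the limitation described above: a pattern chain only covers the shapes it has been told about.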
Yes! I've finally managed to extract all sorts of dates and temporal values, from as generic as:
mid-March | Last Month | 9/11
to as specific as:
11/11/11 11:11:11
This finally worked thanks to the awesome libraries from GATE and its JAPE grammars.
I created a more lenient annotation rule in JAPE, called 'DateEnhanced', to include certain kinds of dates such as "9/11" or "11TH, February- 2001". On the right-hand side (RHS) of the 'DateEnhanced' JAPE rule I chained Java regexes to filter out some unwanted outputs.
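The filtering step can be sketched roughly as follows. This is plain Java only, not my actual JAPE rule: the class name `DateCandidateFilter` and the three rejection patterns are hypothetical examples of "unwanted outputs". In a real grammar the candidate text would come from the annotation matched on the left-hand side, and that GATE wiring is left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DateCandidateFilter {
    // Hypothetical rejection rules; each one describes a kind of false positive
    // that a lenient date rule might pick up.
    private static final List<Pattern> REJECTERS = Arrays.asList(
        // Dotted number runs such as 192.168.0.1
        Pattern.compile("\\d+(\\.\\d+){3,}"),
        // Two-part slash forms where neither side can be a month, e.g. 24/25
        Pattern.compile("(1[3-9]|[2-9]\\d)/(1[3-9]|[2-9]\\d)"),
        // Four or more slash-separated numbers
        Pattern.compile("\\d+(/\\d+){3,}")
    );

    // A candidate is unwanted as soon as any regex in the chain matches it fully.
    static boolean isUnwanted(String candidate) {
        String text = candidate.trim();
        for (Pattern rejecter : REJECTERS) {
            if (rejecter.matcher(text).matches()) {
                return true;
            }
        }
        return false;
    }

    // Keeps only the candidates that survive every rejecter.
    static List<String> filter(List<String> candidates) {
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            if (!isUnwanted(candidate)) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(filter(Arrays.asList(
            "9/11", "24/25", "11TH, February- 2001", "192.168.0.1", "11/11/11 11:11:11")));
    }
}
```

Keeping the rejecters in a list makes it easy to add another regex whenever a new kind of false positive turns up in the annotations.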