I used the Python package ahocorasick (https://hkn.eecs.berkeley.edu/~dyoo/python/ahocorasick/) to match state names in text:
import ahocorasick

states = {
    'AK': 'Alaska',
    'AL': 'Alabama',
    'AR': 'Arkansas',
    'AS': 'American Samoa',
    'AZ': 'Arizona',
    'CA': 'California',
    'CO': 'Colorado',
    'CT': 'Connecticut'
}

def LoadKeywords(keywords):
    # keywords should be a list
    tree = ahocorasick.KeywordTree()
    for k in keywords:
        tree.add(k)
    tree.make()
    return tree

keywordLong = states.values()
keywordLongTree = LoadKeywords(keywordLong)
Then I run a search:
keywordLongTree.search("Alabama")
It returns:
(0, 7)
Which is correct, but when I run:
keywordLongTree.search("I don't know why this happen")
it should return None, but instead it returns:
(145331, 145335)
Has anyone run into this before? Why does it happen?
I encountered exactly the same issue. It looks like a defect in the module; after all, it hasn't been modified since 2005. I used https://code.google.com/p/esmre/ instead, and it worked fine. Give it a try!
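If installing another C extension isn't an option, a pure-Python fallback may be enough for a small keyword set like this one. The sketch below is my own illustration, not part of either library: it uses the standard re module rather than Aho-Corasick, and the names build_matcher and search are made up for this example. It only aims to reproduce the behaviour the question expects, a (start, end) span on a hit and None on a miss.

```python
import re

def build_matcher(keywords):
    # Hypothetical helper: compile all keywords into one alternation.
    # Longest keywords go first so a longer keyword wins over a shorter
    # one that is a prefix of it.
    ordered = sorted(set(keywords), key=len, reverse=True)
    return re.compile("|".join(re.escape(k) for k in ordered))

def search(matcher, text):
    # Return the (start, end) span of the first match, or None if nothing matches.
    m = matcher.search(text)
    return m.span() if m else None

states = {
    'AK': 'Alaska',
    'AL': 'Alabama',
    'AR': 'Arkansas',
    'AS': 'American Samoa',
    'AZ': 'Arizona',
    'CA': 'California',
    'CO': 'Colorado',
    'CT': 'Connecticut',
}

matcher = build_matcher(states.values())
print(search(matcher, "Alabama"))                       # (0, 7)
print(search(matcher, "I don't know why this happen"))  # None
```

For thousands of keywords a real Aho-Corasick implementation will likely scale better than one large regex, so treat this as a stopgap rather than a replacement.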