javaalgorithmstring-matchingaho-corasickdawg

Using Aho-Corasick on a DAWG rather than a Trie


does anybody know if it's possible to modify the Aho-Corasick string matching algorithm to be used on a DAWG (Directed Acyclic Word Graph) rather than a Trie?


Solution

  • The trie in the Aho-Corasick algorithm is not a simple trie of the words but contains additional transitions for the failure function (where do you continue after a mismatch). There is a algorithm called multiBDM that uses both a trie and a DAWG. You can find details and other automata based approaches in chapter 7 of the book: M. Crochemore and W. Rytter, Text Algorithms, Oxford University Press, New York, 1994. You can find more info about it here.