Can someone explain this to me? I've been reading about it and it still is hard to follow.
text : ababdbaababa
pattern: ababa
table for ababa is -1 0 0 1 2.
I think I understand how the table is constructed but, I dont understand how to shift once mismatch has occurred. Seems like we dont even use the table when shifting?
when do we use the table?
The table is used when your mismatch occurs. Let's apply the pattern to your text:
You start matching text with pattern and test if your pattern could be in text, starting at the first position. You compare text[1]
with pattern[1]
and that turns out to be a match. You do the same for text[2]
, text[3]
and text[4]
.
When you want to match text[5]
with pattern[5]
you don't have a match (d
<>a
). You then know that your pattern will not start at the first position. You could then start the matching all over again for position 2 but that is not efficient. You can use the table now.
The error occured at pattern[5]
so you go to table[5]
which is 2. That tells you that you can start matching at the current position again with 2 already matched characters. Instead of having to start matching position 2, you can start at your previous position (1) + table[5]
(2)=3. Indeed, If we look at text[3]
and text[4]
, we see that it is equal to pattern[1]
and pattern[2]
, respectivily.
The numbers in table tell you how many positions are already matched when an error occurs. In this case 2 characters of the next pattern were already matched. You can then immediately start matching for position 3 and skip position 2 (as the pattern can not be found starting at position[2]).