I want to solve UVA 10298 -"Power Strings" problem using KMP algorithm. In this blog a technique is shown how failure function can be used to calculate minimum length repeated substring. The technique is as follows:
pi[ ]
for the given string.len
be the string length, and last_in_pi
be the value stored at the last index of pi
table.len % (len - last_in_pi) == 0
is true or not. If it is true then the length of the minimum length repeated substring is (len - last_in_pi)
, otherwise it is the length of the given string.I understand what is failure function and how it is used to find pattern in a text but I am struggling to understand proof of correctness of this technique.
Remember that Pi[i]
is defined as the (length of the) longest prefix of your_string
that is a proper suffix (so not the whole string) of the substring your_string[0 ... i]
.
There is an example on the blog post you linked to:
0 1 2 3 4 5
S : a b a b a b
Pi: 0 0 1 2 3 4
Where we have:
a b a
a b a b
Etc. I hope this makes it clear what Pi
(the prefix function / table) does.
Now, the blog says:
The last value of prefix table = 4.. Now If it is a repeated string than , It’s minimal length would be 2. (6(string length) – 4) , Now
So you have to check if len % (len - last_in_pi) == 0
. If yes, then len - last_in_pi
is the length of the shortest repeated string (the period string).
This works because, if you rotate a string with len(period)
positions either way, it will match itself. len - last_in_pi
tells you how much you'd need to rotate.