This is the code to calculate the failure function (how many steps we have to go back) in Scheme, when we use the Knuth-Morris-Pratt algorithm:
(define (compute-failure-function p)
(define n-p (string-length p))
(define sigma-table (make-vector n-p 0))
(let loop
((i-p 2)
(k 0))
(cond
((>= i-p n-p)
(vector-set! sigma-table (- n-p 1) k))
((eq? (string-ref p k)
(string-ref p (- i-p 1)))
(vector-set! sigma-table i-p (+ k 1))
(loop (+ i-p 1) (+ k 1)))
((> k 0)
(loop i-p (vector-ref sigma-table k)))
(else ; k=0
(vector-set! sigma-table i-p 0)
(loop (+ i-p 1) k))))
(vector-set! sigma-table 0 -1)
(lambda (q)
(vector-ref sigma-table q)))
But I do not understand the part when k > 0
. Can someone explain it please?
I see you're confused with the syntax of a named let
. This post does a good job explaining how it works, but perhaps an example with more familiar syntax will make things clearer. Take this code in Python, it adds all integers from 1 to 10:
sum = 0
n = 1
while n <= 10:
sum += n
n += 1
print(sum)
=> 55
Now let's try to write it in a recursive fashion, I'll call my function loop
. This is completely equivalent:
def loop(n, sum):
if n > 10:
return sum
else:
return loop(n + 1, n + sum)
loop(1, 0)
=> 55
In the above example, the loop
function implements an iteration, the parameter n
is used to keep track of the current position, and the parameter sum
accumulates the answer. Now let's write the exact same code, but in Scheme:
(let loop ((n 1) (sum 0))
(cond ((> n 10) sum)
(else (loop (+ n 1) (+ n sum)))))
=> 55
Now we've defined a local procedure called loop
which is then automatically called with the initial values 1
and 0
for its parameters n
and sum
. When the base case of the recursion is reached, we return sum
, otherwise we keep calling this procedure, passing updated values for the parameters. It's exactly the same as in the Python code! Don't be confused by the syntax.
In your algorithm, i-p
and k
are the iteration variables, which are initialized to 2
and 0
respectively. Depending on which condition is true, the iteration continues when we call loop
again with updated values for i-p
and k
, or it ends when the case (>= i-p n-p)
is reached, at this point the loop exits and the computed value is in the variable sigma-table
. The procedure ends by returning a new function, referred to as the "failure function".