I'm trying to implement Levenshtein distance in Prolog.
The implementation is pretty straightforward:
levenshtein(W1, W2, D) :-
atom_length(W1, L1),
atom_length(W2, L2),
lev(W1, W2, L1, L2, D),
!.
lev(_, _, L1, 0, D) :- D is L1, !.
lev(_, _, 0, L2, D) :- D is L2, !.
lev(W1, W2, L1, L2, D) :-
lev(W1, W2, L1 - 1, L2, D1),
lev(W1, W2, L1, L2 - 1, D2),
lev(W1, W2, L1 - 1, L2 - 1, D3),
charAt(W1, L1, C1),
charAt(W2, L2, C2),
( C1 = C2 -> T is 0; T is 1 ),
min(D1, D2, D3 + T, D).
% Returns the character at position N in the atom A
% The position is 1-based
% A: The atom
% N: The position at which to extract the character
% C: The character of A at position N
charAt(A, N, C) :- P is N - 1, sub_atom(A, P, 1, _, C).
% min(...): These rules compute the minimum of the given integer values
% I1, I2, I3: Integer values
% M: The minimum over the values
min(I1, I2, M) :- integer(I1), integer(I2), ( I1 =< I2 -> M is I1; M is I2).
min(I1, I2, I3, M) :- min(I1, I2, A), min(I2, I3, B), min(A, B, M).
However, this code failures with this error:
?- levenshtein("poka", "po", X).
ERROR: Out of local stack
I'm using SWIPL
implementation on Mac OS X Sierra
.
There is a good reason for which your program does not work: your recursive calls lead into an infinite loop.
This is caused by those lines:
lev(W1, W2, L1 - 1, L2, D1),
lev(W1, W2, L1, L2 - 1, D2),
lev(W1, W2, L1 - 1, L2 - 1, D3),
min(D1, D2, D3 + T, D)
In Prolog things like L1 - 1
are expressions that do not get evaluated to numbers. Therefore your code will recursively call lev
with the third argument as L1 -1
, then L1 - 1 - 1
, etc. which does not match your terminating rules.
To fix this you need to use temporary variables where you evaluate the result of e.g. L1 - 1
.
This fixes it:
lev(W1, W2, L1, L2, D) :- L11 is L1 - 1, L22 is L2 - 1, lev(W1, W2, L11, L2, D1), lev(W1, W2, L1, L22, D2), lev(W1, W2, L11, L22, D3), charAt(W1, L1, C1), charAt(W2, L2, C2), ( C1 = C2 -> T is 0; T is 1 ), D4 is D3 + T, min(D1, D2, D4, D).
Now this does this:
?- levenshtein("poka","po",X).
X = 0.
Which is probably not the result you want, but at least it does not error. I will leave it to you to fix your predicate.