algorithm prolog unification iso-prolog occurs-check

Unification with STO detection

In ISO Prolog unification is defined only for those cases that are NSTO (not subject to occurs-check). The idea behind is to cover those cases of unifications that are mostly used in programs and that are actually supported by all Prolog systems. More specifically, ISO/IEC 13211-1:1995 reads:

7.3.3 Subject to occurs-check (STO) and not subject
to occurs-check (NSTO)

A set of equations (or two terms) is "subject to occurs-
check" (STO) iff there exists a way to proceed through
the steps of the Herbrand Algorithm such that 7.3.2 g
happens.

A set of equations (or two terms) is "not subject to
occurs-check" (NSTO) iff there exists no way to proceed
through the steps of the Herbrand Algorithm such that
7.3.2 g happens.

...

This step 7.3.2 g reads:

g) If there is an equation of the form X = t such
that X is a variable and t is a non-variable term
which contains this variable, then exit with failure (not
unifiable, positive occurs-check).

The complete algorithm is called Herbrand Algorithm and is what is commonly known as the Martelli-Montanari unification algorithm - which essentially proceeds by rewriting sets of equations in a non-deterministic manner.

Note that new equations are introduced with:

d) If there is an equation of the form f(a₁,a₂, ...a_N) =
f(b₁,b₂, ...b_N) then replace it by the set of equations
a_i = b_i.

Which means that two compound terms with the same functor but different arity will never contribute to STO-ness.

This non-determinism is what makes the STO-test so difficult to implement. After all, it is not sufficient to test whether or not an occurs-check might be necessary, but to prove that for all possible ways to execute the algorithm this case will never happen.

Here is a case to illustrate the situation:

?- A/B+C*D = 1/2+3*4.

Unification might start by A = 1, but also any of the other pairs, and continue in any order. To ensure the NSTO property, it must be ensured that there is no path that might produce a STO situation. Consider a case that is unproblematic for current implementations, but that is still STO:

?- 1+A = 2+s(A).

Prolog systems start by rewriting this equation into:

?- 1 = 2, A = s(A).

Now, they pick either

1 = 2 which makes the algorithm exit with failure, or
A = s(A) where step g applies and STO-ness is detected.

My question is twofold. First it is about an implementation in ISO Prolog of unify_sto(X,Y) (using only the defined built-ins of Part 1) for which the following holds:

if the unification is STO, then unify_sto(X,Y) produces an error, otherwise
if unify_sto(X,Y) succeeds then also X = Y succeeds
if unify_sto(X,Y) fails then also X = Y fails

and my second question is about the specific error to issue in this situation. See ISO's error classes.

Here is a simple step to start with: All success cases are covered by the success of unify_with_occurs_check(X,Y). What remains to do is the distinction between NSTO failure and STO error cases. That's were things start to become difficult...

Solution

Third attempt. This is mainly a bugfix in a previous answer (which already had many modifications). Edit: 06/04/2015

When creating a more general term I was leaving both subterms as-is if either of them was a variable. Now I build a more general term for the "other" subterm in this case, by calling term_general/2.

unify_sto(X,Y):-
  unify_with_occurs_check(X,Y) -> true ;
  (
   term_general(X, Y, unify(X,Y), XG, YG),
   \+unify_with_occurs_check(XG,YG),
   throw(error(type_error(acyclic, unify(X,Y)),_))
  ).

term_general(X, Y, UnifyTerm, XG, YG):-
  (var(X) -> (XG=X, term_general(Y, YG)) ;
  (var(Y) -> (YG=Y, term_general(X, XG)) ;
  ((
    functor(X, Functor, Len),
    functor(Y, Functor, Len),
    X=..[_|XL],
    Y=..[_|YL],
    term_general1(XL, YL, UnifyTerm, NXL, NYL)
  ) ->
  (
    XG=..[Functor|NXL],
    YG=..[Functor|NYL]
  ) ;
  ( XG=_, YG=_ )
  ))).

term_general1([X|XTail], [Y|YTail], UnifyTerm, [XG|XGTail], [YG|YGTail]):-
  term_general(X, Y, UnifyTerm, XG, YG),
  (
    \+(unify_with_occurs_check(XG,YG)) ->
        throw(error(type_error(acyclic,UnifyTerm),_)) ;
        term_general1(XTail, YTail, UnifyTerm, XGTail, YGTail)
  ).
term_general1([], [], _, [], []).

term_general(X, XG):-
  (var(X) -> XG=X ;
  (atomic(X) -> XG=_ ;
  (
     X=..[_|XL],
     term_general1(XL, XG)
  ))).

term_general1([X|XTail], [XG|XGTail]):-
  term_general(X, XG),
  term_general1(XTail, XGTail).
term_general1([], _).

And here the unit tests so far mentioned in this question:

unit_tests:-
  member([TermA,TermB], [[_A+_B,_C+_D], [_E+_F, 1+2],
                         [a(_G+1),a(1+_H)], [a(1), b(_I)],
                         [A+A,a(B)+b(B)], [A+A,a(B,1)+b(B)]]),
  (unify_sto(TermA, TermB)->Unifies=unifies ; Unifies=does_not_unify),
  writeln(test(TermA, TermB, Unifies)),
  fail.
unit_tests:-
     member([TermA,TermB], [[A+A,B+a(B)], [A+A,A+b(A)],
                            [A+A,a(_)+b(A)], [1+A,2+s(A)],
                            [a(1)+X,b(1)+s(X)]]),
  catch(
   (
     (unify_sto(TermA, TermB)->true;true),
     writeln(test_failed(TermA, TermB))
   ), E, writeln(test_ok(E))),
   fail.
unit_tests.

Unification with STO detection

7.3.3 Subject to occurs-check (STO) and not subjectto occurs-check (NSTO)

7.3.3 Subject to occurs-check (STO) and not subject
to occurs-check (NSTO)