I am trying to prove that particular implementations of how to calculate the edit distance between two strings are correct and yield identical results. I went with the most natural way to define edit distance recursively as a single function (see below). This caused coq to complain that it couldn't determine the decreasing argument. After some searching, it seems that using the Program Fixpoint mechanism and providing a measure function is one way around this problem. However, this led to the next problem that the tactic simpl no longer works as expected. I found this question which has a similar problem, but I am getting stuck because I don't understand the role the Fix_sub function is playing in the code generated by coq for my edit distance function which looks more complicated than in the simple example in the previous question.
Questions:
Is there an easier way using Program Fixpoint to get reductions?
Fixpoint min_helper (best :nat) (l : list nat) : nat :=
match l with
| nil => best
| h::t => if h<?best then min_helper h t else min_helper best t
end.
Program Fixpoint edit_distance (s1 s2 : string) {measure (length s1+ length s2)} : nat :=
match s1, s2 with
| EmptyString , EmptyString => O
| String char rest , EmptyString => length s1
| EmptyString , String char rest => length s2
| String char1 rest1 , String char2 rest2 =>
let choices : list nat := S ( edit_distance rest1 s2) :: S (edit_distance s1 rest2) :: nil in
if (Ascii.eqb char1 char2)
then min_helper (edit_distance rest1 rest2 ) choices
else min_helper (S (edit_distance rest1 rest2)) choices
end.
Next Obligation.
intros. simpl. rewrite <- plus_n_Sm. apply Lt.le_lt_n_Sm. reflexivity. Qed.
Next Obligation.
simpl. rewrite <- plus_n_Sm. apply Lt.le_lt_n_Sm. apply PeanoNat.Nat.le_succ_diag_r. Qed.
Next Obligation.
simpl. rewrite <- plus_n_Sm. apply Lt.le_lt_n_Sm. apply PeanoNat.Nat.le_succ_diag_r. Qed.
Theorem simpl_edit : forall (s1 s2: string), edit_distance s1 s2 = match s1, s2 with
| EmptyString , EmptyString => O
| String char rest , EmptyString => length s1
| EmptyString , String char rest => length s2
| String char1 rest1 , String char2 rest2 =>
let choices : list nat := S ( edit_distance rest1 s2) :: S (edit_distance s1 rest2) :: nil in
if (Ascii.eqb char1 char2)
then min_helper (edit_distance rest1 rest2 ) choices
else min_helper (S (edit_distance rest1 rest2)) choices
end.
Proof. intros. induction s1.
- induction s2.
-- reflexivity.
-- reflexivity.
- induction s2.
-- reflexivity.
-- remember (a =? a0)%char as test. destruct test.
--- (*Stuck??? Normally I would unfold edit_distance but the definition coq creates after unfold edit_distance ; unfold edit_distance_func is hard for me to reason about*)
You can instead use Function
, which comes with Coq and produces a reduction lemma for you (this will actually also generate a graph as Inductive R_edit_distance
in the vein of the alternative development you mention, but here it's quite gnarly—that might be because of my edits for concision)
Require Import String.
Require Import List.
Require Import PeanoNat.
Import ListNotations.
Require Import FunInd.
Require Recdef.
Fixpoint min_helper (best : nat) (l : list nat) : nat :=
match l with
| [] => best
| h :: t => if h <? best then min_helper h t else min_helper best t
end.
Function edit_distance
(ss : string * string) (* unfortunately, Function only supports one decreasing argument *)
{measure (fun '(s1, s2) => String.length s1 + String.length s2) ss} : nat :=
match ss with
| (String char1 rest1 as s1, String char2 rest2 as s2) =>
let choices := [S (edit_distance (rest1, s2)); S (edit_distance (s1, rest2))] in
if Ascii.eqb char1 char2
then min_helper (edit_distance (rest1, rest2)) choices
else min_helper (S (edit_distance (rest1, rest2))) choices
| (EmptyString, s) | (s, EmptyString) => String.length s
end.
all: intros; simpl; rewrite Nat.add_succ_r; repeat constructor.
Qed.
Check edit_distance_equation. (* : forall ss : string * string, edit_distance ss = ... *)
Print R_edit_distance. (* Inductive R_edit_distance : string * string -> nat -> Prop := ... *)
The reason the graph Inductive
definition (either then nice one you cited or the messy one generated here) doesn't require assurances of termination is that terms of Inductive
type have to be finite already. A term of R_edit_distance ss n
(which represents edit_distance ss = n
) should be seen as a record or log of the steps in the computation of edit_distance
. Though a generally recursive function could possibly get stuck in an infinite computation, the corresponding Inductive
type excludes that infinite log: if edit_distance ss
were to diverge, R_edit_distance ss n
would simply be uninhabited for all n
, so nothing blows up. In turn, you don't have the ability to actually compute, given ss
, what edit_distance ss
is or a term in {n | R_edit_distance ss n}
, until you complete some termination proof. (E.g. proving forall ss, {n | R_edit_distance ss n}
is a form of termination proof for edit_distance
.)
Your idea to use structural recursion over some auxiliary type is exactly right (that's the only form of recursion that is available anyway; both Program
and Function
just build on it), but it doesn't really have anything to do with the graph inductive...
Fixpoint edit_distance
(s1 s2 : string) (n : nat) (prf : n = String.length s1 + String.length s2)
{struct n}
: nat := _.
Something along the lines of the above should work, but it'll be messy. (You could recurse over an instance of the graph inductive instead of the nat
here, but, again, that just kicks the bucket to building instances of the graph inductive.)