linked-listdafny

Inserting a node at a random position in a linked list: Dafny


After several attempts and discussions on SO and Dafny discussion board, the problem is still not solved.

Primer: After implementing a simple append operation following Rustan Lieno's paper, I decided to see if I understood the concepts right and tried to insert a node at a given position instead of appending. The idea is to prove that an imperative C-code (1) terminates and (2) preserves the validity of the queue. The termination was proven quite easily, thanks to this post.

The problem is, when a node is added to the queue, the footprints of all the preceding nodes must be updated. In Lieno's paper, this is achieved using a simple forall loop. This is sufficient because the last node is reachable from all the previous nodes. In case of inserting the node in the middle, we have to update the footprints of all the previous nodes too. My code below tries to do that with a similar forall loop.

After the discussion on Dafny discussion board, it was suggested that due to information loss, it is better that I use a sequence as my representation but the attempt is not verifying. I paused my work on that code for a day or two to return to Lieno's paper and use the predicates in that paper. My attempt is given below:

class Node {
    var data: int
    var next: Node?

    ghost var footprint: set<object>

    ghost predicate valid()
        reads this, footprint
    {
        this in footprint &&
        (next != null ==> next in footprint && next.footprint <= footprint && 
            this !in next.footprint)
    }

    constructor(d: int)
        modifies this
        ensures valid() && fresh(footprint - {this})
        ensures next == null && data == d
    {
        data := d;
        next := null;
        footprint := {this};
    }
}

class Queue {
    var head: Node

    ghost var footprint: set<object>
    ghost var spine: set<Node>

    /* opaque */ ghost predicate valid()
        reads this, footprint
    {
        this in footprint && spine <= footprint &&
        head != null && head in spine &&
        (forall n: Node? | n in spine :: n != null && n.footprint <= footprint && n.valid()) &&
        (forall n | n in spine :: n.next != null ==> n.next in spine)
    }

    constructor()
        modifies this
        ensures valid() && fresh (footprint - {this})
    {
        /* reveal valid(); */
        var n := new Node(0);
        head := n;
        footprint := {this} + n.footprint;
        spine := {n};
    }

    method {:vcs_split_on_every_assert} {:rlimit 200} enqueue (d: int, pos: nat)
        requires valid()
        requires 0 <= pos < |spine|
        modifies footprint
        ensures valid() 
        ensures fresh(footprint - old(footprint))
    {
        var curr: Node := head;
        var i: nat := 0;
        /* reveal valid(); */

        while(i < pos && curr.next != null)
            invariant curr in spine
            invariant curr.footprint <= footprint
            invariant curr.next != null ==> curr.next in spine && curr !in curr.next.footprint &&
                curr.next.footprint <= curr.footprint
            invariant curr.valid()
            invariant valid()
            decreases if curr != null then curr.footprint else {}
        {
            curr := curr.next;
            i := i + 1;
        }

        assert /* {:only} */ forall m: Node | m in spine && m in curr.footprint :: m.valid();

        var node: Node := new Node(d);
        node.next := curr.next;
        node.footprint := node.footprint + curr.footprint - {curr};
        spine := spine + {node};
        curr.next := node;
        curr.footprint := curr.footprint + node.footprint;
        assert curr.valid();
        assert node.valid();

        forall m | m in spine && m !in curr.footprint {
            /* we update all those nodes that are not a part of the footprint of the current node. 
            As per the updates above, these nodes are all those nodes which current node cannot reach
            i.e., the preceding nodes. */
            m.footprint := m.footprint + curr.footprint;
        }

        footprint := footprint + node.footprint;

        /* assert forall m | m in spine && m !in node.footprint :: m.valid(); */
    }
}

Now, cure.valid() holds. But node.valid() fails. The forall is used to update the footprints of all the previous nodes. The understanding is that the footprint of all those nodes which are not in the current nodes's footprint, will be updated. I think, updating a single node in the spine destroys the validity of all the nodes in the spine.

So I add another forall loop like:

forall m | m in spine && m !in curr.footprint && m.next != null  {
    m.footprint := m.footprint + m.next.footprint;
}

but to no avail.

I think a lemma stating what is changed and what is not changed in the spine maybe required but I cannot frame such a lemma.

Any help to solve this problem is greatly appreciated!


Solution

  • So finally, the solution to this problem came from several unrelated comments and problems on SO. I'll explain the sequence of events here for my own reference and for the community.

    The problem was inserting a node at a specific position. The problem was verification of a family of functions in a particular software written in C. The functions deal with a priority queue. So inserting an element in a given position verifies (at least) the termination of the while loops in the family. So I tried several approaches but none of them worked.

    The problem with all the approaches was that after inserting a node, the footprints of all the preceding nodes must be updated. @DivyanshuRanjan solved it by collecting the nodes in a while loop and then tracking back all the way to the head node while updating the footprint. The solution worked except for an assumption.

    I then tried redefining my footprint as a sequence of objects rather than a set. The reasoning was that with a set, we cannot index but with a sequence we can. That solution was not pursued because it led to the same problem.

    Then, I found an implementation of an AVL-Tree. The author(s) defined a recursive class invariant for the nodes of the tree. The validity of the tree is then dependant only on the validity of the root node. I took this idea and defined the class invariant of the nodes of the queue the same way.

    ghost predicate valid()
        reads this, footprint
        decreases footprint + {this}
    {
        this in footprint &&
        (next != null ==> next in footprint &&
            next.footprint <= footprint &&
            this !in next.footprint &&
            next.valid() &&
            (forall y :: y in next.model ==> y <= prio)) &&
        (next == null ==> model == {prio}) &&
        (next != null ==> model == next.model + {prio})
    }
    

    The class invariant of the queue is then dependant only on the head node.

    ghost predicate valid()
        reads this, head, footprint
    {
        this in footprint &&
        (head == null ==> model == {}) &&
        (head != null ==>
            head in footprint &&
            head.footprint <= footprint &&
            this !in head.footprint &&
            head.valid() &&
            model == head.model)
    }
    

    Again, I tried to iterate through the nodes and then got stuck again with the same problem. Until I read this question on SO by @Hath995. A comment by @JamesWilcox suggested that a recursive class invariant is inherently biased towards recursive traversals. For now, I assumed it to be true and then implemented a recursive method to insert a node in a sorted priority queue:

    method recurse(node: Node?, p: nat) returns (newNode: Node)
        requires head != null ==> head.valid()
        requires node == null || node.valid()
        modifies if node != null then node.footprint + {node} else {}
        ensures newNode.valid()
        ensures node == null ==> fresh(newNode.footprint) && newNode.model == {p}
        ensures node != null ==> newNode.model == {p} + old(node.model)
        ensures node != null ==> fresh(newNode.footprint - old(node.footprint))
        ensures newNode == head ==> head.valid()
        decreases if node != null then node.footprint else {}
    {
        if node == null || (node != null && node.prio < p) {
            newNode := new Node(p);
            newNode.next := node;
            
            /* ***** GHOST UPDATES ***** */
            assert newNode.prio == p;
            if node != null {
                newNode.footprint := newNode.footprint + node.footprint;
                newNode.model := {p} + node.model;
            }
    
            assert newNode.next != null ==> forall y | y in newNode.next.model :: y < newNode.prio;
            assert newNode.valid();
        } else {
            assert node.next == null || node.next.valid();
            node.next := recurse(node.next, p);
    
            /* ***** GHOST UPDATES ***** */
            assert node.next.valid();
            assert node.prio >= node.next.prio;
            node.model := node.model + {p};
            assert forall y :: y in node.next.model ==> y <= node.prio;
            node.footprint := node.footprint + node.next.footprint;
            assert node.valid() && node.next.valid();
    
            newNode := node;
            assert newNode == head ==> head.valid();
        }
    }
    

    Translating an iterative C code to recursive Dafny also called for understanding how the loop invariants are changed to method post-conditions and pre-conditions. Although the pre- and post-conditions are similar to the ones in the AVL-Tree implementation, I argued the fact that a queue is a special case of a tree where all the nodes are added to one branch (left heavy or right heavy tree). Therefore, the method pre- and post-conditions are, more or less, the same.

    Still, the modifies clause complained and this question on SO clarified the doubt. The problem was, in my initial modifies clause, I was only specifying what the top level of recursion could change. The new modifies clause fixes the problem by specifying what the recursion is allowed to modify in the subsequent calls. Which solves the problem.

    The final question that I have is, as per @JamesWilcox's comment here, recursive valid predicates are inherently biased towards recursive traversal. @Hath995 provided an explanation saying that if a method is recursing in the same manner as the valid predicate, then each call site is dealt with the same local information and validity. It makes sense but I need some clarification for the same.