promelaspin

Using Spin / Promela to model heartbeat protocol?


I'm new to Spin, and would like to know if Spin can be used, or has been used, to model a heartbeat protocol in the face of intermittent or failed networks.

One reason I'm struggling with this problem is that Spin abstracts the time dimension (only concepts are "before" and "after"), but heartbeats are inherently a time-related concept. In our current distributed system heartbeat design (that I'd like to use Spin to improve), something bad can happen when the network becomes flaky. Eventually, things straighten themselves out (the systems restart), but "eventually consistent" is not good enough in this case -- and Spin models seem to consider eventually consistent as not an error. In our case, clients have certain real-time response requirements.

I'm sorry this is such a fuzzy question; to summarize, are there any examples of heartbeat-related Spin models? Or any suggestions on how to model such systems?


Solution

  • TL;DR: You might want to look into Sphin or some other model-checker for Hybrid Real-Time Systems.


    It is possible, somehow, to design a system with a discretized clock signal using Promela and then express some LTL property over some internal variable keeping track the flow of time. However, I did this only for small toy examples and I would not advice to use it on larger projects.


    From the documentation of Promela (emphasis is mine):

    In the basic Promela language there is no mechanism for expressing properties of clocks or of time related properties or events. There are good algorithms for integrating real-time constraints into the model checking process, but most attention has so far been given to real-time verification problems in hardware circuit design, rather than the real-time verification of asynchronous software, which is the domain of the Spin model checker.

    The best known of these algorithms incur significant performance penalties compared with untimed verification. Each clock variable added to a model can increase the time and memory requirements of verification by an order of magnitude. Considering that one needs at least two or three such clock variables to define meaningful constraints, this seems to imply, for the time being, that a real-time capability requires at least three to four orders of magnitude more time and memory than the verification of the same system without time constraints.

    The good news is that if a correctness property can be proven for an untimed Promela model, it is guaranteed to preserve its correctness under all possible real-time constraints. The result is therefore robust, it can be obtained efficiently, and it encourages good design practice. In concurrent software design it is usually unwise to link logical correctness with real-time performance.

    Promela is a language for specifying systems of asynchronous processes. For the definition of such a system we abstract from the behavior of the process scheduler and from any assumption about the relative speed of execution of the various processes. These assumptions are safe, and the minimal assumptions required to allow us to construct proofs of correctness. The assumptions differ fundamentally from those that can be made for hardware systems, which are often driven by a single known clock, with relative speeds of execution precisely known. What often is just and safe in hardware verification is, therefore, not necessarily just and safe in software verification.

    Spin guarantees that all verification results remain valid independent of where and how processes are executed, timeshared on a single CPU, in true concurrency on a multiprocessor, or with different processes running on CPUs of different makes and varying speeds. Two points are worth considering in this context: first, such a guarantee can no longer be given if real-time constraints are introduced, and secondly, most of the existing real-time verification methods assume a true concurrency model, which inadvertently excludes the more common method of concurrent process execution by timesharing.

    It can be hard to define realistic time bounds for an abstract software system. Typically, little can be firmly known about the real-time performance of an implementation. It is generally unwise to rely on speculative information, when attempting to establish a system's critical correctness properties.