What is the approximate maximum state space size of modern model checkers, like NuSMV. I do not need an exact number but some state size value, where the run time is still acceptable (say a few weeks).
What kind of improvements, beyond symbolic model checking, are used to raise that limit?
The answer varies wildly, depending, among other factors, on:
Instead of mentioning some specific number of states, I will instead note a few relevant factors (I use "specification" below as a synonym to "model"):
Symbolic or enumerative: Symbolic algorithms scale differently than enumerative ones. Also, for the same problem, there are typically differences in the computational complexity of known symbolic and enumerative algorithms.
Enumeration is relatively predictable in behavior, in that a state space with N
states will most likely take a shorter time to enumerate than a state space with 1000000 * N
states.
Symbolic approaches based on binary-decision diagrams (BDDs) can behave in ways (nearly) unrelated to how many states are reachable based on the specification. The main factor is what kind of Boolean functions arise in the encoding of the specification.
For example, a specification that involves a multiplier would result in BDDs that are exponentially large in the number of bits that represent the state, so of size linear in the number of states (assuming that the reachable states are exponentially more than than the bits used to represent those states). In this case, a state space with 2^50
states that may otherwise be amenable to symbolic analysis becomes prohibitive.
In other words, it is not only the number of states, but the kind of Boolean function that the system action corresponds to that matters (an action in TLA+ corresponds to what a transition relation represents in other formalisms). In addition, choosing a different encoding (of integers by bits) can have an impact on BDD size.
Symmetry (e.g., partial order reduction), and abstraction are some improvements to approach the analysis of more complex systems.
Acceptable runtime is a relative notion. Whatever the model checking approach, there is always a limit where the model's fidelity reaches the available time.
Another approach is to write a specification that has unspecified parameters, then use a model checker to find errors in instances of the specification that correspond to small parameter values, and after correcting these, then use a theorem prover to ensure correctness of the specification. This approach is supported by tools for TLA+, namely the model checker TLC and the theorem prover TLAPS.
Regarding terminology ("specification" above), please see "What good is temporal logic?" by Leslie Lamport.
Also worth noting that, depending on the approach, the number of states, and the number of reachable states can be different notions. Usually, this matters in typed formalisms: we can specify a system with 1 reachable state, but declare variable types that result in many more states, most of which are unreachable from the initial conditions. In symbolic approaches, this affects the encoding, thus the size of BDDs.
References relevant to state space size:
Bwolen Yang, Randal E. Bryant, David R. O’Hallaron, Armin Biere, Olivier Coudert, Geert Janssen, Rajeev K. Ranjan, Fabio Somenzi, A performance study of BDD-based model checking, FMCAD, 1998, DOI: 10.1007/3-540-49519-3_18
Radek Pelánek, Properties of state spaces and their applications, STTT, 2008, DOI: 10.1007/s10009-008-0070-5 (and a relevant website)
Radek Pelánek, Typical structural properties of state spaces, SPIN, 2004, DOI: 10.1007/978-3-540-24732-6_2
Yunja Choi, From NuSMV to SPIN: Experiences with model checking flight guidance systems, FMSD, 2007, DOI: 10.1007/s10703-006-0027-9
J.R. Burch, E.M. Clarke, K.L. McMillan, D.L. Dill, L.J. Hwang, Symbolic model checking: 10^20 states and beyond, LICS, 1990, DOI: 10.1109/LICS.1990.113767