arraysz3smtz3pysmt-lib

Theory of arrays in Z3: (1) model is difficult to understand, (2) do not know how to implement functions and (3) difference with sequences


Following to the question published in How expressive can we be with arrays in Z3(Py)? An example, I expressed the following formula in Z3Py:

Exists i::Integer s.t. (0<=i<|arr|) & (avg(arr)+t<arr[i])

This means: whether there is a position i::0<i<|arr| in the array whose value a[i] is greater than the average of the array avg(arr) plus a given threshold t.

The solution in Z3Py:

t = Int('t')
avg_arr = Int('avg_arr')
len_arr = Int('len_arr')

arr = Array('arr', IntSort(), IntSort())

phi_1 = And(0 <= i, i< len_arr)
phi_2 = (t+avg_arr<arr[i])

phi = Exists(i, And(phi_1, phi_2))

s = Solver()
s.add(phi)
print(s.check())
print(s.model())

Note that, (1) the formula is satisfiable and (2) each time I execute it, I get a different model. For instance, I just got: [avg_a = 0, t = 7718, len_arr = 1, arr = K(Int, 7719)].

I have three questions now:

  1. What does arr = K(Int, 7719)] mean? Does this mean the array contains one Int element with value 7719? In that case, what does the K mean?
  2. Of course, this implementation is wrong in the sense that the average and length values are independent from the array itself. How can I implement simple avg and len functions?
  3. Where is the i index in the model given by the solver?

Also, in which sense would this implementation be different using sequences instead of arrays?


Solution

  • (1) arr = K(Int, 7719) means that it's a constant array. That is, at every location it has the value 7719. Note that this is truly "at every location," i.e., at every integer value. There's no "size" of the array in SMTLib parlance. For that, use sequences.

    (2) Indeed, your average/length etc are not related at all to the array. There are ways of modeling this using quantifiers, but I'd recommend staying away from that. They are brittle, hard to code and maintain, and furthermore any interesting theorem you want to prove will get an unknown as answer.

    (3) The i you declared and the i you used as the existential is completely independent of each other. (Latter is just a trick so z3 can recognize it as a value.) But I guess you removed that now.

    The proper way to model such problems is using sequences. (Although, you shouldn't expect much proof performance there either.) Start here: https://microsoft.github.io/z3guide/docs/theories/Sequences/ and see how much you can push it through. Functions like avg will need a recursive definition most likely, for that you can use RecAddDefinition, for an example see: https://stackoverflow.com/a/68457868/936310

    Stack-overflow works the best when you try to code these yourself and ask very specific questions about how to proceed, as opposed to overarching questions. (But you already knew that!) Best of luck..