pythonnumpyfloating-pointequalityapproximate

Approximate equality of unordered set of complex floats


I want to do nose testing of numpy arrays of complex floats that are unordered.

So for instance, if

a = [1+1j, 1-1j, 2+2j, 2-2j, 2+2j, 2-2j]

and

b = [2+2j, 2-2j, 1+1j, 1.000000000000001-1j, 2+2j, 2-2j]

the assert should succeed, as they have approximately the same values the same number of times. Order doesn't matter.

For regular floats, assert_array_almost_equal(np.sort(a), np.sort(b)) would be fine, but that doesn't work here because it sorts by real part first and then imaginary part, so because of the float error, they are sorted to:

a: [ 1.-1.j,  1.+1.j,  2.-2.j,  2.-2.j,  2.+2.j,  2.+2.j]    
b: [ 1.+1.j,  1.-1.j,  2.-2.j,  2.-2.j,  2.+2.j,  2.+2.j]

Is there a built-in way to do this? If not, I guess I could resort to something like cplxreal, but that seems like a lot to add to a testing file.


Solution

  • I encountered this problem recently and realized that it can be accomplished quite efficiently by checking whether the structural rank of the matrix of pairwise comparisons is equal to the size of the inputs.

    According to scipy.csgraph.structural_rank "A graph has full structural rank if it is possible to permute the elements to make the diagonal zero-free." This is exactly the condition we want: we want to check if the pairwise comparison can be permuted such that the result has a fully nonzero diagonal.

    You can implement it like this:

    import numpy as np
    from scipy.sparse import csgraph, csr_matrix
    
    def sets_approx_equal(x, y, **kwds):
      """
      Check if x and y are permutations of the same approximate set of values.
      """
      x = np.asarray(x)
      y = np.asarray(y)
      assert x.ndim == y.ndim == 1
      if len(x) != len(y):
        return False
      mat = np.isclose(x[:, None], y[None, :], **kwds)
      rank = csgraph.structural_rank(csr_matrix(mat))
      return rank == len(y)
    

    Testing on your example inputs gives the expected results:

    a = [1+1j, 1-1j, 2+2j, 2-2j, 2+2j, 2-2j]
    b = [2+2j, 2-2j, 1+1j, 1.000000000000001-1j, 2+2j, 2-2j]
    print(sets_approx_equal(a, b))
    # True
    
    b[-1] = 1 - 1j
    print(sets_approx_equal(a, b))
    # False
    

    To convince ourselves further that this is working, we can generate a large number of known equal and unequal permutations, and check that the function is returning the expected result for these cases:

    def make_permutations(rng, N, equal=True, eps=1E-10):
      """Make two permuted arrays of complex values.
    
      If equal=True, they are derived from the same initial set of values.
      If equal=False, they are not
      """
      x = rng.randn(N) + 1j * rng.randn(N)
      y = x if equal else rng.randn(N) + 1j * rng.randn(N)
      x = x + eps * (rng.randn(N) + 1j * rng.randn(N))
      y = y + eps * (rng.randn(N) + 1j * rng.randn(N))
      return rng.permutation(x), rng.permutation(y)
    
    rng = np.random.RandomState(0)
    for i in range(1000):
      x, y = make_permutations(rng, 5, equal=True)
      assert sets_approx_equal(x, y)
    
      x, y = make_permutations(rng, 5, equal=False)
      assert not sets_approx_equal(x, y)
    

    Indeed, all the assertions pass.