python numpy matrix set symmetric-difference

Convert a set of tuples into a NumPy array of lists in Python


I've been using the set method "symmetric_difference" on the rows of two ndarray matrices in the following way:

x_set = list(set(tuple(i) for i in x_spam_matrix.tolist()).symmetric_difference(
                 set(tuple(j) for j in partitioned_x[i].tolist())))

x = np.array([list(i) for i in x_set])

This method works fine for me, but it feels a little clumsy. Is there any way to do this more elegantly?


Solution

  • A simple list of tuples:

    In [146]: alist = [(1,2),(3,4),(2,1),(3,4)]
    

    put it in a set:

    In [147]: aset = set(alist)
    In [148]: aset
    Out[148]: {(1, 2), (2, 1), (3, 4)}
    

    np.array just wraps that set in a 0d array of object dtype:

    In [149]: np.array(aset)
    Out[149]: array({(1, 2), (3, 4), (2, 1)}, dtype=object)
    

    Convert the set to a list first, and np.array produces a 2d array (the row order follows the set's iteration order, which is arbitrary):

    In [150]: np.array(list(aset))
    Out[150]: 
    array([[1, 2],
           [3, 4],
           [2, 1]])
    

    Since it is a list of tuples, it can also be made into a structured array:

    In [151]: np.array(list(aset),'i,f')
    Out[151]: array([(1, 2.), (3, 4.), (2, 1.)], dtype=[('f0', '<i4'), ('f1', '<f4')])
    

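    If the tuples vary in length, the list cannot form a 2d array. Older NumPy versions turned it into a 1d array of tuples (object dtype) automatically; NumPy 1.24 and later raise a ValueError unless dtype=object is passed explicitly:

    In [152]: np.array([(1,2),(3,4),(5,6,7)], dtype=object)
    Out[152]: array([(1, 2), (3, 4), (5, 6, 7)], dtype=object)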
    In [153]: _.shape
    Out[153]: (3,)
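
Putting these pieces together for the original question, here is a minimal sketch of a row-wise symmetric difference. The helper name row_symmetric_difference and the sample arrays are made up for illustration, and sorting the rows is just one way to get a repeatable order out of an unordered set.

```python
import numpy as np

def row_symmetric_difference(a, b):
    """Return the rows that occur in exactly one of the 2d arrays a and b."""
    # Rows must be hashable to go into a set, so convert each one to a tuple.
    rows_a = set(map(tuple, a.tolist()))
    rows_b = set(map(tuple, b.tolist()))
    diff = rows_a ^ rows_b  # same as rows_a.symmetric_difference(rows_b)
    # Sets are unordered; sorting gives a deterministic row order.
    # The reshape keeps the result 2d even when diff is empty.
    out = np.array(sorted(diff), dtype=np.result_type(a, b))
    return out.reshape(-1, a.shape[1])

# Hypothetical sample data standing in for x_spam_matrix and partitioned_x[i].
x_spam_matrix = np.array([[1, 2], [3, 4], [5, 6]])
partition = np.array([[3, 4], [7, 8]])

x = row_symmetric_difference(x_spam_matrix, partition)
print(x)
```

Passing the list of tuples straight to np.array avoids the extra list(i) conversion from the question, since NumPy treats tuples and lists of equal length the same way when building a 2d array.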