I have a somewhat peculiar structure, a Python list of lists, that I need to convert to a NumPy array. So far I have gotten by with np.array(myarray, dtype=object), but a seemingly insignificant change to the structure of myarray now causes an error.
I have managed to reduce my issue to two lines of code. The following is what I was using previously, and it works exactly how I want it to:
import numpy as np
myarray = [np.array([[1,2,3,4],[5,6,7,8]]), np.array([[9,10],[11,12]]), np.array([[13,14],[15,16],[17,18]])]
np.array(myarray,dtype = object)
However, after simply removing the last [17,18] row, we have:
import numpy as np
myarray = [np.array([[1,2,3,4],[5,6,7,8]]), np.array([[9,10],[11,12]]), np.array([[13,14],[15,16]])]
np.array(myarray,dtype = object)
This gives "ValueError: could not broadcast input array from shape (2,4) into shape (2,)" when it reaches the np.array call.
It seems to me that this only happens when the arrays all have the same length but their rows have different lengths. What I don't understand is why setting dtype=object doesn't cover this, especially considering that it handles the more complicated list-of-lists shape.
np.array tries, as its first priority, to make an n-d numeric array: one where all elements are numeric and the shape is consistent in all dimensions, i.e. not a 'ragged' array.
In [36]: alist = [np.array([[1,2,3,4],[5,6,7,8]]), np.array([[9,10],[11,12]]), np.array([[13,14],[15,16],[17,18]])]
In [38]: [a.shape for a in alist]
Out[38]: [(2, 4), (2, 2), (3, 2)]
alist works, producing a 3-element array of arrays.
Your problem case:
In [39]: blist = [np.array([[1,2,3,4],[5,6,7,8]]), np.array([[9,10],[11,12]]), np.array([[13,14],[15,16]])]
In [40]: [a.shape for a in blist]
Out[40]: [(2, 4), (2, 2), (2, 2)]
Note that all subarrays have the same first dimension. That's what's giving the problem.
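A quick way to spot this situation before calling np.array is to compare the first dimensions of the subarrays. This is only an illustrative sketch; leading_dims is a hypothetical helper name, not something NumPy provides.

```python
import numpy as np

def leading_dims(arrays):
    """Return the sorted distinct first-dimension sizes of the given arrays.

    Hypothetical helper for illustration only.
    """
    return sorted({a.shape[0] for a in arrays})

alist = [np.array([[1, 2, 3, 4], [5, 6, 7, 8]]),
         np.array([[9, 10], [11, 12]]),
         np.array([[13, 14], [15, 16], [17, 18]])]
blist = [np.array([[1, 2, 3, 4], [5, 6, 7, 8]]),
         np.array([[9, 10], [11, 12]]),
         np.array([[13, 14], [15, 16]])]

print(leading_dims(alist))  # [2, 3]: the first dimensions differ
print(leading_dims(blist))  # [2]: one shared first dimension, the problem case
```

If only one value comes back while the full shapes still differ, you are probably in the case that trips np.array.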
The safe way to make such an array is to start with a 'dummy' of the right shape, and fill it:
In [41]: res = np.empty(3,object); res[:] = blist; res
Out[41]:
array([array([[1, 2, 3, 4],
[5, 6, 7, 8]]), array([[ 9, 10],
[11, 12]]), array([[13, 14],
[15, 16]])],
dtype=object)
In [42]: res = np.empty(3,object); res[:] = alist; res
Out[42]:
array([array([[1, 2, 3, 4],
[5, 6, 7, 8]]), array([[ 9, 10],
[11, 12]]), array([[13, 14],
[15, 16],
[17, 18]])],
dtype=object)
It also works when all subarrays/lists have the same shape:
In [43]: clist = [np.array([[1,2],[7,8]]), np.array([[9,10],[11,12]]), np.array([[13,14],[15,16]])]
In [44]: res = np.empty(3,object); res[:] = clist; res
Out[44]:
array([array([[1, 2],
[7, 8]]), array([[ 9, 10],
[11, 12]]), array([[13, 14],
[15, 16]])],
dtype=object)
Without that approach, np.array turns clist into a (3,2,2) array of number objects:
In [45]: np.array(clist, object)
Out[45]:
array([[[1, 2],
        [7, 8]],

       [[9, 10],
        [11, 12]],

       [[13, 14],
        [15, 16]]], dtype=object)
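You can confirm the 'depth' that np.array picked by checking the shape of the result. A small sketch, reusing the clist values from above (the name guessed is just for illustration):

```python
import numpy as np

clist = [np.array([[1, 2], [7, 8]]),
         np.array([[9, 10], [11, 12]]),
         np.array([[13, 14], [15, 16]])]

# All subarrays share the shape (2, 2), so np.array goes all the way down.
guessed = np.array(clist, dtype=object)

print(guessed.shape)     # (3, 2, 2)
print(guessed[0].shape)  # (2, 2)
print(guessed[0].dtype)  # object: a slice of the big array, not the original int array
```

So the result is one 3-d object array rather than three separate arrays held in a 1-d container.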
One way to think of it: np.array does not give you a way of specifying the 'depth' or 'shape' of the object array. It has to 'guess', and in some cases it guesses wrong.
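If you need this often, the 'dummy and fill' idea can be wrapped in a small function so the depth is always explicit. This is a minimal sketch, not a NumPy API: to_object_array is a hypothetical name, and it fills the slots one at a time with a loop instead of res[:] = ..., which is my own variation on the approach above.

```python
import numpy as np

def to_object_array(arrays):
    """Pack a sequence of arrays into a 1-d object array, one array per slot.

    Hypothetical helper: the result always has shape (len(arrays),),
    whatever the shapes of the individual arrays are.
    """
    out = np.empty(len(arrays), dtype=object)
    for i, a in enumerate(arrays):
        out[i] = a  # store the array itself as a single element
    return out

blist = [np.array([[1, 2, 3, 4], [5, 6, 7, 8]]),
         np.array([[9, 10], [11, 12]]),
         np.array([[13, 14], [15, 16]])]

res = to_object_array(blist)
print(res.shape)               # (3,)
print([a.shape for a in res])  # [(2, 4), (2, 2), (2, 2)]
```

Because the outer shape is fixed up front, nothing is left for np.array to guess.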