Rationale for numpy.split returning a list and not an array

I was surprised that numpy.split yields a list and not an array. I would have thought it would be better to return an array, since numpy has put a lot of work into making arrays more useful than lists. Can anyone justify numpy returning a list instead of an array? Why would that be a better programming decision for the numpy developers to have made?

Solution

A comment pointed out that if the split is uneven, the result can't be an array, at least not one that has the same dtype. At best it would be an object dtype array.
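To make the uneven case concrete, here is a small sketch. It assumes a reasonably recent NumPy, where building a ragged array needs an explicit dtype=object; the variable names are only for illustration.

```python
import numpy as np

x = np.arange(10)

# np.split with an integer count insists on an equal division,
# and 10 elements cannot be divided into 3 equal parts.
try:
    np.split(x, 3)
    split_failed = False
except ValueError:
    split_failed = True

# array_split tolerates an uneven division, and np.split with explicit
# indices can also produce pieces of different lengths.
uneven = np.array_split(x, 3)
by_index = np.split(x, [3, 7])

uneven_lengths = [len(a) for a in uneven]    # [4, 3, 3]
index_lengths = [len(a) for a in by_index]   # [3, 4, 3]

# The only way to hold such pieces in one array is as an object array
# whose elements are themselves arrays.
ragged = np.array(uneven, dtype=object)
print(split_failed, uneven_lengths, index_lengths, ragged.shape)
```

An object array like ragged is little more than a list with array packaging, so most of the speed advantage of a numeric array is gone.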

But let's consider the case of equal length subarrays:

In [124]: x = np.arange(10)
In [125]: np.split(x,2)
Out[125]: [array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9])]
In [126]: np.array(_)     # make an array from that
Out[126]: 
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

But we can get the same array without split - just reshape:

In [127]: x.reshape(2,-1)
Out[127]: 
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

Now look at the code for split. It just passes the task to array_split. Ignoring the details about alternative axes, array_split essentially does:

sub_arys = []
for i in range(Nsections):
    # st and end are taken from `div_points`
    sub_arys.append(sary[st:end])
return sub_arys

In other words, it just steps through the array and returns successive slices. Those are (often) views of the original.
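A quick way to check the view behaviour, at least for a plain 1d array like the one used above (a sketch, not a guarantee for every input):

```python
import numpy as np

x = np.arange(10)
first, second = np.split(x, 2)

# The pieces are basic slices, so they share memory with x.
shares = np.shares_memory(first, x)

# Writing through a piece therefore changes the original array.
first[0] = 99
print(shares, x[0])
```

So split does not copy the data; it just hands back a list of windows onto the same buffer.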

So split is not that sophisticated a function. You could generate such a list of subarrays yourself without a lot of numpy expertise.
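For example, a home-made version for the equal-length case along the first axis could look like the following. split_equal is a hypothetical helper written for this answer, not a numpy function, and it ignores the axis and index-list options of the real split.

```python
import numpy as np

def split_equal(arr, n_sections):
    """Return n_sections equal-length slices of arr along axis 0."""
    length = arr.shape[0]
    if length % n_sections:
        raise ValueError("array does not divide into equal sections")
    step = length // n_sections
    # Each element is a basic slice of arr, not a copy.
    return [arr[i * step:(i + 1) * step] for i in range(n_sections)]

x = np.arange(10)
mine = split_equal(x, 2)
theirs = np.split(x, 2)
print(all(np.array_equal(a, b) for a, b in zip(mine, theirs)))
```

A list comprehension over slices is the natural result here, which is one reason a list return value is not surprising.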

Another point: the documentation notes that split can be reversed with an appropriate stack. concatenate (and family) takes a list of arrays. If given an array of arrays, or a higher-dimensional array, it effectively iterates over the first dimension, e.g. concatenate(arr) => concatenate(list(arr)).
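A short check of both claims, using the same 10-element array as before (variable names are illustrative):

```python
import numpy as np

x = np.arange(10)
pieces = np.split(x, 2)

# concatenate undoes the split when given the list of pieces.
rejoined = np.concatenate(pieces)

# Given a 2d array instead, concatenate treats it as a sequence of rows,
# so the result matches passing list(arr).
arr = np.array(pieces)               # shape (2, 5)
from_array = np.concatenate(arr)
from_list = np.concatenate(list(arr))

print(np.array_equal(rejoined, x), np.array_equal(from_array, from_list))
```

Since the functions that consume the pieces are happy with a list, wrapping the result of split in an array would not buy much.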