[SOLVED] Convert Python sequence to NumPy array, filling missing values

Convert Python sequence to NumPy array, filling missing values

The implicit conversion of a Python sequence of variable-length lists into a NumPy array cause the array to be of type object.

v = [[1], [1, 2]]
np.array(v)
>>> array([[1], [1, 2]], dtype=object)

Trying to force another type will cause an exception:

np.array(v, dtype=np.int32)
ValueError: setting an array element with a sequence.

What is the most efficient way to get a dense NumPy array of type int32, by filling the "missing" values with a given placeholder?

From my sample sequence v, I would like to get something like this, if 0 is the placeholder

array([[1, 0], [1, 2]], dtype=int32)

Solution

import itertools
np.array(list(itertools.zip_longest(*v, fillvalue=0))).T
Out: 
array([[1, 0],
       [1, 2]])

Note: For Python 2, it is itertools.izip_longest.