python list numpy nested average

How to compute the element-wise mean of a nested list with mixed scalars and numpy arrays


I have a list of lists that looks something like:

data = [
    [1., np.array([2., 3., 4.]), ...],
    [5., np.array([6., 7., 8.]), ...],
    ...
]

where each of the internal lists is the same length and contains the same data type/shape at each entry. I would like to calculate the mean over corresponding entries and return something of the same structure as the internal lists. For example, in the above case (assuming only two entries) I want the result to be:

[3., np.array([4., 5., 6.]), ...]

What is the best way to do this with Python?


Solution

  • Since data is a plain list, a list comprehension is a natural fit. Even if data were a numpy array, it is jagged (the entries have different shapes), so wrapping it in an ndarray would bring no benefit, and a list comprehension would still be the best option, in my opinion.

    Use zip(*data) to "transpose" data so that corresponding entries are grouped together, then call np.mean() with axis=0 on each group to average across the inner lists.

    [np.mean(x, axis=0) for x in zip(*data)]
    # [3.0, array([4., 5., 6.]), array([[2., 2.], [2., 2.]])]
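
    For reference, here is a minimal self-contained sketch of the same idea, using only the two entries written out in the question. The helper name mean_over_entries is hypothetical (it is not part of numpy), and the sketch assumes every inner list has the same length and the same shape at each position.

```python
import numpy as np


def mean_over_entries(rows):
    """Average corresponding entries across equally structured rows.

    Hypothetical helper for illustration; assumes every row has the
    same length and the same shape at each position.
    """
    # zip(*rows) groups the i-th entry of every row into one tuple;
    # np.mean(..., axis=0) then averages across the rows.
    return [np.mean(group, axis=0) for group in zip(*rows)]


data = [
    [1., np.array([2., 3., 4.])],
    [5., np.array([6., 7., 8.])],
]

result = mean_over_entries(data)
print(result)
```

    With this input, result[0] is the scalar mean 3.0 and result[1] is an array equal to [4., 5., 6.]. How the scalar is printed depends on the numpy version.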