python multidimensional-array vectorization python-xarray

New dimension with coordinates in xarray apply_ufunc

I have a 3D array of signals with dimensions ('experiment', 'trial', 'time'). I am trying to vectorize the computation of Welch's periodogram for each trial of each experiment using xr.apply_ufunc with scipy.signal.welch, but cannot get it to work around the dimensions. scipy.signal.welch returns two arrays, the frequencies and the PSD/power spectrum.

Creating random data:

    data = xr.DataArray(
        np.random.random((3, 2, 1024)),
        dims=['experiment', 'trial', 'time'],
        coords={
            'experiment': np.arange(3),
            'trial': np.arange(2),
            'time': np.arange(1024),
        },
    )

Now applying scipy.sig.welch for 1D input:

    ret = xr.apply_ufunc(
        sig.welch,
        data,
        input_core_dims=[['trial', 'time']],
        output_core_dims=[['frequency'], []],
        vectorize=True,
    )

throws

TypeError: only length-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

Traceback (most recent call last): .... numpy/lib/function_base.py", line 2506, in _vectorize_call_with_signature output[index] = result
ValueError: setting an array element with a sequence.

Probably the numpy vectorization expects the returned values to be scalars? Since sig.welch can work on 2D arrays, another attempt:

    ret = xr.apply_ufunc(
        sig.welch,
        data,
        input_core_dims=[['trial', 'time']],
        output_core_dims=[['trial', 'frequency'], []],
        vectorize=True,
    )

and this throws:

... numpy/lib/function_base.py", line 2050, in _update_dim_sizes raise ValueError( ValueError: 1-dimensional argument does not have enough dimensions for all core dimensions ('dim2', 'dim0')

Is there a way to do the vectorization, or must one loop through the top level dimension?

Solution

The problem is that scipy.signal.welch returns two arrays of different dimensions. So an workaround can be writing a wrapper function:

def wrapper(v, **kwargs):
    _, psd = signal.welch(v, **kwargs)
    return psd

which only returns the power spectral density and ignores the frequency. Then the vectorization can be

ret = xr.apply_ufunc(
        wrapper,
        data,
        input_core_dims=[['time']],
        output_core_dims=[['frequency']],
        vectorize=True,
    )

Then ret will have dimensions of ['experiment', 'trial', 'frequency'], as wrapper only change the 'time' to 'frequency' dimension. You can assign the coordinate values to frequency dimension after this by:

f, psd = sig.welch(data[0,0])
ret['frequency'] = f