pythonmultidimensional-arrayvectorizationpython-xarray

New dimension with coordinates in xarray apply_ufunc


I have a 3D array of signals with dimensions ('experiment', 'trial', 'time'). I am trying to vectorize the computation of Welch's periodogram for each trial of each experiment using xr.apply_ufunc with scipy.signal.welch, but cannot get it to work around the dimensions. scipy.signal.welch returns two arrays, the frequencies and the PSD/power spectrum.

Creating random data:

    data = xr.DataArray(
        np.random.random((3, 2, 1024)),
        dims=['experiment', 'trial', 'time'],
        coords={
            'experiment': np.arange(3),
            'trial': np.arange(2),
            'time': np.arange(1024),
        },
    )

Now applying scipy.sig.welch for 1D input:

    ret = xr.apply_ufunc(
        sig.welch,
        data,
        input_core_dims=[['trial', 'time']],
        output_core_dims=[['frequency'], []],
        vectorize=True,
    )

throws

TypeError: only length-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

Traceback (most recent call last): .... numpy/lib/function_base.py", line 2506, in _vectorize_call_with_signature output[index] = result

ValueError: setting an array element with a sequence.

Probably the numpy vectorization expects the returned values to be scalars? Since sig.welch can work on 2D arrays, another attempt:

    ret = xr.apply_ufunc(
        sig.welch,
        data,
        input_core_dims=[['trial', 'time']],
        output_core_dims=[['trial', 'frequency'], []],
        vectorize=True,
    )

and this throws:

... numpy/lib/function_base.py", line 2050, in _update_dim_sizes raise ValueError( ValueError: 1-dimensional argument does not have enough dimensions for all core dimensions ('dim2', 'dim0')

Is there a way to do the vectorization, or must one loop through the top level dimension?


Solution

  • The problem is that scipy.signal.welch returns two arrays of different dimensions. So an workaround can be writing a wrapper function:

    def wrapper(v, **kwargs):
        _, psd = signal.welch(v, **kwargs)
        return psd
    

    which only returns the power spectral density and ignores the frequency. Then the vectorization can be

    ret = xr.apply_ufunc(
            wrapper,
            data,
            input_core_dims=[['time']],
            output_core_dims=[['frequency']],
            vectorize=True,
        )
    

    Then ret will have dimensions of ['experiment', 'trial', 'frequency'], as wrapper only change the 'time' to 'frequency' dimension. You can assign the coordinate values to frequency dimension after this by:

    f, psd = sig.welch(data[0,0])
    ret['frequency'] = f