python dtw

distance_matrix_fast function of dtaidistance is slow


I'm using the Python package dtaidistance for fast DTW computations. As explained in the documentation, one can use the following code:

from dtaidistance import dtw
import numpy as np
series = np.array([
    [0.0, 0, 1, 2, 1, 0, 1, 0, 0],
    [0.0, 1, 2, 0, 0, 0, 0, 0, 0],
    [0.0, 0, 1, 2, 1, 0, 0, 0, 0]])
ds = dtw.distance_matrix_fast(series)

to compute DTW distance measures between sets of series. Each of the time series I'm working with has a length of 3000, and each of my datasets contains roughly 3500 of them.
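To put those numbers in perspective, here is a rough back-of-the-envelope count of the work involved. It assumes plain DTW with no warping window, so every pair of series fills a full length-by-length cost matrix, and that only the upper triangle of the distance matrix is computed. The function name dtw_workload is made up for this illustration and is not part of dtaidistance.

```python
def dtw_workload(n_series, length):
    """Count pairwise comparisons and DTW cost-matrix cells.

    Assumes unconstrained DTW (no window) and that each unordered
    pair of series is compared exactly once.
    """
    pairs = n_series * (n_series - 1) // 2
    cells = pairs * length * length
    return pairs, cells


pairs, cells = dtw_workload(n_series=3500, length=3000)
print(f"{pairs:,} pairs, about {cells:.2e} cost-matrix cells")
```

For 3500 series of length 3000 this comes to 6,123,250 pairs and roughly 5.5e13 cell updates per dataset, which is why the speed of the inner loop matters so much here.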

Unfortunately, I'm not able to get any results from this function in a reasonable amount of time. On my machine (128 GB RAM, 32 CPU cores, 4 Nvidia GPUs) I had to abort the computation after a day. Surprisingly, I didn't see any output from the function at all, even though I set the parameter show_progress (see source code) to True.

What am I doing wrong here? Thank you very much for your help.
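One way to judge whether a run like this can finish at all is to time a handful of short series and extrapolate. The sketch below does that with a plain pure-Python DTW, used here only as a stand-in: it is not dtaidistance's implementation, the names dtw_distance and extrapolate_runtime are hypothetical, and it assumes a squared-difference local cost with a final square root, no window, and a single thread.

```python
import random
import time


def dtw_distance(a, b):
    """Textbook dynamic-programming DTW, keeping only two rows in memory."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix costs nothing
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = (x - y) ** 2
            # Extend the cheapest of the three allowed predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[-1] ** 0.5


def extrapolate_runtime(sample, n_total, length_total):
    """Time all pairs in `sample`, then scale to the full problem size."""
    n = len(sample)
    start = time.perf_counter()
    for i in range(n):
        for j in range(i + 1, n):
            dtw_distance(sample[i], sample[j])
    elapsed = time.perf_counter() - start
    sample_pairs = n * (n - 1) // 2
    total_pairs = n_total * (n_total - 1) // 2
    # Cost per pair grows with the product of the two series lengths.
    length_scale = (length_total / len(sample[0])) ** 2
    return elapsed / sample_pairs * total_pairs * length_scale


rng = random.Random(0)
sample = [[rng.random() for _ in range(100)] for _ in range(5)]
seconds = extrapolate_runtime(sample, n_total=3500, length_total=3000)
print(f"estimated pure-Python runtime: {seconds / 86400:.0f} days")
```

The same idea applies to the library call itself: running it on, say, 20 series first and scaling by the number of pairs gives a rough estimate of how long the full run would take.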


Solution

  • It turned out that I had not built the package from source, so the faster C-based implementation was not available to me.

    The following steps solved the problem for me:

    The library can also be compiled and/or installed directly from source.

    Download the source from https://github.com/wannesm/dtaidistance
    Compile the C extensions: python3 setup.py build_ext --inplace
    Install into your site-package directory: python3 setup.py install
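After rebuilding, a quick sanity check is to see whether a compiled DTW module can actually be imported. The sketch below is only an illustration: the helper find_c_extension is made up, and the module names dtaidistance.dtw_cc and dtaidistance.dtw_c are assumptions about how the compiled extension is named (this appears to differ between releases), so adjust them to match your installed version.

```python
import importlib

# Assumed names of the compiled extension; these may differ between versions.
CANDIDATES = ("dtaidistance.dtw_cc", "dtaidistance.dtw_c")


def find_c_extension(candidates=CANDIDATES):
    """Return the first candidate module that imports, or None."""
    for name in candidates:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both a missing package and a missing compiled module.
            continue
        return name
    return None


found = find_c_extension()
if found is None:
    print("No compiled DTW extension found; the fast functions are unlikely to work.")
else:
    print(f"Compiled extension available: {found}")
```

If this reports that nothing was found, the build step most likely did not produce the extension, and it is worth re-running the compile command and reading its output for compiler errors.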