python-2.7 dask castra

Not able to load castra files with from_castra() function of dask


I am trying to replicate the example on this page about castra, dask and reddit comments, and I get the error below when I run:

dd.from_castra(data,columns)

My castra file took some hours to create, but it is clean and exactly as the tutorial describes.

I tried both my MacBook and an Ubuntu instance on Amazon AWS and had the same issue.

The same code and file work fine on a Windows PC.

Any info would be helpful!

ValueError: Expected iterable of tuples of (name, dtype), got ['archived', 'author',....]


Solution

  • I found the solution to the problem. It was a version mismatch. If you face the same issue, do the following:

    Step 1:

    Uninstall dask using pip

    pip uninstall dask
    

    Step 2:

    Uninstall castra using pip

    pip uninstall castra
    

    Step 3:

    Install a version of dask that is compatible with castra

    pip install -Iv dask==0.10.0
    

    Step 4:

    Install castra again

    pip install castra
    

    Step 5:

    After installing, check the installed versions with the following commands:

    pip show dask
    pip show castra
    

    The reported versions should match the ones in the screenshot (dask should show 0.10.0, as pinned in Step 3):

    [screenshot: terminal screen]
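
If you prefer to do the Step 5 check from Python rather than with pip show, something like the sketch below should work. The helper names (installed_version, check_environment) are made up for illustration and are not part of dask or castra. The only version it checks is dask 0.10.0 from Step 3; for castra it only checks that something is installed, since no castra version is stated in the text.

```python
from __future__ import print_function

try:
    # Python 3.8+ ships importlib.metadata in the standard library.
    from importlib.metadata import version, PackageNotFoundError
except ImportError:
    # Older interpreters (including Python 2.7): fall back to setuptools.
    import pkg_resources

    PackageNotFoundError = pkg_resources.DistributionNotFound

    def version(name):
        return pkg_resources.get_distribution(name).version


def installed_version(name):
    """Return the installed version string of `name`, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


# Pinned in Step 3 of the answer. No castra version is assumed here.
EXPECTED_DASK = "0.10.0"


def check_environment():
    """Print both versions; True if dask is 0.10.0 and castra is installed."""
    dask_version = installed_version("dask")
    castra_version = installed_version("castra")
    print("dask:  ", dask_version)
    print("castra:", castra_version)
    return dask_version == EXPECTED_DASK and castra_version is not None


if __name__ == "__main__":
    print("looks fine" if check_environment() else "version mismatch or missing package")
```

Run it with the same interpreter you use for dd.from_castra, so that it inspects the same environment pip installed into.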