cassandra datastax dsbulk

DataStax DSBulk - Difference between query / table unload


I'm using dsbulk to extract some data from our Cassandra cluster and am seeing some odd behavior. I'm trying to understand whether this is expected.

If I perform an unload by specifying the keyspace and table, I get different (fewer) results than if I perform a query unload specifying select * from table.
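For reference, the two kinds of unload being compared can be sketched roughly as below. The keyspace, table, and output directories are hypothetical placeholders, not names from my cluster. By default the script only prints the commands; setting DRY_RUN to an empty string would actually run them, assuming dsbulk is on the PATH.

```shell
#!/bin/sh
# Sketch only: all names and paths below are hypothetical placeholders.
# DRY_RUN defaults to "echo", so the commands are printed, not executed.
# Run with DRY_RUN= (empty) to invoke dsbulk for real.
DRY_RUN="${DRY_RUN-echo}"

# Table unload: let dsbulk build the read from keyspace (-k) and table (-t).
table_unload() {
  $DRY_RUN dsbulk unload -k "$1" -t "$2" -url "$3"
}

# Query unload: pass an explicit SELECT statement via -query.
query_unload() {
  $DRY_RUN dsbulk unload -query "SELECT * FROM $1.$2" -url "$3"
}

table_unload my_keyspace my_table /tmp/unload_table
query_unload my_keyspace my_table /tmp/unload_query
```

Writing each unload to its own directory makes it easy to compare the row counts of the two exports afterwards.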

I assumed this might be a consistency issue within the cluster, but I've tried various consistency levels, and the results are the same at all levels between ONE and ALL.

Anyone know if this is expected behavior? The direct table extract is about 2x faster, so would prefer that if at all possible.


Solution

  • You are certainly hitting DAT-295, a bug that has since been fixed. Please upgrade to the latest DSBulk version (1.2.0 at the moment; 1.3.0 is due in a few weeks).
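As a rough aid, a version check along these lines could tell you whether an installation is older than the 1.2.0 release named above. This is only a sketch: the INSTALLED value is a hypothetical placeholder you would fill in yourself (for example from what `dsbulk --version` reports), and the comparison assumes a `sort` that supports `-V`, as GNU coreutils does.

```shell
#!/bin/sh
# Returns success (0) when version $1 is strictly older than version $2.
# Assumes `sort -V` (version sort) is available, e.g. GNU coreutils.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

INSTALLED="${INSTALLED:-1.1.0}"  # hypothetical placeholder: your installed DSBulk version
TARGET="1.2.0"                   # latest release named in the answer above

if version_lt "$INSTALLED" "$TARGET"; then
  echo "DSBulk $INSTALLED is older than $TARGET: upgrade recommended"
else
  echo "DSBulk $INSTALLED is at least $TARGET"
fi
```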