
How to get the raw row content in Cassandra 3.3


I am using Cassandra 3.3 and CQL to create the following table:

CREATE TABLE collected_data ( 
    collection_hour int,
    source_id int, 
    entity_id int, 
    measurement text, 
    value text, 
    primary key((collection_hour),source_id,entity_id,measurement)  
);

After inserting a bunch of values into this table, I want to see how each row is really stored in Cassandra. I have seen that people used cassandra-cli (the list command) for this, but it is no longer available in 3.3 (it was removed after 3.0).

Is there a way to query Cassandra to see how each row is really stored? I am looking for a tool, or any way to do this from CQL.

Thank you

PS: in cassandra-cli one would use the "list" command and get output similar to the following (for a different table, of course):

RowKey: 1
=> (column=, value=, timestamp=1374546754299000)
=> (column=field2, value=00000002, timestamp=1374546754299000)
=> (column=field3, value=00000003, timestamp=1374546754299000)

RowKey: 4
=> (column=, value=, timestamp=1374546757815000)
=> (column=field2, value=00000005, timestamp=1374546757815000)
=> (column=field3, value=00000006, timestamp=1374546757815000)

Solution

  • The storage engine was rewritten in Cassandra 3.0, so the on-disk layout has changed completely.

    There is no official documentation on this subject, but several places in the source code give a big picture of how data is laid out on disk:

    UnfilteredSerializer: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/rows/UnfilteredSerializer.java#L29-L71

    Cell storage: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/rows/Cell.java#L145-L163

    ClusteringPrefix: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/ClusteringPrefix.java#L33-L45
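
To make the difference concrete, here is a rough conceptual sketch in Python, not Cassandra code and not the real byte format. It assumes the commonly described model: before 3.0, each CQL row was flattened into cells whose names were built from the clustering values plus the column name (with an empty "row marker" cell, which is the `column=, value=` line in the cassandra-cli output above), whereas from 3.0 on a partition holds rows, each with its clustering values stored once and its cells grouped under it. The function names and sample values are made up for illustration, and sorting, timestamps and encoding are left out.

```python
from collections import defaultdict

# Schema of collected_data from the question.
PARTITION_KEY = ("collection_hour",)
CLUSTERING = ("source_id", "entity_id", "measurement")
REGULAR = ("value",)


def legacy_cells(rows):
    """Pre-3.0 view (conceptual): one flat list of cells per partition.

    Each cell name repeats the clustering values, e.g. "10:20:temp:value".
    """
    partitions = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in PARTITION_KEY)
        prefix = ":".join(str(row[c]) for c in CLUSTERING)
        # Row marker: empty column-name component, empty value.
        partitions[key].append((prefix + ":", ""))
        for col in REGULAR:
            partitions[key].append((f"{prefix}:{col}", row[col]))
    return dict(partitions)


def row_oriented(rows):
    """3.0+ view (conceptual): partition -> clustering -> {column: value}.

    The clustering values appear once per row instead of once per cell.
    """
    partitions = defaultdict(dict)
    for row in rows:
        key = tuple(row[c] for c in PARTITION_KEY)
        clustering = tuple(row[c] for c in CLUSTERING)
        partitions[key][clustering] = {col: row[col] for col in REGULAR}
    return dict(partitions)


if __name__ == "__main__":
    # Hypothetical sample row, not taken from the question.
    sample = [
        {"collection_hour": 1, "source_id": 10, "entity_id": 20,
         "measurement": "temp", "value": "21.5"},
    ]
    print(legacy_cells(sample))
    print(row_oriented(sample))
```

The first function mimics what the old "list" command displayed; the second is closer in spirit to what the serializers linked above write, where a row's clustering prefix is followed by its cells.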