I am using Cassandra 3.3 and CQL to create the following table:
CREATE TABLE collected_data (
    collection_hour int,
    source_id int,
    entity_id int,
    measurement text,
    value text,
    PRIMARY KEY ((collection_hour), source_id, entity_id, measurement)
);
After inserting a bunch of values into this table, I want to see how each row is really stored in Cassandra. I have seen people use cassandra-cli (the list command) for that, but it is no longer available in 3.3 (it was removed in 3.0).
Is there a way to query Cassandra to see how each row is really stored? I am looking for a tool, or any way to do this from CQL.
Thank you
PS: in cassandra-cli one would use the "list" command and get output similar to the following (for a different table, of course):
RowKey: 1
=> (column=, value=, timestamp=1374546754299000)
=> (column=field2, value=00000002, timestamp=1374546754299000)
=> (column=field3, value=00000003, timestamp=1374546754299000)
RowKey: 4
=> (column=, value=, timestamp=1374546757815000)
=> (column=field2, value=00000005, timestamp=1374546757815000)
=> (column=field3, value=00000006, timestamp=1374546757815000)
The storage engine was rewritten in Cassandra 3.0, so the on-disk layout has changed completely.
There is no official documentation on this subject, but several places in the source code give a big-picture view of how data is laid out on disk:
UnfilteredSerializer: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/rows/UnfilteredSerializer.java#L29-L71
Cell storage: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/rows/Cell.java#L145-L163
ClusteringPrefix: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/ClusteringPrefix.java#L33-L45
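As a rough mental model only (this is not the real serialization format, which is what the classes above define), you can think of the 3.0 engine as storing each partition as a sorted set of rows, where a row is identified by its clustering values and holds one cell per regular column. The Python sketch below applies that idea to the collected_data table from the question; the function name group_into_partitions and the sample values are made up for illustration.

```python
from collections import defaultdict

# Conceptual model only: partition key -> rows ordered by clustering -> cells.
# This does NOT reproduce Cassandra's actual on-disk bytes.
def group_into_partitions(cql_rows):
    partitions = defaultdict(dict)
    for r in cql_rows:
        partition_key = r["collection_hour"]
        # Clustering columns, in the order declared in the PRIMARY KEY.
        clustering = (r["source_id"], r["entity_id"], r["measurement"])
        # Only the regular (non-key) column becomes a cell of the row.
        partitions[partition_key][clustering] = {"value": r["value"]}
    # Within a partition, rows are kept sorted by their clustering values.
    return {pk: dict(sorted(rows.items())) for pk, rows in partitions.items()}

# Made-up sample data for the collected_data table.
sample_rows = [
    {"collection_hour": 1, "source_id": 1, "entity_id": 2, "measurement": "mem", "value": "512"},
    {"collection_hour": 1, "source_id": 1, "entity_id": 2, "measurement": "cpu", "value": "0.7"},
    {"collection_hour": 2, "source_id": 3, "entity_id": 4, "measurement": "cpu", "value": "0.1"},
]

layout = group_into_partitions(sample_rows)
for pk, rows in layout.items():
    print("partition", pk)
    for clustering, cells in rows.items():
        print("  row", clustering, "->", cells)
```

The point of the sketch is that the clustering values are kept once per row, with the regular columns hanging off that row as cells.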