
How to obtain number of rows in Cassandra table


This is a super basic question but it's actually been bugging me for days. Is there a good way to obtain the equivalent of a COUNT(*) of a given table in Cassandra?

I will be moving several hundred million rows into C* for some load testing, and I'd like to at least get a row count on some sample ETL jobs before I move massive amounts of data over the network.

The best idea I have is to loop over each row with Python and increment a counter. Is there a better way to determine (or even estimate) the row count of a C* table? I've also poked around DataStax OpsCenter to see if I can find the row count there, but if it's possible, I don't see how.

Has anyone else needed to get a count(*) of a table in C*? If so, how'd you go about doing it?


Solution

  • Yes, you can use COUNT(*). From the documentation:

    A SELECT expression using COUNT(*) returns the number of rows that matched the query. Alternatively, you can use COUNT(1) to get the same result.

    Count the number of rows in the users table:

    SELECT COUNT(*) FROM users;
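
Since the question mentions Python, here is a minimal sketch of running the same COUNT(*) from a script instead of looping over every row. The helper names `build_count_query` and `count_rows` are hypothetical, and `session` is assumed to be any object whose `execute()` returns a result with a `.one()` method, as a session from the DataStax Python driver does.

```python
import re

# Assumption: only plain, unquoted identifiers are accepted here.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def build_count_query(keyspace, table):
    """Return the CQL text that counts every row in keyspace.table."""
    for name in (keyspace, table):
        # Keyspace and table names cannot be bound as query parameters,
        # so validate them before formatting them into the statement.
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"SELECT COUNT(*) FROM {keyspace}.{table}"


def count_rows(session, keyspace, table):
    """Run COUNT(*) through the given session and return the count."""
    row = session.execute(build_count_query(keyspace, table)).one()
    # COUNT(*) yields a single row with a single column.
    return row[0]
```

With the DataStax Python driver, `session` would typically come from `Cluster([...]).connect()`, and the call would look like `count_rows(session, "demo", "users")`, where `demo` is a placeholder keyspace name. Keep in mind that an unrestricted COUNT(*) has to read the whole table, so on hundreds of millions of rows it may be slow or time out.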