
Why do many refer to Cassandra as a column-oriented database?


Reading several papers and documents on the internet, I found a lot of contradictory information about the Cassandra data model. Many identify it as a column-oriented database, others as row-oriented, and still others describe it as a hybrid of the two.

According to what I know about how Cassandra stores files, it uses the *-Index.db file to find the right position in the *-Data.db file, where the bloom filter, the column index, and then the columns of the requested row are stored.
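To make that mental model concrete, here is a toy sketch in Python. It is only an illustration of the lookup as I understand it, not Cassandra's actual on-disk format: the helper names `build_data_file` and `read_row` are made up, the rows are serialized as JSON purely for convenience, and the bloom filter and column index are left out.

```python
import json

def build_data_file(rows):
    """Toy stand-in for a *-Data.db / *-Index.db pair (hypothetical format).

    Rows are serialized back to back in key order; the index maps each
    row key to the (offset, length) of its record in the data bytes.
    """
    data = bytearray()
    index = {}
    for key in sorted(rows):
        record = json.dumps(rows[key]).encode("utf-8")
        index[key] = (len(data), len(record))
        data += record
    return bytes(data), index

def read_row(data, index, key):
    """Use the index to jump straight to one row and decode all its columns."""
    offset, length = index[key]
    return json.loads(data[offset:offset + length].decode("utf-8"))

# Made-up sample rows for illustration only.
rows = {
    "k1": {"name": "alpha", "value": 10},
    "k2": {"name": "beta", "value": 20},
}
data, index = build_data_file(rows)
print(read_row(data, index, "k2"))
```

In this sketch all columns of a row sit next to each other in the data bytes, which is what makes the layout look row-oriented to me.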

In my opinion, this is strictly row-oriented. Is there something I'm missing?


Solution

  • Cassandra is a partitioned row store. Rows are organized into tables with a required primary key.

    Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added to and removed from the cluster.

    Row store means that, like relational databases, Cassandra organizes data by rows and columns.

         "Bonuses" : {
               row1 : { "ID":1, "Last":"Doe", "First":"John", "Bonus":8000},
               row2 : { "ID":2, "Last":"Smith", "First":"Jane", "Bonus":4000}
               ...
         }
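The two ideas above (partitioning plus row storage) can be sketched roughly in Python. This is a minimal illustration under simplifying assumptions, not how Cassandra is implemented: the node names, the `PartitionedRowStore` class, and the `node_for` helper are hypothetical, and placing keys with an MD5 hash modulo the node count stands in for Cassandra's real partitioner and token ring.

```python
import hashlib

# Hypothetical cluster members, for illustration only.
NODES = ["node-a", "node-b", "node-c"]

def node_for(partition_key, nodes=NODES):
    """Pick the node that owns a key by hashing it (simplified placement)."""
    digest = hashlib.md5(str(partition_key).encode("utf-8")).digest()
    token = int.from_bytes(digest[:8], "big")
    return nodes[token % len(nodes)]

class PartitionedRowStore:
    """Toy model: each node holds whole rows, grouped by table and keyed by primary key."""

    def __init__(self, nodes=NODES):
        self.nodes = list(nodes)
        self.storage = {node: {} for node in self.nodes}

    def insert(self, table, row, key_column="ID"):
        key = row[key_column]  # the primary key is required; a missing one raises KeyError
        node = node_for(key, self.nodes)
        self.storage[node].setdefault(table, {})[key] = dict(row)
        return node

    def get(self, table, key):
        # The caller never names a node; routing is derived from the key.
        node = node_for(key, self.nodes)
        return self.storage[node].get(table, {}).get(key)

store = PartitionedRowStore()
store.insert("Bonuses", {"ID": 1, "Last": "Doe", "First": "John", "Bonus": 8000})
store.insert("Bonuses", {"ID": 2, "Last": "Smith", "First": "Jane", "Bonus": 4000})
print(store.get("Bonuses", 2))
```

The application only calls `insert` and `get` with a key, which is the "application-transparent" part. Each stored value is a complete row with all of its columns, which is the "row store" part.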