
Cassandra abstract model can't define primary_key with clustering order


I am creating models in Python. My code is below:

from uuid import uuid4
from uuid import uuid1

from cassandra.cqlengine import columns, connection
from cassandra.cqlengine.models import Model
from cassandra.cqlengine.management import sync_table


class BaseModel(Model):
    __abstract__ = True

    deleted = columns.Boolean(required=True, default=False)
    created_timestamp = columns.TimeUUID(primary_key=True,
                                         clustering_order='DESC',
                                         default=uuid1)

class OtherModel(BaseModel):
    __table_name__ = 'other_table'
    id = columns.UUID(primary_key=True, default=uuid4)



if __name__ == '__main__':
    connection.setup(hosts=['localhost'],
                     default_keyspace='test')
    sync_table(OtherModel)

This gives the following error:

python /tmp/test.py
Traceback (most recent call last):
  File "/tmp/test.py", line 9, in <module>
    class BaseModel(Model):
  File "/usr/lib/python2.7/site-packages/cassandra/cqlengine/models.py", line 905, in __new__
    raise ModelException("clustering_order may be specified only for clustering primary keys")
cassandra.cqlengine.models.ModelException: clustering_order may be specified only for clustering primary keys

If I comment out clustering_order, it works fine:

class BaseModel(Model):
    __abstract__ = True

    deleted = columns.Boolean(required=True, default=False)
    created_timestamp = columns.TimeUUID(primary_key=True,
    #                                    clustering_order='DESC',
                                         default=uuid1)

Is there any way to define clustering_order in an abstract class?

I need created_timestamp in every model, so I would rather not define it separately in each one.


Solution

  • There are two components to a PRIMARY KEY in Cassandra: the partition key and the clustering key. The partition key determines which node(s) in the cluster the row will be stored on. The clustering key determines the on-disk sort order of the rows within that partition key.

    It is also important to note that partition keys cannot be ordered. Results that span multiple partition keys are always returned in order of their hashed token value.

    It looks like your base model has only one column (created_timestamp) defined as a key, so that column becomes your partition key. I do not see another key defined, so you do not have a clustering key, and therefore you cannot apply a clustering_order.

    In Cassandra, you need to define your tables according to the queries that you intend to serve. Without seeing your queries, it is difficult to tell you how to fix this. But essentially, you need to use a different partition key and designate created_timestamp as a clustering key to get the desired results.

    For more information, you should read Carlo's answer here:

    Difference between partition key, composite key and clustering key in Cassandra?
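
To make the distinction above concrete, here is a minimal pure-Python sketch of the idea. It is a toy model, not how Cassandra or cqlengine is implemented: the helper store_rows and the sample rows are hypothetical and exist only for illustration. Rows are grouped by a partition key, and a sort order only has meaning among the rows inside one partition.

```python
from collections import defaultdict
from operator import itemgetter


def store_rows(rows, partition_key, clustering_key, descending=False):
    """Toy model: group rows by partition key, then sort each group
    by the clustering key. (Hypothetical helper, not a driver API.)"""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)
    for group in partitions.values():
        group.sort(key=itemgetter(clustering_key), reverse=descending)
    return dict(partitions)


# Sample data invented for this illustration.
rows = [
    {"id": "a", "created_timestamp": 1, "deleted": False},
    {"id": "b", "created_timestamp": 5, "deleted": False},
    {"id": "a", "created_timestamp": 3, "deleted": True},
    {"id": "a", "created_timestamp": 2, "deleted": False},
]

# id as the partition key, created_timestamp as a descending clustering key:
# rows that share an id are kept together, newest first.
table = store_rows(rows, "id", "created_timestamp", descending=True)
ordered = [r["created_timestamp"] for r in table["a"]]
print(ordered)

# created_timestamp as the only key: each partition holds a single row,
# so there is nothing left within a partition to order.
single = store_rows(rows, "created_timestamp", "created_timestamp",
                    descending=True)
print({k: len(v) for k, v in single.items()})
```

In the first call the rows for id "a" come back newest first, which is the effect a descending clustering order is meant to give. In the second call every partition contains exactly one row, which mirrors why a clustering order on a lone partition key has nothing to act on.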