I plan to use a simple table like this (a simple key/value use case):
CREATE TABLE my_data (
    id bigint,
    value blob,
    PRIMARY KEY (id)
);
With the following characteristics:
as you can see, one partition = one blob (value)
each value is always accessed by its corresponding key
each value is a blob of 1 MB max (the average is also about 1 MB)
with 1 MB blobs, that gives 60 million partitions
What do you think about 1 MB blobs? Is that OK for Cassandra?
I could divide my data further and work with 1 KB blobs, but that would lead to many more partitions in Cassandra (more than 600 million?), and many more partitions to retrieve for a single client-side query.
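As a rough check on these partition counts, here is a back-of-the-envelope sketch in Python. It assumes decimal units (1 MB = 1000 KB), one blob per partition, and a total volume of 60 million blobs of about 1 MB; `TOTAL_KB` and `partition_count` are hypothetical names used only for illustration.

```python
# Back-of-the-envelope partition counts.
# Assumptions: decimal units (1 MB = 1000 KB), one blob per partition,
# and a total of 60 million blobs of ~1 MB each (about 60 TB).

TOTAL_KB = 60_000_000 * 1_000  # total data volume, expressed in KB


def partition_count(chunk_kb: int) -> int:
    """Number of partitions if every partition holds one blob of chunk_kb KB."""
    if chunk_kb <= 0:
        raise ValueError("chunk_kb must be positive")
    # Ceiling division, so a trailing partial chunk still counts as a partition.
    return -(-TOTAL_KB // chunk_kb)


if __name__ == "__main__":
    for chunk_kb in (1_000, 100, 1):
        print(f"{chunk_kb:>5} KB per blob -> {partition_count(chunk_kb):,} partitions")
```

Under these assumptions, 600 million partitions corresponds to blobs of roughly 100 KB, while 1 KB blobs would come to about 60 billion partitions.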
Thanks
The general recommendation is to keep partitions as small as possible, ideally no larger than about 5 to 10 MB. In your case, with one blob per partition, keeping each blob at about 1 MB is strongly recommended.
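If a value could ever exceed the 1 MB target, one possible approach is to split it on the client before writing. The sketch below is only an illustration: `split_blob`, `join_chunks`, and `MAX_BLOB_BYTES` are hypothetical names, 1 MB is taken as 1,000,000 bytes, and how the chunks would be stored in Cassandra (for example, keyed by a chunk index) is an assumption that is left out here.

```python
from typing import Iterable, Iterator, Tuple

# Assumed ceiling per stored value, based on the ~1 MB guidance (decimal MB).
MAX_BLOB_BYTES = 1_000_000


def split_blob(data: bytes, max_bytes: int = MAX_BLOB_BYTES) -> Iterator[Tuple[int, bytes]]:
    """Yield (chunk_index, chunk) pairs, each chunk at most max_bytes long."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    for index, start in enumerate(range(0, len(data), max_bytes)):
        yield index, data[start:start + max_bytes]


def join_chunks(chunks: Iterable[Tuple[int, bytes]]) -> bytes:
    """Rebuild the original value from (chunk_index, chunk) pairs in any order."""
    ordered = sorted(chunks, key=lambda pair: pair[0])
    return b"".join(chunk for _, chunk in ordered)
```

A value of 2.5 MB would be split into three chunks (two full ones and a 500,000-byte remainder), and `join_chunks` restores it regardless of the order in which the chunks are read back.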
600 million partitions is not a problem at all. Cassandra is designed to handle billions, even trillions, of partitions and beyond. Cheers!