Hi, I have the following table in Cassandra:
* ---------------------------------------------------------------------------
* Note:
* 'curr_pos' is always fixed, so we can put it into cluster key and order
* In each crawler iteration 'prev_pos', 'domain_*' are updated
* -------------------------------------------------------------------------
* Patterns:
* <domain_name3rd>.<domain_name2nd>.<domain_name1st>
* --------------------------------------------------------------------------
CREATE TABLE IF NOT EXISTS lp_registry.keyword_position (
engine text,
keyword text,
updated timestamp,
domain_name1st text,
domain_name2nd text,
domain_name3rd text,
prev_pos int,
curr_pos int,
PRIMARY KEY ((engine, keyword), curr_pos)
);
In the top-level application I have lists of around a hundred keywords.
What do I need?
For a fixed engine and a list of keywords, I want to select all domains and their positions.
Update: The result produced by the application will be an N×M matrix for each engine, with N user-defined keywords and M user-defined domains. Each cell will hold the position of a domain for a specific keyword.
What am I confused about?
I need to issue N selects, one per keyword in the list. In other words, I need to iterate through the keywords in the app and send a select to the DB in each iteration.
I expect that N won't be greater than 100, but I still think that this is too many queries.
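For illustration, each iteration would send a query of roughly this shape against the table above. The literal values 'google' and 'cassandra tutorial' are placeholders, not real data:

```cql
-- One query per keyword; the app repeats this N times,
-- changing only the keyword value.
SELECT curr_pos, prev_pos, domain_name3rd, domain_name2nd, domain_name1st
FROM lp_registry.keyword_position
WHERE engine = 'google'
  AND keyword = 'cassandra tutorial';
```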
My question
Can I pack these selects into a single batch? How?
This is not really a batching problem but a problem with the design of your table.
If the query you're describing is a "core" query of your application, then you should design the table so that it can be answered with a single query, i.e. engine and keyword should be clustering keys and not partition keys.
To give more specific advice: how do you obtain the list of engines and keywords? Is there something that logically groups them? That could be the partition key of your table.
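As a rough sketch of what such a design could look like: the table name keyword_position_by_list, the column list_id, and the value 'my-list' are hypothetical and stand in for whatever logically groups your keywords. The other columns are taken from your table, and curr_pos stays as the last clustering column so that rows remain unique per position, as in your original key.

```cql
-- Sketch only: list_id is an assumed grouping column, not part of the
-- original schema. engine and keyword become clustering columns.
CREATE TABLE IF NOT EXISTS lp_registry.keyword_position_by_list (
    list_id        text,       -- hypothetical: identifies one keyword list
    engine         text,
    keyword        text,
    curr_pos       int,
    prev_pos       int,
    updated        timestamp,
    domain_name1st text,
    domain_name2nd text,
    domain_name3rd text,
    PRIMARY KEY ((list_id), engine, keyword, curr_pos)
);

-- A single query then returns every keyword of the list for one engine.
SELECT keyword, curr_pos, domain_name3rd, domain_name2nd, domain_name1st
FROM lp_registry.keyword_position_by_list
WHERE list_id = 'my-list'
  AND engine = 'google';
```

With this layout all rows of one list live in a single partition, so it only makes sense if a list stays reasonably small. The application would still build the N×M matrix from the returned rows.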