I have run into strange behavior and am trying to understand when it can occur. My Python application accesses a Cassandra database through the driver.
As the snippet below shows, I first run an INSERT that creates a record in the table, and then a SELECT that should return the message that was just created. Sometimes the SELECT returns nothing. My guess is that Cassandra has an internal scheduler that queues the INSERT, so that when I try to read the record with the SELECT it has not been created yet. Is this possible?
QUESTION:
Is it possible to get a callback from Cassandra after the INSERT operation confirming that the record was created successfully?
SNIPPET:
import uuid
import sys

from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement, dict_factory


def send_message(chat_room_id, message_author_id, message_text):
    message_id = uuid.uuid1()
    first_query = """
        insert into messages (
            created_date_time,
            chat_room_id,
            message_id,
            message_author_id,
            message_text
        ) values (
            toTimestamp(now()),
            {0},
            {1},
            {2},
            {3}
        );
    """.format(
        chat_room_id,
        message_id,
        message_author_id,
        message_text
    )
    first_statement = SimpleStatement(
        first_query,
        consistency_level=ConsistencyLevel.LOCAL_QUORUM
    )
    try:
        db_connection.execute(first_statement)
    except Exception as error:
        logger.error(error)
        sys.exit(1)
    db_connection.row_factory = dict_factory
    second_query = """
        select
            created_date_time,
            chat_room_id,
            message_id,
            message_author_id,
            message_text
        from
            messages
        where
            chat_room_id = {0}
        and
            message_id = {1}
        limit 1;
    """.format(
        chat_room_id,
        message_id
    )
    try:
        message = db_connection.execute(second_query).one()
    except Exception as error:
        logger.error(error)
        sys.exit(1)
    print(message)  # Sometimes when it's the first message in the chat room I see a "None" value.
When the first INSERT statement returns without raising an exception, Cassandra has completed your insert at the consistency level you requested.
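If an explicit callback is still wanted, the Python driver's `execute_async` returns a future that accepts a success callback and an error callback. A minimal sketch, assuming `session` is an already connected driver session; the helper name `insert_async` and the parameters `on_written` and `on_failed` are hypothetical and not part of the original code:

```python
def insert_async(session, statement, parameters, on_written, on_failed):
    """Send a write without blocking and report its outcome via callbacks.

    `session` is assumed to be a connected cassandra-driver Session.
    `on_written` receives the result rows (empty for an INSERT);
    `on_failed` receives the exception raised by the driver.
    """
    future = session.execute_async(statement, parameters)
    # The driver invokes exactly one of the two callbacks when the request
    # finishes: on success or on error.
    future.add_callbacks(callback=on_written, errback=on_failed)
    return future
```

The success callback fires only after the write has been acknowledged at the statement's consistency level, so it carries the same guarantee as a blocking `execute` that returns normally.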
It looks like you are inserting with a consistency level (CL) of LOCAL_QUORUM, but no CL is set when you select the same record.
If no CL is set, the Python driver uses LOCAL_ONE by default.
In your case, when you insert the record with LOCAL_QUORUM, and assuming a replication factor of 3, at least 2 of the 3 replica nodes have your data by the time the write returns.
(Note that Cassandra always tries to write to all the replica nodes.)
When you then query with LOCAL_ONE, you may hit one of those 2 nodes and get the result, or you may hit the one node that failed to write your record or has not written it yet.
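The replica arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function names `quorum` and `read_overlaps_write` are made up for this example, and it models a single datacenter with the replication factor of 3 assumed above.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a quorum: a strict majority."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when every read is guaranteed to touch a replica that has the write.

    The read set and the write set must share at least one replica,
    which holds exactly when write_acks + read_acks > replication_factor.
    """
    return write_acks + read_acks > replication_factor


RF = 3
# Write at LOCAL_QUORUM (2 replicas), read at LOCAL_ONE (1 replica).
print(read_overlaps_write(RF, quorum(RF), 1))           # False
# Write and read both at LOCAL_QUORUM (2 replicas each).
print(read_overlaps_write(RF, quorum(RF), quorum(RF)))  # True
```

With 2 write acknowledgements and 1 read replica, 2 + 1 is not greater than 3, so the read can land on the replica that does not have the row yet.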
To get strong consistency in Cassandra, the replicas you read from must overlap with the replicas you wrote to; the usual way to guarantee that is to use LOCAL_QUORUM for both reads and writes.
Try using LOCAL_QUORUM for the SELECT as well, or set the default consistency level to LOCAL_QUORUM through the default execution profile: https://docs.datastax.com/en/developer/python-driver/3.24/getting_started/#execution-profiles
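A minimal sketch of the execution-profile approach, so that every statement without an explicit CL runs at LOCAL_QUORUM. The helper name `connect_with_quorum_default`, the contact point and the keyspace name are assumptions for illustration, not values from the question:

```python
def connect_with_quorum_default(contact_points, keyspace):
    """Return a session whose statements default to LOCAL_QUORUM."""
    # Imported inside the function so the sketch can be loaded even where
    # the cassandra-driver package is not installed.
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
    from cassandra.query import dict_factory

    profile = ExecutionProfile(
        consistency_level=ConsistencyLevel.LOCAL_QUORUM,
        row_factory=dict_factory,  # rows come back as dicts, as in the question
    )
    cluster = Cluster(
        contact_points=contact_points,
        execution_profiles={EXEC_PROFILE_DEFAULT: profile},
    )
    return cluster.connect(keyspace)


# Hypothetical usage; the address and keyspace are placeholders:
# db_connection = connect_with_quorum_default(["127.0.0.1"], "chat")
```

The row factory is set on the profile here because, as far as I know, the driver treats legacy session attributes such as `session.row_factory` and execution profiles as mutually exclusive, so the `db_connection.row_factory = dict_factory` line from the question would need to go when a profile is used.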