pythoncassandracassandra-python-driver

Logging all queries with cassandra-python-driver


I'm trying to find a way to log all queries done on a Cassandra from a python code. Specifically logging as they're done executing using a BatchStatement

Are there any hooks or callbacks I can use to log this?


Solution

  • 2 options:

    1. Stick to session.add_request_init_listener

      From the source code:

      a) BoundStatement

      https://github.com/datastax/python-driver/blob/3.11.0/cassandra/query.py#L560

      The passed values are stored in raw_values, you can try to extract it

      b) BatchStatement

      https://github.com/datastax/python-driver/blob/3.11.0/cassandra/query.py#L676

      It stores all the statements and parameters used to construct this object in _statements_and_parameters. Seems it can be fetched although it’s not a public property

      c) Only this hook is called, I didn’t manage to find any other hooks https://github.com/datastax/python-driver/blob/master/cassandra/cluster.py#L2097

      But it has nothing to do with queries actual execution - it's just a way to inspect what kind of queries has been constructed and maybe add additional callbacks/errbacks

    2. Approach it from a different angle and use traces

      https://datastax.github.io/python-driver/faq.html#how-do-i-trace-a-request https://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.ResponseFuture.get_all_query_traces

      Request tracing can be turned on for any request by setting trace=True in Session.execute_async(). View the results by waiting on the future, then ResponseFuture.get_query_trace()

    Here's an example of BatchStatement tracing using option 2:

    bs = BatchStatement()                                                        
    bs.add_all(['insert into test.test(test_type, test_desc) values (%s, %s)',   
                'insert into test.test(test_type, test_desc) values (%s, %s)',   
                'delete from test.test where test_type=%s',
                'update test.test set test_desc=%s where test_type=%s'],
               [['hello1', 'hello1'],                                            
                ['hello2', 'hello2'],                                            
                ['hello2'],
                ['hello100', 'hello1']])     
    res = session.execute(bs, trace=True)                                        
    trace = res.get_query_trace()                                                
    for event in trace.events:                                                   
        if event.description.startswith('Parsing'):                              
            print event.description 
    

    It produces the following output:

    Parsing insert into test.test(test_type, test_desc) values ('hello1', 'hello1')
    Parsing insert into test.test(test_type, test_desc) values ('hello2', 'hello2')
    Parsing delete from test.test where test_type='hello2'
    Parsing update test.test set test_desc='hello100' where test_type='hello1'