php cassandra phpcassa

Row Iteration not working


My goal is to iterate over all rows in a specific ColumnFamily in a node.
Here is the PHP code (using my own wrapper around phpcassa):

$ring = $cass_db->describe_ring();

foreach ($ring as $ring_details)
{
    $start_token = $ring_details->start_token;
    $end_token   = $ring_details->end_token;

    if ($start_token != null && $end_token != null)
    {
        $i = 0;
        $batch_size = 10;

        $params = array(
            'token_start' => $start_token,
            'token_finish' => $end_token,
            'row_count'     => $batch_size,
            'buffer_size'   => 1000
        );

        while ($batch = $cass_db->get_range_by_token('myColumnFamily', $params))
        {
            var_dump('Batch# '.$i);

            foreach ($batch as $row)
            {
                $row_key     = $row[0];
                $row_values  = $row[1];
                var_dump($row_key);                 
            }

            $i++;

            //Just to stop infinite loop
            if ($i > 14)
            {
                die(); 
            }

        }
    }
}

Every batch returns the same 10 row keys.
How can I iterate over all existing rows in a large Cassandra DB?


Solution

  • I am not a PHP developer, so I may be misunderstanding something in your code. Also, you did not specify which Cassandra version you are using.

    Iterating over all rows is generally done by starting and ending with an empty token and redefining the start token on each iteration. In your code I can't see where you redefine token_start between iterations. If you don't redefine it, you query Cassandra for the same token range every time, so you will always get the same result set.

    Your code should do something like this:

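    start_token = '';
    end_token = '';
    page_size = 100;
    while (true) {
       rows = get_range_by_token('cf', start_token, end_token, page_size);
       if (rows.size() == 0) break;
       // here I should get page_size rows (unless this is the last iteration or the table has fewer than page_size rows)
       // ... process rows ...
       if (rows.size() < page_size) break;   // short page: nothing left to fetch
       // resume from the last row returned; if the start bound is inclusive,
       // the next page repeats this row first, so skip it there
       start_token = rows[rows.size() - 1].getKey();
    }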
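
To make the paging logic concrete, here is a minimal, self-contained PHP sketch. It does not talk to Cassandra or phpcassa: fetch_page() is a hypothetical stand-in for a range query, backed by an in-memory array, and it assumes a start-inclusive bound and key ordering (a real cluster orders rows by token). iterate_all_rows() is likewise just an illustrative name. The point is only the loop shape: move the start bound forward after every page and drop the row that the inclusive bound repeats.

```php
<?php
// Hypothetical stand-in for a range query (NOT a phpcassa call):
// returns up to $page_size rows with key >= $start_key, in key order.
// An empty $start_key means "start from the beginning".
function fetch_page(array $store, $start_key, $page_size)
{
    ksort($store, SORT_STRING);
    $page = array();
    foreach ($store as $key => $columns) {
        if ($start_key !== '' && strcmp((string) $key, $start_key) < 0) {
            continue; // before the requested start bound
        }
        $page[] = array((string) $key, $columns);
        if (count($page) === $page_size) {
            break;
        }
    }
    return $page;
}

// Visits every row exactly once and returns the keys in the order seen.
function iterate_all_rows(array $store, $page_size)
{
    if ($page_size < 2) {
        // With an inclusive start bound, a page of 1 would only ever
        // return the row we already have, so the loop could not advance.
        throw new InvalidArgumentException('page_size must be at least 2');
    }
    $seen = array();
    $start_key = '';
    $first_page = true;
    while (true) {
        $page = fetch_page($store, $start_key, $page_size);
        $full_page = (count($page) === $page_size);
        if (!$first_page) {
            array_shift($page); // repeat of the previous page's last row
        }
        foreach ($page as $row) {
            $seen[] = $row[0];
            $start_key = $row[0]; // redefine the start bound for the next query
        }
        if (!$full_page) {
            break; // a short page means we reached the end
        }
        $first_page = false;
    }
    return $seen;
}

// Usage: 25 rows read in pages of 10.
$store = array();
for ($i = 1; $i <= 25; $i++) {
    $store[sprintf('row%02d', $i)] = array('col' => $i);
}
$keys = iterate_all_rows($store, 10);
echo count($keys), " rows visited\n";
```

In the question's loop the equivalent change would be to update $params['token_start'] from the last row of each batch before the next call, assuming the wrapper passes that value through to the range query unchanged.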
    

    HTH, Carlo