I am trying to follow a tutorial on big data that reads data from a keyspace defined with cqlsh.
I have run this piece of code successfully:
require 'rubygems'
require 'cassandra'
db = Cassandra.new('big_data', '127.0.0.1:9160')
# get a specific user's tags
row = db.get(:user_tags,"paul")
###
def tag_counts_from_row(row)
  tags = {}
  row.each_pair do |pair|
    column, tag_count = pair
    #tag_name = column.parts.first
    tag_name = column
    tags[tag_name] = tag_count
  end
  tags
end
###
# insert a new user
db.add(:user_tags, "todd", 3, "postgres")
db.add(:user_tags, "lili", 4, "win")
tags = tag_counts_from_row(row)
puts "paul - #{tags.inspect}"
But when I add this part to output everyone's tags, I get an error:
user_ids = []
db.get_range(:user_tags, :batch_size => 10000) do |id|
  # user_ids << id
end
rows_with_ids = db.multi_get(:user_tags, user_ids)
rows_with_ids.each do |row_with_id|
  name, row = row_with_id
  tags = tag_counts_from_row(row)
  puts "#{name} - #{tags.inspect}"
end
The error is:
line 33: warning: multiple values for a block parameter (2 for 1)
I think the error may have come from incompatible versions of Cassandra and Ruby. How can I fix it?
It's a little hard to tell which line is 33, but it looks like the problem is that get_range yields two values (the row key and its columns), while your block only takes one parameter. If you only care about the row keys and not the columns, then you should use get_range_keys instead.
It looks like you do in fact care about the column values, though, because you fetch them again using db.multi_get. That is an unnecessary additional query. You can update your code to something like:
db.get_range(:user_tags, :batch_size => 10000) do |id, columns|
  tags = tag_counts_from_row(columns)
  puts "#{id} - #{tags.inspect}"
end
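If you want to check the block shape without a running Cassandra node, here is a minimal sketch. FakeStore is a hypothetical in-memory stand-in made up for illustration, not part of the cassandra gem; it only assumes that get_range yields a row key and a hash of columns, which is the behaviour the warning points to. The sample rows are invented.

```ruby
# Hypothetical in-memory stand-in for a column family (NOT the cassandra gem).
class FakeStore
  def initialize(rows)
    @rows = rows
  end

  # Assumption: yields two values per row, the key and a hash of its columns.
  def get_range
    @rows.each { |key, columns| yield key, columns }
  end
end

# Same idea as the helper in the question, with the block taking both values.
def tag_counts_from_row(row)
  tags = {}
  row.each_pair { |column, tag_count| tags[column] = tag_count }
  tags
end

# Invented sample data for illustration only.
store = FakeStore.new(
  "paul" => { "ruby" => 2 },
  "todd" => { "postgres" => 3 }
)

results = {}
# The block declares two parameters, matching the two yielded values.
store.get_range do |id, columns|
  results[id] = tag_counts_from_row(columns)
  puts "#{id} - #{results[id].inspect}"
end
```

Because the block declares both id and columns, nothing is dropped or packed into an array, and there is no need for a second lookup per row.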