
Brisk Cassandra TimeUUIDType


I am using Brisk, which automatically maps Cassandra column families to Hive tables.
However, if a column family uses the TimeUUIDType data type, those values are unreadable in the Hive table.

For example, I used the following command to create an external table in Hive that maps to the column family:

Hive > create external table A (rowkey string, column_name string, value string) 
     > STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
     > WITH SERDEPROPERTIES (
     > "cassandra.columns.mapping" = ":key,:column,:value");  

If the column name is of type TimeUUIDType in Cassandra, it becomes unreadable in the Hive table.

For example, a row in the Cassandra column family looks like this:

RowKey: 2d36a254bb04272b120aaf79d70a3578  
        => (column=29139210-b6dc-11df-8c64-f315e3a329d6, value={"event_id":101},timestamp=1283464254261)

Here the column name is a TimeUUIDType value.

In the Hive table, the same row looks like this:

 2d36a254bb04272b120aaf79d70a3578    t��ߒ4��!��   {"event_id":101}

So the column name is unreadable in the Hive table.
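
A likely explanation is that a TimeUUIDType value is stored as 16 raw bytes, and the table above declares column_name as a string, so those bytes end up displayed as text. The sketch below is only an illustration in Python using the standard uuid module. It is not part of Brisk or Hive, and the exact garbage characters depend on the terminal and encoding, so they will not necessarily match the Hive output shown above.

```python
import uuid

# Column name from the example row, in the readable form Cassandra displays.
column_name = uuid.UUID("29139210-b6dc-11df-8c64-f315e3a329d6")

# A UUID is 16 raw bytes, not its 36-character text representation.
raw = column_name.bytes

# Treating those bytes as text (roughly what a string column does) yields
# mostly unprintable or replacement characters.
as_text = raw.decode("utf-8", errors="replace")

# Interpreting the same 16 bytes as a UUID recovers the readable form.
recovered = uuid.UUID(bytes=raw)

print(len(raw))           # number of raw bytes
print(repr(as_text))      # garbled text
print(recovered)          # readable UUID string
print(recovered.version)  # UUID version (time-based UUIDs are version 1)
```

Any manual Hive mapping therefore needs to treat the column name as binary UUID data rather than as a plain string for it to be readable.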


Solution

  • This is a known issue with the automatic table mapping. For best results with TimeUUIDType, turn off the auto-mapping feature, which is controlled by the "cassandra.autoCreateHiveSchema" property in $brisk_home/resources/hive/hive-site.xml, and create the table in Hive manually.