cassandra geomesa

GeoMesa: is there a way to create or delete indexes without losing data, or do I need to recreate the schema?


I have a Cassandra instance with GeoMesa, and it contains the following schema:

 ~  bin/geomesa-cassandra_2.11-3.3.0/bin/geomesa-cassandra describe-schema -P localhost:9042 -u cassandra -p cassandra -k geomesa -c gsm_events -f SignalBuilder 
INFO  Describing attributes of feature 'SignalBuilder'
geo           | Point   (Spatio-temporally indexed)
time          | Date    (Spatio-temporally indexed) (Attribute indexed)
cam           | String  (Attribute indexed) (Attribute indexed)
imei          | String  (Attribute indexed)
dir           | Double  
alt           | Double  
vlc           | Double  
sl            | Integer 
ds            | Integer 
dir_y         | Double  
poi_azimuth_x | Double  
poi_azimuth_y | Double  

User data:
  geomesa.attr.splits     | 4
  geomesa.feature.expiry  | time(30 days)
  geomesa.index.dtg       | time
  geomesa.indices         | z3:7:3:geo:time,attr:8:3:time,attr:8:3:cam,attr:8:3:cam:time,attr:8:3:imei
  geomesa.stats.enable    | true
  geomesa.table.partition | time
  geomesa.z.splits        | 4
  geomesa.z3.interval     | week

Is there a way to create a z2 index in addition to z3, and to delete the cam attribute index so that only the cam:time attribute index remains, without losing data in the database? Also, is the time attribute index unnecessary if I already have the cam:time index?

P.S. Why does this query use the z2 index rather than z3?

~  bin/geomesa-cassandra_2.11-3.3.0/bin/geomesa-cassandra explain -P 10.200.217.24:9042 -u cassandra -p cassandra -k geomesa -c gsm_events -f SignalBuilder -q "Bbox(geo,1,1,2,2) and time > 123333"; 
Planning 'SignalBuilder' BBOX(geo, 1.0,1.0,2.0,2.0) AND time > 1970-01-01T00:02:03.333+00:00
  Original filter: BBOX(geo, 1.0,1.0,2.0,2.0) AND time > 123333
  Hints: bin[false] arrow[false] density[false] stats[false] sampling[none]
  Sort: none
  Transforms: none
  Max features: none
  Strategy selection:
    Query processing took 21ms for 1 options
    Filter plan: FilterPlan[Z2Index(geo)[BBOX(geo, 1.0,1.0,2.0,2.0)][time > 1970-01-01T00:02:03.333+00:00](1.2)]
    Strategy selection took 2ms for 1 options
  Strategy 1 of 1: Z2Index(geo)
    Strategy filter: Z2Index(geo)[BBOX(geo, 1.0,1.0,2.0,2.0)][time > 1970-01-01T00:02:03.333+00:00](1.2)
    Geometries: FilterValues(List(POLYGON ((1 1, 2 1, 2 2, 1 2, 1 1))),true,false)
    Plan: org.locationtech.geomesa.cassandra.data.StatementPlan
      Tables: 
      Ranges (0): 
      Client-side filter: BBOX(geo, 1.0,1.0,2.0,2.0) AND time > 1970-01-01T00:02:03.333+00:00
      Reduce: class:LocalTransformReducer, state:{name=SignalBuilder, tnam=, tsft=, tdef=, hint=RETURN_SFT,"SignalBuilder,""*geo:Point,time:Date,cam:String,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double;geomesa.stats.enable='true',geomesa.z.splits='4',geomesa.feature.expiry='time(30 days)',geomesa.table.partition='time',geomesa.index.dtg='time',geomesa.indices='z3:7:3:geo:time,z2:5:3:geo,attr:8:3:time,attr:8:3:cam,attr:8:3:cam:time',geomesa.attr.splits='4',geomesa.z3.interval='week'""", spec=*geo:Point,time:Date,cam:String,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double;geomesa.stats.enable='true',geomesa.z.splits='4',geomesa.feature.expiry='time(30 days)',geomesa.table.partition='time',geomesa.index.dtg='time',geomesa.indices='z3:7:3:geo:time,z2:5:3:geo,attr:8:3:time,attr:8:3:cam,attr:8:3:cam:time',geomesa.attr.splits='4',geomesa.z3.interval='week', filt=BBOX(geo, 1.0,1.0,2.0,2.0) AND time > 1970-01-01T00:02:03.333+00:00}
    Plan creation took 110ms
  Query planning took 294ms

Solution

  • This involves several steps. Back up your data before attempting it, in case something goes wrong. First, use updateSchema to add the new index, remove the old one, and set the existing indices to "read only" mode. You can use the GeoMesa CLI scala-console command to run the following:

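    import org.locationtech.geomesa.utils.geotools.SimpleFeatureTypes

    // assumes `ds` is an already-opened GeoMesa data store for this catalog
    val sft = SimpleFeatureTypes.mutable(ds.getSchema("SignalBuilder"))
    // each index entry is name:version:mode:attributes
    // 5 is the latest version of the z2 index as of now
    // 1 sets the existing indices to read-only mode
    sft.getUserData.put("geomesa.indices", "z3:7:1:geo:time,z2:5:3:geo,attr:8:1:time,attr:8:1:cam:time,attr:8:1:imei")
    ds.updateSchema(sft.getTypeName, sft)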
    

    After this, you will need to re-ingest your data to populate the z2 index. Once the index is populated, you'll need to update the schema again to set the indices back to read/write mode:

    val sft = SimpleFeatureTypes.mutable(ds.getSchema("SignalBuilder"))
    // 5 is the latest version of the z2 index as of now
    // 3 sets the indices to read/write mode
    sft.getUserData.put("geomesa.indices", "z3:7:3:geo:time,z2:5:3:geo,attr:8:3:time,attr:8:3:cam:time,attr:8:3:imei")
    ds.updateSchema(sft.getTypeName, sft)
    

    Note that the old cam index table will still be around in Cassandra, but won't receive any further updates or be used for queries. You can drop it using standard Cassandra techniques.
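
Both geomesa.indices strings above are the value from describe-schema with three changes: the single-attribute cam entry is removed, a z2 entry is added, and the mode field is switched. As a rough sketch of that edit, assuming each entry has the form name:version:mode:attributes (inferred from the strings in this answer), the values could be derived rather than typed by hand. IndexSpec and Entry are hypothetical helper names and not part of GeoMesa; the snippet is plain Scala meant to be pasted into the scala-console.

```scala
// Hypothetical helper (not a GeoMesa API): edits a geomesa.indices value
// whose entries look like name:version:mode:attribute[:attribute...]
object IndexSpec {
  final case class Entry(name: String, version: Int, mode: Int, attributes: List[String]) {
    // Serialise back to the colon-separated form
    def encode: String =
      (List(name, version.toString, mode.toString) ++ attributes).mkString(":")
  }

  // Split a comma-separated spec into its entries
  def parse(spec: String): List[Entry] =
    spec.split(",").toList.map { raw =>
      val parts = raw.trim.split(":").toList
      Entry(parts(0), parts(1).toInt, parts(2).toInt, parts.drop(3))
    }

  def encode(entries: List[Entry]): String = entries.map(_.encode).mkString(",")
}

// Value reported by describe-schema in the question
val current = "z3:7:3:geo:time,attr:8:3:time,attr:8:3:cam,attr:8:3:cam:time,attr:8:3:imei"

// Drop the single-attribute cam index; the cam:time index is kept
val kept = IndexSpec.parse(current)
  .filterNot(e => e.name == "attr" && e.attributes == List("cam"))

// New z2 entry (version 5, mode 3), placed after z3 to match the answer's string
val z2 = IndexSpec.Entry("z2", 5, 3, List("geo"))
val entries = kept.take(1) ++ List(z2) ++ kept.drop(1)

// Step 1: existing indices become read-only (mode 1), z2 is left as it is
val step1 = IndexSpec.encode(entries.map(e => if (e.name == "z2") e else e.copy(mode = 1)))

// Step 2: every index goes back to read/write (mode 3)
val step2 = IndexSpec.encode(entries.map(_.copy(mode = 3)))
```

Here step1 and step2 reproduce the two strings passed to sft.getUserData.put above, so they can be substituted for the literals. Printing them before calling updateSchema is a cheap way to check the result against the describe-schema output.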