I have cassandra with geomesa, in there I have next schema
~ bin/geomesa-cassandra_2.11-3.3.0/bin/geomesa-cassandra describe-schema -P localhost:9042 -u cassandra -p cassandra -k geomesa -c gsm_events -f SignalBuilder
INFO Describing attributes of feature 'SignalBuilder'
geo | Point (Spatio-temporally indexed)
time | Date (Spatio-temporally indexed) (Attribute indexed)
cam | String (Attribute indexed) (Attribute indexed)
imei | String (Attribute indexed)
dir | Double
alt | Double
vlc | Double
sl | Integer
ds | Integer
dir_y | Double
poi_azimuth_x | Double
poi_azimuth_y | Double
User data:
geomesa.attr.splits | 4
geomesa.feature.expiry | time(30 days)
geomesa.index.dtg | time
geomesa.indices | z3:7:3:geo:time,attr:8:3:time,attr:8:3:cam,attr:8:3:cam:time,attr:8:3:imei
geomesa.stats.enable | true
geomesa.table.partition | time
geomesa.z.splits | 4
geomesa.z3.interval | week
when I do the next request, it is really fast:
~ date
Tues 11 Jan 2022 18:24:24 MSK
~ bin/geomesa-cassandra_2.11-3.3.0/bin/geomesa-cassandra export -P 10.200.217.24:9042 -u cassandra -p cassandra -k geomesa -c gsm_events -f SignalBuilder -q "cam='5798a065-d51e-47a1-b04b-ab48df9f1324'" -m 10
INFO Running export - please wait...
id,*geo:Point,time:Date,cam:String,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double
5798a065-d51e-47a1-b04b-ab48df9f1324-1638361381456,POINT (37.715528122 55.713859037),2021-12-01T12:23:01.456Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,0.0,60.0,5,1,0.0,0.0,0.0
5798a065-d51e-47a1-b04b-ab48df9f1324-1640706820000,POINT (31.970000000000002 56.88),2021-12-28T15:53:40.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,0.0,1.23,1,1,0.0,0.0,0.0
5798a065-d51e-47a1-b04b-ab48df9f1324-1640797130000,POINT (31.39951572344312 68.98547135697989),2021-12-29T16:58:50.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,,1.23,5,1,0.0,,
5798a065-d51e-47a1-b04b-ab48df9f1324-1640797434000,POINT (31.399484359174057 69.07515114832239),2021-12-29T17:03:54.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,,1.23,5,1,0.0,,
5798a065-d51e-47a1-b04b-ab48df9f1324-1640797627000,POINT (31.39948363272515 69.09555283117254),2021-12-29T17:07:07.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,,1.23,5,1,0.0,,
5798a065-d51e-47a1-b04b-ab48df9f1324-1640797784000,POINT (31.39948340746396 69.10451379934774),2021-12-29T17:09:44.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,,1.23,5,1,0.0,,
5798a065-d51e-47a1-b04b-ab48df9f1324-1640797993000,POINT (31.39948291171159 69.11711406549725),2021-12-29T17:13:13.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,,1.23,5,1,0.0,,
5798a065-d51e-47a1-b04b-ab48df9f1324-1640798006000,POINT (31.399482907612764 69.11813926100234),2021-12-29T17:13:26.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,,1.23,5,1,0.0,,
5798a065-d51e-47a1-b04b-ab48df9f1324-1640798020000,POINT (31.39948286587965 69.12141054010591),2021-12-29T17:13:40.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,,1.23,5,1,0.0,,
5798a065-d51e-47a1-b04b-ab48df9f1324-1640798414000,POINT (31.399453814973754 69.20771971657824),2021-12-29T17:20:14.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,,1.23,5,1,0.0,,
~ date
Tues 11 Jan 2022 18:24:31 MSK
But when I make next request by radius DWITHIN(geo, POINT (80.57944937597278 69.63928679990498), 130.0, meters) It takes more than 4 minutes
✘ ~ date
Tues 11 Jan 2022 17:42:10 MSK
~ bin/geomesa-cassandra_2.11-3.3.0/bin/geomesa-cassandra export -P 10.200.217.24:9042 -u cassandra -p cassandra -k geomesa -c gsm_events -f SignalBuilder -q "DWITHIN(geo, POINT (80.57944937597278 69.63928679990498), 130.0, meters)"
INFO Running export - please wait...
id,*geo:Point,time:Date,cam:String,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double
5798a065-d51e-47a1-b04b-ab48df9f1324-1641900233000,POINT (80.57844937597278 69.63828679990498),2022-01-11T11:23:53.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,,1.23,5,1,0.0,,
~ date
Tues 11 Jan 2022 17:46:49 MSK
Explain shows it uses Z3Index:
✘ ~ bin/geomesa-cassandra_2.11-3.3.0/bin/geomesa-cassandra explain -P 10.200.217.24:9042 -u cassandra -p cassandra -k geomesa -c gsm_events -f SignalBuilder -q "DWITHIN(geo, POINT (80.57944937597278 69.63928679990498), 130.0, meters)"
Planning 'SignalBuilder' DWITHIN(geo, POINT (80.57944937597279 69.63928679990498), 130.0, meters)
Original filter: DWITHIN(geo, POINT (80.57944937597279 69.63928679990498), 130.0, meters)
Hints: bin[false] arrow[false] density[false] stats[false] sampling[none]
Sort: none
Transforms: none
Max features: none
Strategy selection:
Query processing took 19ms for 1 options
Filter plan: FilterPlan[Z3Index(geo,time)[INCLUDE][DWITHIN(geo, POINT (80.57944937597279 69.63928679990498), 130.0, meters)](Infinity)]
Strategy selection took 3ms for 1 options
Strategy 1 of 1: Z3Index(geo,time)
Strategy filter: Z3Index(geo,time)[INCLUDE][DWITHIN(geo, POINT (80.57944937597279 69.63928679990498), 130.0, meters)](Infinity)
Plan: org.locationtech.geomesa.cassandra.data.StatementPlan
Tables: gsm_events_SignalBuilder_z3_geo_time_v7_02576, gsm_events_SignalBuilder_z3_geo_time_v7_02579, gsm_events_SignalBuilder_z3_geo_time_v7_02708, gsm_events_SignalBuilder_z3_geo_time_v7_02711, gsm_events_SignalBuilder_z3_geo_time_v7_02712, gsm_events_SignalBuilder_z3_geo_time_v7_02713, gsm_events_SignalBuilder_z3_geo_time_v7_02714, gsm_events_SignalBuilder_z3_geo_time_v7_02715
Ranges (8): SELECT * FROM geomesa.gsm_events_SignalBuilder_z3_geo_time_v7_02576;, SELECT * FROM geomesa.gsm_events_SignalBuilder_z3_geo_time_v7_02579;, SELECT * FROM geomesa.gsm_events_SignalBuilder_z3_geo_time_v7_02708;, SELECT * FROM geomesa.gsm_events_SignalBuilder_z3_geo_time_v7_02711;, SELECT * FROM geomesa.gsm_events_SignalBuilder_z3_geo_time_v7_02712;
Client-side filter: DWITHIN(geo, POINT (80.57944937597279 69.63928679990498), 130.0, meters)
Reduce: class:LocalTransformReducer, state:{name=SignalBuilder, tnam=, tsft=, tdef=, hint=RETURN_SFT,"SignalBuilder,""*geo:Point,time:Date,cam:String,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double;geomesa.stats.enable='true',geomesa.z.splits='4',geomesa.feature.expiry='time(30 days)',geomesa.table.partition='time',geomesa.index.dtg='time',geomesa.indices='z3:7:3:geo:time,attr:8:3:time,attr:8:3:cam,attr:8:3:cam:time',geomesa.attr.splits='4',geomesa.z3.interval='week'""", spec=*geo:Point,time:Date,cam:String,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double;geomesa.stats.enable='true',geomesa.z.splits='4',geomesa.feature.expiry='time(30 days)',geomesa.table.partition='time',geomesa.index.dtg='time',geomesa.indices='z3:7:3:geo:time,attr:8:3:time,attr:8:3:cam,attr:8:3:cam:time',geomesa.attr.splits='4',geomesa.z3.interval='week', filt=DWITHIN(geo, POINT (80.57944937597279 69.63928679990498), 130.0, meters)}
Plan creation took 87ms
Query planning took 263ms
Why this last query is so long? And is there any way to make faster the request?
P.s. All db has 4_754_928 records
P.s.s Found that any query by geo figure takes about five minutes.
BBox has the same behavior, and takes something like 5 minutes, on cassandra cluster
~ bin/geomesa-cassandra_2.11-3.3.0/bin/geomesa-cassandra export -P 10.200.217.24:9042 -u cassandra -p cassandra -k geomesa -c gsm_events -f SignalBuilder -q "BBOX(geo, 79.01240030621715, 68.63828679990498, 81.01240030621715, 70.63828679990498)"
INFO Running export - please wait...
id,*geo:Point,time:Date,cam:String,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double
5798a065-d51e-47a1-b04b-ab48df9f1324-1641895055000,POINT (80.01240030621715 69.63828679990498),2022-01-11T09:57:35.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,,1.23,5,1,0.0,,
5798a065-d51e-47a1-b04b-ab48df9f1324-1641895317000,POINT (80.13819338413361 69.63828679990498),2022-01-11T10:01:57.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,,1.23,5,1,0.0,,
5798a065-d51e-47a1-b04b-ab48df9f1324-1641900233000,POINT (80.57844937597278 69.63828679990498),2022-01-11T11:23:53.000Z,5798a065-d51e-47a1-b04b-ab48df9f1324,1234-2222-3333-4444-55555,10.8,,1.23,5,1,0.0,,
P.S.S.S For geom like BBOX I needed to have z2 index.
Why when I have z3 (POINT(lat,lon),time) index this query is long too
bin/geomesa-cassandra_2.11-3.3.0/bin/geomesa-cassandra export -P 10.200.217.24:9042 -u cassandra -p cassandra -k geomesa -c gsm_events -f SignalBuilder -q "BBOX(geo, 79.0, 30.0, 81.0, 40.0) AND time AFTER 2021-09-23T13:05:00.0Z"
^C% ✘ ~ bin/geomesa-cassandra_2.11-3.3.0/bin/geomesa-cassandra explain -P 10.200.217.24:9042 -u cassandra -p cassandra -k geomesa -c gsm_events -f SignalBuilder -q "BBOX(geo, 79.0, 30.0, 81.0, 40.0) AND time AFTER 2021-09-23T13:05:00.0Z"
Planning 'SignalBuilder' BBOX(geo, 79.0,30.0,81.0,40.0) AND time AFTER 2021-09-23T13:05:00+00:00
Original filter: BBOX(geo, 79.0,30.0,81.0,40.0) AND time AFTER 2021-09-23T13:05:00+00:00
Hints: bin[false] arrow[false] density[false] stats[false] sampling[none]
Sort: none
Transforms: none
Max features: none
Strategy selection:
Query processing took 35ms for 2 options
Filter plan selected: FilterPlan[AttributeIndex(time)[time AFTER 2021-09-23T13:05:00+00:00][BBOX(geo, 79.0,30.0,81.0,40.0)](2.5)] (Cost 250 in 0ms)
Filter plans not selected: FilterPlan[Z3Index(geo,time)[BBOX(geo, 79.0,30.0,81.0,40.0) AND time AFTER 2021-09-23T13:05:00+00:00][None](3.0)] (Cost 300 in 0ms)
Strategy selection took 12ms for 2 options
Strategy 1 of 1: AttributeIndex(time)
Strategy filter: AttributeIndex(time)[time AFTER 2021-09-23T13:05:00+00:00][BBOX(geo, 79.0,30.0,81.0,40.0)](2.5)
Plan: org.locationtech.geomesa.cassandra.data.StatementPlan
Tables: gsm_events_SignalBuilder_attr_time_v8_02576, gsm_events_SignalBuilder_attr_time_v8_02579, gsm_events_SignalBuilder_attr_time_v8_02708, gsm_events_SignalBuilder_attr_time_v8_02711, gsm_events_SignalBuilder_attr_time_v8_02712, gsm_events_SignalBuilder_attr_time_v8_02713, gsm_events_SignalBuilder_attr_time_v8_02714, gsm_events_SignalBuilder_attr_time_v8_02715
Ranges (32): SELECT * FROM geomesa.gsm_events_SignalBuilder_attr_time_v8_02576 WHERE shard=0 AND attrVal>'8000017c12c25c60';, SELECT * FROM geomesa.gsm_events_SignalBuilder_attr_time_v8_02576 WHERE shard=1 AND attrVal>'8000017c12c25c60';, SELECT * FROM geomesa.gsm_events_SignalBuilder_attr_time_v8_02576 WHERE shard=2 AND attrVal>'8000017c12c25c60';, SELECT * FROM geomesa.gsm_events_SignalBuilder_attr_time_v8_02576 WHERE shard=3 AND attrVal>'8000017c12c25c60';, SELECT * FROM geomesa.gsm_events_SignalBuilder_attr_time_v8_02579 WHERE shard=0 AND attrVal>'8000017c12c25c60';
Client-side filter: BBOX(geo, 79.0,30.0,81.0,40.0)
Reduce: class:LocalTransformReducer, state:{name=SignalBuilder, tnam=, tsft=, tdef=, hint=RETURN_SFT,"SignalBuilder,""*geo:Point,time:Date,cam:String,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double;geomesa.stats.enable='true',geomesa.z.splits='4',geomesa.feature.expiry='time(30 days)',geomesa.table.partition='time',geomesa.index.dtg='time',geomesa.indices='z3:7:3:geo:time,attr:8:3:time,attr:8:3:cam,attr:8:3:cam:time',geomesa.attr.splits='4',geomesa.z3.interval='week'""", spec=*geo:Point,time:Date,cam:String,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double;geomesa.stats.enable='true',geomesa.z.splits='4',geomesa.feature.expiry='time(30 days)',geomesa.table.partition='time',geomesa.index.dtg='time',geomesa.indices='z3:7:3:geo:time,attr:8:3:time,attr:8:3:cam,attr:8:3:cam:time',geomesa.attr.splits='4',geomesa.z3.interval='week', filt=BBOX(geo, 79.0,30.0,81.0,40.0)}
Plan creation took 9061ms
Query planning took 9324ms
It is return nothing, but takes about 5 minutes
~ bin/geomesa-cassandra_2.11-3.3.0/bin/geomesa-cassandra export -P 10.200.217.24:9042 -u cassandra -p cassandra -k geomesa -c gsm_events -f SignalBuilder -q "BBOX(geo, 79.0, 30.0, 81.0, 40.0) AND time AFTER 2021-09-23T13:05:00.0Z";
INFO Running export - please wait...
id,*geo:Point,time:Date,cam:String,imei:String,dir:Double,alt:Double,vlc:Double,sl:Integer,ds:Integer,dir_y:Double,poi_azimuth_x:Double,poi_azimuth_y:Double
Thank you!
The query is slow because it's doing a full scan of all your data. If you look at the explain output, you'll see:
Ranges (8): SELECT * FROM geomesa.gsm_events_SignalBuilder_z3_geo_time_v7_02576;...
Client-side filter: DWITHIN(geo, POINT (80.57944937597279 69.63928679990498), 130.0, meters)
If you want to query just by spatial predicates, you should re-enable the z2
index, which is designed for this use case. See Index Overview and Customizing Index Creation in the GeoMesa documentation for details.