Assume the following data set of locations and the sensors that have activated there (ignoring time windows for now, for simplicity):
Location1 Sensor1
Location1 Sensor1
Location1 Sensor2
Location2 Sensor1
Location2 Sensor1
I want to create an Esper query that reports that Location1 has had multiple sensors activated (Sensor1 and Sensor2).
Whilst Location2 has had multiple activations, they were all from the same sensor (Sensor1), so this should not trigger.
In a nutshell I want to say "Tell me when more than one sensor activates at a location".
In pure SQL terms I can achieve this through the following query:
SELECT location from
(SELECT DISTINCT location, sensor FROM sensorlog) a
GROUP BY location
HAVING count(location) > 1
I can't translate this directly into Esper, though, as it doesn't seem to accept the subquery.
Can anyone suggest how I can achieve this in Esper?
My initial attempt was as follows, but this seems to trigger even if a single sensor activates at a single location:
select location from SensorEntityWithAsset#time(30 sec)
group by location, sensor
having count(sensor) > 1 output every 30 seconds
Many thanks.
The difference between Esper and a relational database is that Esper evaluates the query continuously, so it outputs the result as soon as the third row, "Location1 Sensor2", shows up. Esper doesn't allow subqueries in the from-clause, because the from-clause lists streams. The subquery can instead be put into the where-clause, like so (remove the % notebook tags):
%esperepl
create schema Event(location string, sensor string);
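select * from Event as incoming
where exists (
  select * from Event#firstunique(location) as existingEvents
  // compare only against the first event seen at the same location
  where existingEvents.location = incoming.location
    and existingEvents.sensor != incoming.sensor
)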
Sample events to test this:
%esperscenario
Event={location='1', sensor='A'}
Event={location='1', sensor='A'}
Event={location='1', sensor='B'}
If events should also expire, use Event#firstunique(location)#time(1 hour) instead.
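For comparison, here is a rough sketch of how the SQL from the question might be expressed without a subquery, by counting distinct sensors per location. I have not run this; it assumes the Event schema from above and that count(distinct ...) behaves as described in the Esper reference. The 30-second window is taken from the question's own attempt.

```
// Sketch only: one group per location, counting the distinct sensors seen in the window
select location, count(distinct sensor) as sensorCount
from Event#time(30 sec)
group by location
having count(distinct sensor) > 1
```

Note that a query like this would probably keep producing output for a location on every later event there, for as long as two or more distinct sensors remain in the window, so you may want an output clause to rate-limit it.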