I want to understand the "dos" and "don'ts" in Siddhi. I saw the DB connectors and the possibility of enriching stream events with data from a DB (let's say the Cassandra connector).
Example:
@primaryKey('id')
@store(type = 'rdbms', datasource = 'WSO2_TEST_DB')
define table BuyerInfoTable (id string, name string, address string, email string);
@info(name = 'EnrichBuyerInformation')
from ShipmentInfoStream as s join BuyerInfoTable as b
on s.buyerId == b.id
select s.orderId, b.name, b.address, b.email, s.shipmentType
insert into ShipmentAndBuyerInfoStream;
Do I understand correctly that this approach means a SELECT query is issued against the DB for each incoming event on the ShipmentInfoStream? If so, this sounds like a "don't do" to me, especially if we are talking about 100k events/sec.
Or am I misunderstanding the architecture?
Yes, you are correct. With the above query, each event arriving on ShipmentInfoStream triggers a DB query; the result is fetched and then processed further.
However, this operation can be improved in several ways.
If the DB table contains only a limited set of records (and is not changed by external users), you can pre-load them into an in-memory table and join against that instead of the store.
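A minimal sketch of the pre-loading approach (the loader stream name and the way the table is populated are illustrative; in practice you would feed BuyerInfoLoadStream once at startup, e.g. from a batch read of the DB):

```sql
-- In-memory table: no @store annotation, so Siddhi keeps it in memory
define table BuyerInfoCacheTable (id string, name string, address string, email string);

-- Populate the in-memory table from a one-off loader stream (illustrative)
from BuyerInfoLoadStream
select id, name, address, email
insert into BuyerInfoCacheTable;

-- The enrichment join now hits memory instead of the DB on every event
from ShipmentInfoStream as s join BuyerInfoCacheTable as b
    on s.buyerId == b.id
select s.orderId, b.name, b.address, b.email, s.shipmentType
insert into ShipmentAndBuyerInfoStream;
```

The trade-off is staleness: the in-memory copy does not see external updates to the DB, so this only suits reference data that changes rarely or is refreshed on a known schedule.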
Alternatively, you can enable a cache on the store itself to improve performance. Check the "Caching In Memory" section in https://siddhi.io/en/next/docs/query-guide/#store ...
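A sketch of store-level caching using the @cache annotation described in that section (the size, policy, and retention values here are illustrative; consult the linked guide for the exact parameters supported by your Siddhi version):

```sql
-- The @cache annotation keeps frequently used rows in memory,
-- so most lookups avoid a round trip to the DB (illustrative values)
@store(type = 'rdbms', datasource = 'WSO2_TEST_DB',
       @cache(size = '1000', cache.policy = 'LRU'))
@primaryKey('id')
define table BuyerInfoTable (id string, name string, address string, email string);
```

With this in place, the original join query stays unchanged; only cache misses reach the database, which keeps the DB load bounded even at high event rates.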