I have a potential performance/memory bottleneck when I calculate insurance premiums using the Drools engine.
I use Drools in my project to separate business logic from Java code, and I decided to use it for premium calculation too.
Details below:
I have to calculate the insurance premium for a given contract.
Contract is configured with
At the moment, premium is calculated using this formula:
premium := SI * px * (1 + py) / pz
where:
With R1, R2 and R3 implemented, the Java code is separated from the business logic, and any business analyst (BA) may modify the formula and add new dependencies without redeploys.
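For reference, the premium formula above can be written as plain Java. This is only a minimal sketch: the class name PremiumFormula, the sample factor values and the HALF_UP rounding mode are assumptions made for illustration, not part of the actual project.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper, used only to illustrate the formula.
class PremiumFormula {

    // premium := SI * px * (1 + py) / pz, rounded to 2 decimal places.
    // HALF_UP is an assumed rounding mode; the source does not name one.
    static BigDecimal premium(double si, double px, double py, double pz) {
        if (pz == 0.0) {
            throw new IllegalArgumentException("pz must be non-zero");
        }
        double raw = si * px * (1 + py) / pz;
        return BigDecimal.valueOf(raw).setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // Made-up factors: SI = 1000, px = 0.5, py = 0.25, pz = 2
        System.out.println(premium(1000.0, 0.5, 0.25, 2.0));
    }
}
```

In the real setup px, py and pz come from the BA's decision tables, so only the arithmetic is shown here.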
I have a contract domain model, which consists of the classes Contract, Product, Client, Policy and so on. The Contract class is defined as:
public class Contract {
String code; // contractCode
double sumInsured; // SI
String clientSex; // M, F
int professionCode; // code from dictionary
int policyYear; // 1..5
int clientAge;
... // etc.
}
In addition, I introduced a Var class, which is a container for any parameterized variable:
public class Var {
public final String name;
public final ContractPremiumRequest request;
private double value; // calculated value
private boolean ready; // true if value is calculated
public Var(String name, ContractPremiumRequest request) {
this.name = name;
this.request = request;
}
...
public void setReady(boolean ready) {
this.ready = ready;
request.check();
}
...
// getters, setters
}
and finally, the request class:
public class ContractPremiumRequest {
public static enum State {
INIT,
IN_PROGRESS,
READY
}
public final Contract contract;
private State state = State.INIT;
// all dependencies (parameterized factors, e.g. px, py, ...)
private Map<String, Var> varMap = new TreeMap<>();
// calculated response - premium value
private BigDecimal value;
public ContractPremiumRequest(Contract contract) {
this.contract = contract;
}
// true if *all* vars are ready
private boolean _isReady() {
for (Var var : varMap.values()) {
if (!var.isReady()) {
return false;
}
}
return true;
}
// check if should modify state
public void check() {
if (_isReady()) {
setState(State.READY);
}
}
// read number from var with given [name]
public double getVar(String name) {
return varMap.get(name).getValue();
}
// adding uncalculated factor to this request – makes request IN_PROGRESS
public Var addVar(String name) {
Var var = new Var(name, this);
varMap.put(name, var);
setState(State.IN_PROGRESS);
return var;
}
...
// getters, setters
}
Now I can use these classes with the following flow:
1. request = new ContractPremiumRequest(contract) creates a request with state == INIT
2. px = request.addVar( "px" ) creates Var("px") with ready == false and moves the request to state == IN_PROGRESS
3. py = request.addVar( "py" )
4. px.setValue( factor ), px.setReady( true ) sets the calculated value on px and makes ready == true
5. request.check() makes state == READY if ALL vars are ready

I have created 2 DRL rules and prepared 3 decision tables (px.xls, py.xls, ...) with factors provided by the BA.
Rule1 - contract_premium_prepare.drl:
rule "contract premium request - prepare dependencies"
when
$req : ContractPremiumRequest (state == ContractPremiumRequest.State.INIT)
then
insert( $req.addVar("px") );
insert( $req.addVar("py") );
insert( $req.addVar("pz") );
$req.setState(ContractPremiumRequest.State.IN_PROGRESS);
end
Rule2 - contract_premium_calculate.drl:
rule "contract premium request - calculate premium"
when
$req : ContractPremiumRequest (state == ContractPremiumRequest.State.READY)
then
double px = $req.getVar("px");
double py = $req.getVar("py");
double pz = $req.getVar("pz");
double si = $req.contract.getSumInsured();
// use formula to calculate premium
double premium = si * px * (1 + py) / pz;
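// round to 2 digits (assumes setValue accepts a BigDecimal, matching getValue(); HALF_UP is an assumed rounding mode)
$req.setValue(java.math.BigDecimal.valueOf(premium).setScale(2, java.math.RoundingMode.HALF_UP));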
end
Decision table px.xls:
Decision table py.xls:
The KieContainer is constructed once on startup:
DecisionTableConfiguration dtconf = KnowledgeBuilderFactory.newDecisionTableConfiguration();
dtconf.setInputType(DecisionTableInputType.XLS);
KieServices ks = KieServices.Factory.get();
KieContainer kc = ks.getKieClasspathContainer();
Now, to calculate the premium for a given contract, we write:
ContractPremiumRequest request = new ContractPremiumRequest(contract); // state == INIT
kc.newStatelessKieSession("session-rules").execute(request);
BigDecimal premium = request.getValue();
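This is what happens:
1. Rule1 fires for ContractPremiumRequest[INIT] and inserts the px, py and pz dependencies (Var objects)
2. the decision table rows matching each Var set its value and mark it ready
3. Rule2 fires for ContractPremiumRequest[READY] and uses the formula

The first calculation, which loads and initializes the decision tables, takes ~45 seconds – this might become problematic.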
Each calculation (after some warmup) takes ~0.8 ms – which is acceptable for our team.
Heap consumption is ~150 MB – which is problematic, as we expect much bigger tables to be used.
========== EDIT (after 2 years) ==========
This is a short summary after 2 years.
Our system has grown a lot, as we expected. We have ended up with more than 500 tables (or matrices) with insurance pricing, actuarial factors, coverage configs etc. Some tables are more than 1 million rows in size. We used Drools, but we couldn't solve the performance problems.
In the end we switched to the Higson engine (http://higson.io), formerly named Hyperon.
This system is a beast - it allows us to run hundreds of rule matches in approx. 10 ms total time.
We were even able to trigger full policy recalculation on every KeyType event on UI fields.
Higson uses fast in-memory indexes for each rule table, and these indexes are compacted, so they have almost no memory footprint.
We have one more benefit now - all pricing, factors, config tables can be modified on-line (both values and structure) and this is fully transparent to java code. Application just continues to work with new logic, no development or restart is needed.
I have found some comparison made by our team a year ago - it shows engine initialization (drools/hyperon) and 100k simple calculations from jvisualVM perspective:
The problem is that you have created a huge amount of code (all the rules resulting from the tables) for what is a relatively small amount of data. I have seen similar cases, and they all benefited from inserting the tables as data. PxRow, PyRow and PzRow should be defined like this:
class PxRow {
private String gender;
private int age;
private double px;
// Constructor (3 params) and getters
}
Data can still be in (simpler) spreadsheets or anything else you fancy for data entry by the BA boffins. You insert all rows as facts PxRow, PyRow, PzRow. Then you need one or two rules:
rule calculate
when
$c: Contract( $cs: clientSex, $ca: clientAge,
$pc: professionCode, $pyear: policyYear,...
...
$si: sumInsured )
PxRow( gender == $cs, age == $ca, $px: px )
PyRow( profCode == $pc, polYear == $pyear,... $py: py )
PzRow( ... $pz: pz )
then
double premium = $si * $px * (1 + $py) / $pz;
// round to 2 digits
modify( $c ){ setPremium( premium ) }
end
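Conceptually, the rule above joins the contract against lookup tables held as data. A rough plain-Java analogy, assuming a hash index keyed on gender and age, might look like the sketch below. The names PxTable and JoinSketch and the "gender|age" key format are hypothetical, and this is not a description of how Drools evaluates the rule internally. To keep it short, py and pz are passed in directly instead of being looked up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalDouble;

// Hypothetical in-memory table: one map entry per PxRow-like data row.
class PxTable {
    // Key format "gender|age" is an illustrative assumption.
    private final Map<String, Double> index = new HashMap<>();

    void add(String gender, int age, double px) {
        index.put(gender + "|" + age, px);
    }

    // Returns the px factor for the given gender and age, or empty if no row matches.
    OptionalDouble lookup(String gender, int age) {
        Double px = index.get(gender + "|" + age);
        return px == null ? OptionalDouble.empty() : OptionalDouble.of(px);
    }
}

class JoinSketch {
    // Mirrors the rule: find the matching row, then apply the formula.
    // An empty result corresponds to the case where no row matches the contract.
    static OptionalDouble premium(PxTable pxTable, String sex, int age,
                                  double si, double py, double pz) {
        OptionalDouble px = pxTable.lookup(sex, age);
        if (!px.isPresent()) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(si * px.getAsDouble() * (1 + py) / pz);
    }
}
```

The point of the analogy is that adding rows grows only the data, while the logic stays a single lookup plus one formula.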
Forget the flow and all the other decorations. But you may need another rule just in case your Contract doesn't match Px or Py or Pz:
rule "no match"
salience -100
when
$c: Contract( premium == null ) // or 0.00
then
// diagnostic
end
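As for getting the table rows in as data: a minimal sketch of loading PxRow objects from simple comma-separated lines could look like this. The file layout (gender,age,px), the '#' comment convention and the PxRowLoader name are assumptions for illustration only; PxRow is repeated here with its constructor and getters so the block is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Same shape as the PxRow above, with the constructor and getters spelled out.
class PxRow {
    private final String gender;
    private final int age;
    private final double px;

    PxRow(String gender, int age, double px) {
        this.gender = gender;
        this.age = age;
        this.px = px;
    }

    String getGender() { return gender; }
    int getAge() { return age; }
    double getPx() { return px; }
}

// Hypothetical loader: turns lines such as "M,30,0.5" into PxRow objects.
class PxRowLoader {
    static List<PxRow> parse(List<String> lines) {
        List<PxRow> rows = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            // Skip blank lines and lines starting with '#' (assumed comment marker)
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            String[] cols = trimmed.split(",");
            if (cols.length != 3) {
                throw new IllegalArgumentException("Expected 3 columns: " + line);
            }
            rows.add(new PxRow(cols[0].trim(),
                    Integer.parseInt(cols[1].trim()),
                    Double.parseDouble(cols[2].trim())));
        }
        return rows;
    }
}
```

Each resulting row could then be handed to the session as a fact together with the Contract, for example via KieSession.insert(row).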