javasqldatabasepostgresqlspring-jdbc

The most efficient way to check data existence in db with spring jdbc template


I'm working on a highload service in which I'm recieving batch messages from kafka (about 500 at once). Those messages represent entities in db. To simplify, lets imagine entity A like this:

create table A (
id varchar references B(id),
serial_num varchar,
is_actual boolean
)

Where a combination of id and serial num is unique.

So when the array of messages recieved from kafka - I'm parsing those messages to entities and then I need to check if such an entity already exists in table A by id and serial_num and if it is not, then I need to update all entities by id and set is_actual to false, otherwise do nothing. After updating - I need to insert new entity. So, for updating and inserting I am using Spring Jdbc Template butchUpdate() method. But how to check if the entities are exist in most efficient way? If I will check them one by one - then I'm losing the adventage of getting message batch from kafka.


Solution

  • So the part you want to optimize is the querying.

    Just construct a single SQL query with a giant WHERE clause where you test many many keys at the same time.

    In pseudo code:

    // some pojo / record that represents the key:
    record Key(String id, String serial)   
    
    List<Key> keys = ...
    StringBuilder sql = new StringBuilder("")   
       for(key in keys) { // don't need key just need iterate keys.size times
          if(!sql.isEmpty()) sql.append(" OR ")
          sql.append("(id=? and serial_num=?)")
       }
       sql.insert(0, "Select id, serial_num from A where ")
       Object[] args = flatten keys so args are keys[0].id, keys[0].serial, keys[1].id, keys[1].serial, ...
       List<Key> keysInDb = jdbcTemplate.query(sql.toString, RowMapper<T> rowMapper, args)
       List<key> missingKeys = keys.removeAll(keysInDb)
    

    And don't forget to put in index on the (id, serial_num) columns.