jdbctidbtikv

Why the TiDB performance drop for 10 times when the updated field value is random?


I set up the TiDB, TiKV and PD cluster in order to benchmark them with YCSB tool, connected by the MySQL driver. The cluster consists of 5 instances for each of TiDB, TiKV and PD. Each node run a single TiDB, TiKV and PD instance.

However, when I play around the YCSB code in the update statement, I notice that if the value of the updated field is fixed and hardcoded, the total throughput is ~30K tps and the latency at ~30ms. If the updated field value is random, the total throughput is ~2k tps and the latency is around ~300ms.

The update statement creation code is as follow:


  @Override
  public String createUpdateStatement(StatementType updateType) {
    String[] fieldKeys = updateType.getFieldString().split(",");
    StringBuilder update = new StringBuilder("UPDATE ");
    update.append(updateType.getTableName());
    update.append(" SET ");
    for (int i = 0; i < fieldKeys.length; i++) {
      update.append(fieldKeys[i]);
      String randStr = RandomCharStr();  // 1) 3K tps with 300ms latency
      //String randStr = "Hardcode-Field-Value";  // 2) 20K tps with 20ms latency
      update.append(" = '" + randStr + "'");
      if (i < fieldKeys.length - 1) {
        update.append(", ");
      }
    }
    // update.append(fieldKey);
    update.append(" WHERE ");
    update.append(JdbcDBClient.PRIMARY_KEY);
    update.append(" = ?");
    return update.toString();
  }

How do we account for this performance gap? Is it due to the DistSQL query cache, as discussed in this post?


Solution

  • I manage to figure this out from this post (Same transaction returns different results when i ran multiply times) and pr (https://github.com/pingcap/tidb/issues/7644). It is because TiDB will not perform the txn if the updated field is identical to the previous value.