I set up a TiDB, TiKV and PD cluster in order to benchmark it with the YCSB tool, connected through the MySQL driver. The cluster consists of 5 instances each of TiDB, TiKV and PD. Each node runs a single TiDB, TiKV and PD instance.
However, while experimenting with the update statement in the YCSB code, I noticed that if the value of the updated field is fixed and hardcoded, the total throughput is ~30K tps and the latency is ~30ms. If the updated field value is random, the total throughput drops to ~2K tps and the latency rises to ~300ms.
The code that creates the update statement is as follows:
    @Override
    public String createUpdateStatement(StatementType updateType) {
        String[] fieldKeys = updateType.getFieldString().split(",");
        StringBuilder update = new StringBuilder("UPDATE ");
        update.append(updateType.getTableName());
        update.append(" SET ");
        for (int i = 0; i < fieldKeys.length; i++) {
            update.append(fieldKeys[i]);
            String randStr = RandomCharStr(); // 1) 3K tps with 300ms latency
            //String randStr = "Hardcode-Field-Value"; // 2) 20K tps with 20ms latency
            update.append(" = '" + randStr + "'");
            if (i < fieldKeys.length - 1) {
                update.append(", ");
            }
        }
        // update.append(fieldKey);
        update.append(" WHERE ");
        update.append(JdbcDBClient.PRIMARY_KEY);
        update.append(" = ?");
        return update.toString();
    }
How do we account for this performance gap? Is it due to the DistSQL query cache, as discussed in this post?
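I managed to figure this out from this post (Same transaction returns different results when i ran multiply times) and this GitHub issue (https://github.com/pingcap/tidb/issues/7644). TiDB does not perform the transactional write if the updated field is identical to its previous value. With a hardcoded value, every update to a row after the first one is therefore effectively a no-op, which would explain the much higher throughput and lower latency in that case.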
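To make sure the benchmark measures real writes, one option is to keep the field values as bind parameters and supply a fresh random value on every execution, instead of baking a value into the SQL text. The following is only a minimal sketch of that idea: the class name UpdateStatementSketch, the helper randomValue, and the table, key and field names in main are illustrative assumptions, not code taken from YCSB.

```java
import java.util.Random;

public class UpdateStatementSketch {

    // Builds "UPDATE <table> SET f0 = ?, f1 = ? WHERE <pk> = ?".
    // Every field value is a placeholder, so the caller can bind a
    // different value each time the statement is executed.
    public static String createUpdateStatement(String tableName,
                                               String primaryKey,
                                               String[] fieldKeys) {
        StringBuilder update = new StringBuilder("UPDATE ");
        update.append(tableName).append(" SET ");
        for (int i = 0; i < fieldKeys.length; i++) {
            update.append(fieldKeys[i]).append(" = ?");
            if (i < fieldKeys.length - 1) {
                update.append(", ");
            }
        }
        update.append(" WHERE ").append(primaryKey).append(" = ?");
        return update.toString();
    }

    // Hypothetical helper: returns a random lowercase string of the given length.
    public static String randomValue(Random rng, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + rng.nextInt(26)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Table, key and field names here are illustrative assumptions.
        String[] fields = {"FIELD0", "FIELD1"};
        String sql = createUpdateStatement("usertable", "YCSB_KEY", fields);
        System.out.println(sql);

        // With a real java.sql.PreparedStatement, each of these values would be
        // bound with setString(index, value) before calling executeUpdate().
        Random rng = new Random();
        for (String field : fields) {
            System.out.println(field + " <- " + randomValue(rng, 20));
        }
    }
}
```

Because the values are bound per execution, repeated updates to the same row carry different data, so they should not be skipped as identical-value updates.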