mysql hash innodb partitioning

Why does MySQL InnoDB partitioning use modulo-based hashing instead of consistent hashing?


I’ve been digging into InnoDB partitioning in MySQL (both 5.7 and 8.4) and noticed that PARTITION BY HASH just uses good ol’ modulo arithmetic to split data (e.g., id % number_of_partitions). That got me wondering: why not use consistent hashing, which is pretty common in distributed systems to avoid reshuffling when scaling up or down?

I get that MySQL is usually a single-node DB, but still:

  1. Is there a technical reason for sticking with modulo (I couldn't find any explicit reason in the official documentation)? Maybe simplicity, performance, or something else?
  2. Are there alternatives in MySQL that act more like consistent hashing? (I checked the docs but didn’t spot anything obvious.)
  3. What happens if I change the partition count? Does InnoDB have to reshuffle everything, or is there some optimization?

Just curious if this was a deliberate design choice or just a "good enough" solution for most cases. Thanks!


Solution

  • Is there a technical reason for sticking with modulo (I couldn't find any explicit reason in the official documentation)? Maybe simplicity, performance, or something else?

    Because changing the way HASH partitioning works would break millions of databases that currently use HASH partitioning if they try to upgrade.

  • What happens if I change the partition count? Does InnoDB have to reshuffle everything, or is there some optimization?

    With plain modulus hashing, adding a partition changes the divisor, so most rows get a different result: going from n-1 to n partitions, only about 1/n of the rows keep the same partition number, and the rest would have to move. Even a scheme that moved fewer rows would leave gaps in each partition where rows were removed, and I think most users would like to defragment when that happens.

    The way MySQL implements ALTER TABLE here, it just does a full table restructure of all the partitions. This accomplishes the re-partitioning and defragments all partitions in the process, saving space.

    I suppose they could implement a new partitioning method for consistent hashing, but I don't think it would be much of an advantage in practice. Consistent hashing does not avoid all reshuffling: it still moves about 1/n of the entries when you add the nth partition, and MySQL would probably still use a table-restructure method to defragment.

    So within MySQL, consistent hashing would end up not much different from modulus-based hashing: it would relocate fewer rows (about 1/n rather than about (n-1)/n), but every partition would likely be rebuilt either way. Consistent hashing is a clear advantage over some less predictable hashing method, where it's less clear where given rows should end up. If you use some arbitrary hashing function H, then you would definitely have to re-calculate the partition for every row when you add a partition.
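
    To put numbers on the comparison, here is a minimal Python sketch that counts how many ids change partition when going from 4 to 5 partitions, once with plain modulus and once with a toy consistent-hash ring. The ring (the `Ring` class, its 200 virtual nodes per partition, and the use of MD5) is purely my own illustration and is not something MySQL implements; ids are assumed to be non-negative integers.

```python
import bisect
import hashlib


def modulo_partition(key: int, n: int) -> int:
    """Plain modulus placement, i.e. id % number_of_partitions."""
    return key % n


def _point(label: str) -> int:
    """Map a label to a position on a 2**32 ring (MD5 used only for determinism)."""
    return int.from_bytes(hashlib.md5(label.encode()).digest()[:4], "big")


class Ring:
    """Toy consistent-hash ring with virtual nodes (hypothetical, not a MySQL feature)."""

    def __init__(self, n: int, vnodes: int = 200):
        pairs = sorted(
            (_point(f"p{p}-v{v}"), p) for p in range(n) for v in range(vnodes)
        )
        self.points = [pt for pt, _ in pairs]
        self.owners = [owner for _, owner in pairs]

    def partition(self, key: int) -> int:
        # The first ring point clockwise from the key's position owns the key.
        i = bisect.bisect_right(self.points, _point(f"key{key}")) % len(self.points)
        return self.owners[i]


def moved_fraction(place_old, place_new, keys) -> float:
    """Fraction of keys whose partition differs between two placement functions."""
    moved = sum(1 for k in keys if place_old(k) != place_new(k))
    return moved / len(keys)


keys = range(20_000)
old_n, new_n = 4, 5

mod_moved = moved_fraction(
    lambda k: modulo_partition(k, old_n),
    lambda k: modulo_partition(k, new_n),
    keys,
)

ring_old, ring_new = Ring(old_n), Ring(new_n)
ring_moved = moved_fraction(ring_old.partition, ring_new.partition, keys)

print(f"modulus: {mod_moved:.3f} of ids change partition")
print(f"ring:    {ring_moved:.3f} of ids change partition")
```

    For these consecutive ids the modulus figure is exactly 0.8, since only ids with `id % 20` in 0..3 keep their partition number. The ring figure should land somewhere around one fifth, and every id that moves on the ring goes to the newly added partition.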

  • Are there alternatives in MySQL that act more like consistent hashing? (I checked the docs but didn’t spot anything obvious.)

    With LIST or RANGE partitioning you define the partitions yourself, so an alteration that adds or splits a partition doesn't have to touch the other partitions.

    This is NOT like consistent hashing, but I think it would accomplish what you seem to want, which is to change partitioning without a full table restructure.
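
    For completeness, the manual also documents PARTITION BY LINEAR HASH, which replaces plain modulus with a powers-of-two algorithm; it describes adding, dropping, merging, and splitting partitions as much faster with it, at the cost of a less even data distribution. The sketch below is my own Python transcription of the algorithm as the manual describes it, not MySQL's actual code. The function name `linear_hash_partition` is made up for illustration, and it assumes the partitioning expression yields a non-negative integer.

```python
def linear_hash_partition(value: int, num: int) -> int:
    """Partition number for `value` with `num` partitions, following the
    LINEAR HASH algorithm as described in the MySQL manual (assumed
    non-negative integer input; illustrative only)."""
    v = 1
    while v < num:      # smallest power of two >= num
        v *= 2
    n = value & (v - 1)
    while n >= num:     # fold back into the range of existing partitions
        v //= 2
        n &= v - 1
    return n


ids = range(8_000)
moved = [
    i for i in ids
    if linear_hash_partition(i, 4) != linear_hash_partition(i, 5)
]
sources = {linear_hash_partition(i, 4) for i in moved}
targets = {linear_hash_partition(i, 5) for i in moved}

print(len(moved) / len(ids), sources, targets)  # prints: 0.125 {0} {4}
```

    In this run, going from 4 to 5 partitions only relocates ids from partition 0 into the new partition 4, which is the closest thing to consistent-hashing behaviour I'm aware of in MySQL's built-in partitioning.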