MySQL official documentation 8.8.5 Estimating Query Performance offers the way to count disk seeks:
log(row_count) / log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) + 1
In MySQL, an index block is usually 1,024 bytes and the data pointer is usually four bytes. For a 500,000-row table with a key value length of three bytes (the size of MEDIUMINT), the formula indicates log(500,000)/log(1024/3*2/(3+4)) + 1 = 4 seeks.
Innodb index is implemented by B-Tree, and I understand the formula. But there are two key points I cannot catch on in the sentence In MySQL, an index block is usually 1,024 bytes and the data pointer is usually four bytes.
My questions are:
First, please note that the calculation you are looking at is for MyISAM, not for InnoDB.
The documentation has a description of the MyISAM index file format, including references to the source code.
"index_block_length" refers to the block size:
keydef->block_length 2 04 00 that is, 1024
It is specified by the system configuration variable myisam-block-size
, defaulting to the mentioned 1024 bytes:
The block size to be used for MyISAM index pages.
The index entries look like this:
(first key value) 2 01 31 Value is "1" (0x31).
(first key pointer) 4 00 00 00 00 Pointer is to Record
#0000.
(second key value) 2 01 33 Value is "3" (0x33).
(second key pointer) 4 00 00 00 02 Pointer is to Record
#0002.
where "data_pointer_length" is the length of a "key pointer". On disk, so it has has nothing to do with using 32- or 64-bit memory pointers.
The size is set by the myisam_data_pointer_size
configuration variable. You may notice that the default size is 6 nowaways, not 4. And with "nowaydays" I mean since MySQL 5.0.6, released in 2005. Which might give you an idea why the documentation you looked at didn't mention they are not actually talking about InnoDB.