In Database System Concepts, 6th edition, chapter 11, in "Figure 11.11 Querying a B+-tree.", there is a procedure printAll(value V)
. It is used to print all records with search key value V (there can be duplicates). It calls find(V)
, which returns the first leaf node that has key V, and the first index i in that leaf that has that key.
Why doesn't the code include Set i = 1
when i > number of keys in L
?
procedure printAll(value V)
/* prints all records with search key value V */
Set done = false;
Set (L,i) = find(V);
if ((L,i) is null) return
repeat
repeat
Print record pointed to by L.Pi
Set i = i + 1
until (i > number of keys in L or L.Ki > V)
if (i > number of keys in L)
then L = L.Pn // No need to set i to 1?
else Set done = true;
until (done or L is null)
You are absolutely right. i
needs to be reset to 1 when moving to the next leaf in the B+ tree.
It is also not corrected in the "Errata and Updates For Database System Concepts, 6th Edition"
There is a C++ implementation on github, inspired by this book, which does reset i as it should have been. Of course, in C++ indexes are zero-based:
void BPlusTree::printAll(string v) {
//
// B+ tree described in the book may contain duplicate keys
// printAl prints all the records with key v
// but here I modified it to print all records with key >= v
//
bool done = false;
auto pair = find(v);
auto p = pair.first;
auto i = pair.second;
if (i == -1) { // not found
cout << "no such record" << endl;
return;
}
cout << "i is " << i << endl;
while (!done and p != NULL) {
// be careful with special cases, may encounter NULL pointer, size is also a problem (#pointer = #key + 1)
for (; (i < p->_ptrs.size() && i < p->_keys.size() && v <= p->_keys[i]) or (i == p->_keys.size()); i++) {
if (p->_ptrs[i] == NULL) {
continue;
}
cout << "record: " << *reinterpret_cast<int*>(p->_ptrs[i]) << endl;
}
if (i >= _num_of_keys(p)) {
p = p->_next; // move to next leaf node
i = 0; // adjust i
}
else {
done = true;
}
}
}