I would like to understand how to create a B+ Tree with order(branching factor) of 3 and maximum number of entries in a node of 3. I searched a lot for applets, but most of them don't work properly, and the one that's great seems not to follow these steps I found on Wikipedia.
Following these steps
I believe the insertion of the new value should happen before step 4. Is there a better described version of this algorithm?
With 20,15,5,1,3,9,2,12
as an input, I obtain the following tree:
|1|5| |
|2|5| | |9| | |
|1|2| | |3|5| | |9| | | |15|20| |
Is it correct according to those steps? Could anyone point out an applet to verify this example?
Your tree is not correct. Each value in a node (not leaf) should be a break-point for the branches. To illustrate this let us consider the following node:
----------------------------------------
| 7 | 23 |
----------------------------------------
| pointer to | pointer to | pointer to |
| branch with| branch with| branch with|
| values | values | values |
| < 7 | 7 <= x < 23| >= 23 |
----------------------------------------
This node has 2 values and three branches. The values 7 and 23 represent the smallest values in the second and third branch. The smallest value of the first branch is not represented at this level. (Unless it is the smallest value in the whole tree, it will be on some higher level.)
With b=4 the rules for inserting a value can be summarized:
Let us consider a tree with the numbers 1..9:
3,5,7
|----------------|
1,2 3,4 5,6 7,8,9
If we now insert number 10 into the tree, the rightmost leaf becomes too full (7,8,9,10) so that it has to be split into two leaves (7,8) and (9,10). As per the rule, number 9 (lowest value of the upper split bucket) is sent to the parent:
3,5,7,9
|---------------------|
1,2 3,4 5,6 7,8 9,10
This makes the parent full, and it has to be split:
3,5 7,9
|-------| |---|
1,2 3,4 5,6 7,8 9,10
When a parent node is split, the first value of the new node is sent to its parent. In this tree the new node is (7,9) and thus the value to be removed and sent to the parent is 7. As there is no such parent, a new root node is created:
7
|---------|
3,5 9
|-------| |---|
1,2 3,4 5,6 7,8 9,10
Let us build a tree with numbers 20, 15, 5, 1, 3, 9, 2, 12 and b = 4
First three values fit in one leaf (which is at the same time the root node):
5,15,20
When the number 1 is inserted, the bucket splits and the first value of the new bucket is sent to the parent (new root):
15
|-----|
1,5 15,20
It should be noted that nothing is ever removed from the leaf nodes. The removal occurs only in parent nodes which split.
The value 3 can be inserted into its bucket with no problems (the bucket will become 1,3,5). However, trying to insert number 9 makes the bucket overfull (1,3,5,9) and it will split. The first value of the new bucket (5) will be inserted into the parent.
5,15
|----------|
1,3 5,9 15,20
Values 2 and 12 fit in their buckets without splitting so that the tree is:
5,15
|--------------|
1,2,3 5,9,12 15,20
To see what happens when a middle node splits, let us consider the following tree:
26
|-----------------------------|
8,14,20 30,34
|--------------------------| |-----------|
2,4,6 8,10,12 14,16,18 20,22,24 26,28 30,32 34,36
Now we will add value 13 into the tree. This will trigger splitting the bucket (8,10,12,13) into two:
26
|-----------------------------------|
8,12,14,20 30,34
|-------------------------------| |-------------|
2,4,6 8,10 12,13 14,16,18 20,22,24 26,28 30,32 34,36
There are too many children for the middle left node (8,12,14,20), so it has to be split:
26
|---------------------------------------|
8,12 14,20 30,34
|-------------| |---------| |-------------|
2,4,6 8,10 12,13 14,16,18 20,22,24 26,28 30,32 34,36
As we split a parent node, we have to apply the added rule that the first item of the new bucket has to be inserted into the parent and removed from the node, i.e. 14 is taken away from (14,20):
14,26
|------------------------------------|
8,12 20 30,34
|-------------| |---------| |-------------|
2,4,6 8,10 12,13 14,16,18 20,22,24 26,28 30,32 34,36
The tree also serves to illustrate the rule: each parent node carries the lowest value of each subtree except for the first subtree.
The example in the question should result (if I have not made too many mistakes):
5
|----------|
3 15
|---| |-------|
1,2 3 5,9,12 15,20