algorithmdata-structuresbinary-treebinary-search-tree# Convert a sorted array into a height-balanced binary search tree by picking middle element -- why does it work?

## Lemma: *(∀k∈ℕ)(max(2ᵏ - 1) = min(2ᵏ - 1) = k)*

## Theorem: *(∀n∈ℕ)(max(n) = ⌈log₂(n + 1)⌉* and *min(n) = ⌊log₂(n + 1)⌋)*

I was doing an exercise to convert a sorted array to a binary search tree where every node in the tree has its children whose heights at most differ by 1.

One simple solution is to pick out the middle element to be the root. Middle as in (0+size) integer divided by 2 and then make the left, right children by a recursive call on the left half, right half of the remaining array respectively.

And one can check that the solution works via some testing. (Example code in C++ attached at the end.)

However, I am having trouble proving *why* this solution works.

Why?

**I tried** reasoning inductively: for any node with height K>3, assume all height k<K nodes are 1) height balanced, and 2) if k−2≥0, then the max possible size of k−2 height nodes + 1 is less than the min size of height k nodes

Since max size is strictly increasing on height, (2) means if height difference ≥2 then the size difference is ≥2.

It's easy enough to prove that any node with height K is height balanced with the inductive assumption. But I am having trouble proving part (2) for K.

Below is a size chart and it also has the base case.

height | min size | max size |
---|---|---|

0 | 0 | 0 |

1 | 1 | 1 |

2 | 2 | 3 |

3 | 4 | 7 |

4 | 7 | 15 |

The min and max sizes have the following recurrence relations:

min size of height K nodes = 1 + min size of height K−1 nodes + the min size of height K−2 nodes = Fib(n)+n *

max size of height K node = max size of K−1 node x 2 + 1 =2^(n-1) − 1

* last equal sign is "probably"

Fib(n)+n *seems* greater than 2^(n-1), especially with the rounding relation shown on its Wikipedia page. But then I'd be left with proving this rather involved inequality, going through the rounding relation and such.

Is there something simple that I am missing?

Is there an easier way to prove the correctness of the solution altogether?

```
struct TreeNode {
int val;
TreeNode* left;
TreeNode* right;
TreeNode(int x) : val(x), left(nullptr), right(nullptr) {}
};
void deleteTree(TreeNode* root) {
if (root) {
if (root->left) {
deleteTree(root->left);
}
if (root->right) {
deleteTree(root->right);
}
delete root;
}
}
// one can print the tree to Graphviz script for example for inspection; my code for that has a lot of other things together so I'll skip that
// technically meant to be a different file below
#include<vector>
#include<iostream>
#define DEBUG
TreeNode* sortedArrayToBST(std::vector<int>& nums, std::size_t i, std::size_t j) {
#ifdef DEBUG
std::cout << "i = " << i << "\tj = " << j << '\n';
#endif // DEBUG
if (i >= j) {
return nullptr;
}
else {
std::size_t k{ (i + j) / 2 };
#ifdef DEBUG
std::cout << "\tk = " << k << '\n';
#endif // DEBUG
TreeNode* node{ new TreeNode(nums[k]) };
node->left = sortedArrayToBST(nums, i, k);
node->right = sortedArrayToBST(nums, k+1, j);
return node;
}
}
TreeNode* sortedArrayToBST(std::vector<int>& nums) {
return sortedArrayToBST(nums, 0, nums.size());
}
int main() {
std::vector<int> q1{ -4, -3, 0, 1, 2 };
TreeNode* node{ sortedArrayToBST(q1) };
deleteTree(node);
}
```

Solution

Let's consider the easy case first.

Write the maximum depth of a leaf for a tree from array length *n* as
*max(n)*, and the minimum as *min(n)*. Let's define the height of the empty tree
as zero, the height of a tree with one element as 1, and so on (this is 1 more
than a conventional definition of tree height, but it makes things cleaner.)

With that out of the way:

By induction on *k*:

Base case: *k = 0* gives *n = 0*, which is the empty tree, and *max(0) = min(0)
= 0*.

Induction: Let *k∈ℕ-{0}* be given. Consider the array of length *n = 2ᵏ - 1*. The
algorithm takes one element to be the center node, then recurses on subarrays each of
length *(2ᵏ - 2) / 2 = 2ᵏ⁻¹ - 1*. By inductive hypothesis, *max(2ᵏ⁻¹ - 1) =
min(2ᵏ⁻¹ - 1) = k - 1*. So *max(2ᵏ - 1) = min(2ᵏ - 1) = k).

Now consider the less round case.

Base case: As before.

Induction: Let *n∈ℕ-{0}* be given. Let *k = ⌊log₂(n + 1)⌋*.

If *k = log₂(n + 1)*, then *n = 2ᵏ - 1*, so by the lemma we have *max(n) =
min(n) = log₂(n + 1) = ⌈log₂(n + 1)⌉ = ⌊log₂(n + 1)⌋* and we're done.

OTHERWISE, *n ≥ 2ᵏ* and *n ≤ 2ᵏ⁺¹ - 2* (otherwise we could get an
immediate contradiction on the definition of *k*).

The algorithm operating on an array of length *n* will take one element to be
the root and recurse on children of length *⌊(n - 1)/ 2⌋* and *⌈(n - 1)
/ 2⌉*, which lengths we'll define as *l* and *r* respectively.

Note *max* and *min* are nondecreasing so *min(l) ≤ min(r)* and *max(l) ≤
max(r)*. We need only check *min(l)* and
*max(r)*.

*l = ⌊(n - 1) / 2⌋ ≥ (n - 2) / 2 = n / 2 - 1* so
*min(l) ≥ ⌊log₂(n / 2)⌋*,
and we know that *n ≥ 2ᵏ* so we have
*min(l) ≥ ⌊log₂(2ᵏ / 2)⌋ = k - 1*.

*r = ⌈(n - 1) / 2⌉ ≤ n / 2* so
*max(r) ≤ ⌊log₂(n / 2 + 1)⌋*, and
we have *n ≤ 2ᵏ⁺¹ - 2* so we get
*max(r) ≤ ⌊log₂((2ᵏ⁺¹ - 2)/2 + 1)⌋ = ⌊log₂((2ᵏ⁺¹)/2)⌋ = k*.

Thus we have *min(n) ≥ k* and *max(n) ≤ k + 1*,

and of course we know *min(n) <
max(n)* since *min(n) = max(n)* only in the balanced case
we already ruled out above, so *min(n) = k* and *max(n) = k + 1*.

- Difference between back tracking and dynamic programming
- How can we optimize this algorithm from O(N ** 2) to better complexity in order to pass performance test?
- How do I predict the required size of a Base32 Decode output?
- Reversing AND Bitwise
- Why does my binary search need an extra comparison? log2(N)+1
- How to build a trie for finding exact phonetic matches, sorted globally by weight, and paginated? (Building off this example)
- What is the Time Complexity and Space Complexity of extending a string according to a rule?
- Skyscraper puzzle algorithm
- Check if all elements in a list are equal
- Bitwise Interval Arithmetic
- fast algorithm for drawing filled circles?
- How to find distance from the latitude and longitude of two locations?
- Determine if two rectangles overlap each other?
- Randomly Splitting a Graph according to Conditions
- Maximize distance while pushing crates
- Free a binary tree without recursion
- How can I estimate number of nodes in given tree structure?
- Explanation of class Definition for Binary Trees in leetcode
- Procedural Generation of 2D Rooms
- Is there an algorithm to find the closest element to X in an unsorted array in Ω(logN)?
- Advanced Java idiom for 2+ classes implementing a generic interface
- Is there any algorithm in c# to singularize - pluralize a word?
- Number of Common sub sequences of two strings
- Trying to solve problem 19 on Euler Project
- is a "non-decreasing" sequence "increasing"?
- Is it possible to get the original value of a number, after several multiplications **with overflow**?
- Algorithm to determine the highest and lowest possible finishing position of a team in a league
- Algorithm to calculate the number of divisors of a given number
- Rolling or sliding window iterator?
- best way to pick a random subset from a collection?