algorithmmath

Effective algorithm for generation TREE(3) sequence


TREE(N) is well-known mathematical function, describes maximum number of possibles trees with N different-colours nodes, where following tree does not include any of previous trees as subtree, and any i-th tree contains no more than i nodes

Information above is well-known and wide-spread in web in many articles and videos, but there is no effective algorithm for generation i-th tree in sequence cannot be found anywhere

Many articles provide following image with first 19 elements of TREE(3) sequence, but it's unclear, how to programmatically generate this? One can suppose that first N elements had been generated brute-forcefully, but it cannot be true since wrongly generated i-th element can affect correctness of subsequent i+M elements. So there is surely exists some algorithm to generate i-th element of TREE(3) sequence

enter image description here

For convenience elements of TREE(3) sequence can be represented as bracket expressions, there curvy braces { } are represent tree nodes of first color (red on image), square braces [ ] are represent tree nodes of second color (blue on image) and round braces ( ) represent tree nodes of third color (green on image)

Text notation for TREE(3) sequence from above image will be following:
{}
[[]]
[()()]
[((()))]
([][(())])
((())[(())])
(()()()[(())])
(()()[(())])
(()[(())])
([(())])
[(())]
([()][()][()][()][()][])
and so on

Every following tree in sequence does not contain any part of previous tree, and any i-th string in notation above does not content any substring from previous 0...(i-1)-th strings

So questing is how to generate these trees, or representing strings programmatically?


Solution

  • I had the same question and the only resource on it appears to be your question. Generally, proving existence and finiteness of something is much simpler than actually finding any solution.

    An algorithm that constructs a sequence may be a brute-force approach that recursively constructs the sequence for the first i trees until it terminates, then goes back to i-1, until it exhausts all possibilities there, then i-2 and so on. Until it has constructed all possible trees that can be built with three seeds, i.e., it has exhausted all possible combinations starting at sequence i=1. The final sequence will then be the longest path and its length the actual solution to TREE(3).

    For TREE(2), we can build the sequences:

    1. []
    2. ()
    3. Terminates -> sequence ends at 2.

    Next attempt:

    1. []
    2. (())
    3. ()
    4. Terminates -> sequence ends at 3.

    So the largest possible solution is 3 (up to relabeling), but I don't see any way to find this from knowing the first sequence which terminates earlier.

    Likewise, for TREE(3) we can easily find sequences that terminate early:

    1. {}
    2. []
    3. ()
    4. Terminates -> length 3

    Next attempt:

    1. {}
    2. []
    3. (())
    4. ()
    5. Terminates -> length 4

    After a few more attempts:

    1. {}
    2. (())
    3. [[[]]]
    4. [[]]
    5. []
    6. ()
    7. Terminates -> length 6

    Is there any algorithm smarter than the brute-force approach that tries all possible sequences? (It is possible that the final two elements are indeed () and [], but even working in both directions won't get us anywhere.) Is the "final" sequence of TREE(3) even unique? And even if it is, is there a way to construct the sequence without knowing later elements?

    I haven't found any information about uniqueness of the final sequence, but I am pretty sure it is impossible to construct the "final" sequence directly, given the fact that we have an abundance of dead ends on the way. As you have pointed out, any given element in a sequence influences the correctness of later elements. Thus, it appears impossible to outperform the brute-force algorithm on a deterministic Turing machine. (And if you could, you'd probably rather focus your attention on the much simpler special case of P=NP for large-number factorization.)

    An interesting follow-up question is then whether the order-of-magnitude estimates for TREE(3) that are commonly quoted are actually correct. These estimates imply that there are no "natural dead ends" which make the construction of trees impossible beyond a certain size. There might be a peculiar property of graph shapes that appears at g(3)-37 which impedes construction of longer sequences. There are other examples of conjectures that appear to be true for small numbers, but fail for larger numbers (e.g., Euler's Sum of Powers Conjecture or the Pólya Conjecture). So I don't see any reason why tree sequences shouldn't suffer from any "natural dead ends" along the way.