algorithmtime-complexitybig-osatisfiability

Algorithms with superexponential runtime?


I was talking with a student the other day about the common complexity classes of algorithms, like O(n), O(nk), O(n lg n), O(2n), O(n!), etc. I was trying to come up with an example of a problem for which solutions whose best known runtime is super-exponential, such as O(22n), but still decidable (e.g. not the halting problem!) The only example I know of is satisfiability of Presburger arithmetic, which I don't think any intro CS students would really understand or be able to relate to.

My question is whether there is a well-known problem whose best known solution has runtime that is superexponential; at least ω(n!) or ω(nn). I would really hope that there is some "reasonable" problem meeting this description, but I'm not aware of any.


Solution

  • Maximum Parsimony is the problem of finding an evolutionary tree connecting n DNA sequences (representing species) that requires the fewest single-nucleotide mutations. The n given sequences are constrained to appear at the leaves; the tree topology and the sequences at internal nodes are what we get to choose.

    In more CS terms: We are given a bunch of length-k strings that must appear at the leaves of some tree, and we have to choose a tree, plus a length-k string for each internal node in the tree, so as to minimise the sum of Hamming distances across all edges.

    When a fixed tree is also given, the optimal assignment of sequences to internal nodes can be determined very efficiently using the Fitch algorithm. But in the usual case, a tree is not given (i.e. we are asked to find the optimal tree), and this makes the problem NP-hard, meaning that every tree must in principle be tried. Even though an evolutionary tree has a root (representing the hypothetical ancestor), we only need to consider distinct unrooted trees, since the minimum number of mutations required is not affected by the position of the root. For n species there are 3 * 5 * 7 * ... * (2n-5) leaf-labelled unrooted binary trees. (There is just one such tree with 3 species, which has a single internal vertex and 3 edges; the 4th species can be inserted at any of the 3 edges to produce a distinct 5-edge tree; the 5th species can be inserted at any of these 5 edges, and so on -- this process generates all trees exactly once.) This is sometimes written (2n-5)!!, with !! meaning "double factorial".

    In practice, branch and bound is used, and on most real datasets this manages to avoid evaluating most trees. But highly "non-treelike" random data requires all, or almost all (2n-5)!! trees to be examined -- since in this case many trees have nearly equal minimum mutation counts.