algorithm · generator · combinations · pseudocode · gray-code

Generate all combinations of a collection in O(1)


The combinations of a collection of ints, say {1,2,3}, are:

{},{1},{2},{1,2},{3},{1,3},{2,3},{1,2,3}

The common solution is to track the representative bitmask of an index.

For my example, the masks for the numbers 0-7 are:

000,001,010,011,100,101,110,111

This allows an iterative solution: each time, the generator goes over each bit and decides whether or not to insert the corresponding item, so it creates the next combination with a run time of O(n).
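
For illustration, here is a minimal sketch of this bitmask approach (the function name is mine):

```python
def combinations_by_bitmask(items):
    """Yield every subset of items; bit i of the mask selects items[i]."""
    n = len(items)
    for mask in range(1 << n):
        # Rebuild the subset from scratch each time: O(n) per combination.
        yield [items[i] for i in range(n) if mask & (1 << i)]

# [], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]
print(list(combinations_by_bitmask([1, 2, 3])))
```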

Gray codes (from Wikipedia): The reflected binary code (RBC), also known as Gray code after Frank Gray, is a binary numeral system where two successive values differ in only one bit.

If we just use Gray codes instead, we get this sequence of numbers:

000,001,011,010,110,111,101,100

The result contains the same combinations, in a different order:

{},{1},{1,2},{2},{2,3},{1,2,3},{1,3},{3}

But here we see that successive combinations differ by only one item. Is it possible to implement the generator in O(1) using the Gray code sequence?

I understand that iterating over the returned list will take O(n) anyway.
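
For reference, the Gray code sequence above can be reproduced with the standard mapping gray(i) = i XOR (i >> 1):

```python
def gray(i):
    """Return the i-th reflected binary Gray code."""
    return i ^ (i >> 1)

# 000 001 011 010 110 111 101 100
print(" ".join(format(gray(i), "03b") for i in range(8)))
```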


Solution

  • Both the incremental and Gray code solutions are amortized O(1), which I believe is as good as you are going to get. "Amortized O(1)" means that you will occasionally get longer times for a single invocation, but they are sufficiently rare that the total time for k consecutive invocations is O(k) so that the average time for one invocation is O(1).

    That is a stronger claim than "expected O(1)", because the amortized execution time is a guarantee in aggregate over sufficient invocations, whereas expected execution time could turn out to produce an unexpected aggregate time if you are extremely unlucky. (In practice, the difference is usually unimportant, but see, for example, DoS attacks against "expected O(1)" hashtable lookups.)

    To see why the incremental algorithm is amortized O(1), look at the number of bits flipped in each iteration. Every second iteration (where the number ends with a 0), exactly one bit is flipped. Every fourth iteration, exactly two bits are flipped. Every eighth iteration, exactly three bits are flipped. And so on. It is easy to show that over k iterations, a maximum total of 2k + log2 k bits will be flipped.
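
    To make the bit-flip count concrete, here is a sketch (my own code, not from the question) of the incremental generator, mutating a single selection in place and touching only the bits that differ between mask - 1 and mask:

    ```python
    def combinations_incremental(items):
        """Yield every subset by mutating one selection in place.

        Each step toggles exactly the bits that differ between mask - 1
        and mask: amortized O(1) flips per subset.
        """
        selection = set()
        yield selection                  # live set; copy it if you need a snapshot
        for mask in range(1, 1 << len(items)):
            changed = mask ^ (mask - 1)  # a contiguous run of low-order 1s
            i = 0
            while changed:
                selection ^= {items[i]}  # toggle membership of element i
                changed >>= 1
                i += 1
            yield selection
    ```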

    For the Gray code algorithm, only one bit is flipped on each iteration, but that bit must be found. A simple iterative implementation of Gray code increment is to maintain the parity of the current selection. Since the parity changes on every iteration, it is not necessary to recompute it; it is simply inverted. Then the algorithm alternates between two rules (sketched in code after the analysis below):

    1. If the parity is even, flip the low-order bit.

    2. If the parity is odd, flip the bit to the left of the lowest-order 1 bit.

    Obviously, rule 1 applies to half of the iterations, and it clearly examines only one bit. For rule 2, we need to figure out how many bits are examined on each iteration. Since each possible combination of bits occurs exactly once, the number of trailing zeros follows the same distribution as in an incrementing sequence of integers: every second time there are no trailing zeros, every fourth time there is one, every eighth time there are two, and so on.

    So, in the end, we examine the same number of low-order bits in both solutions. However, the Gray code solution changes only one element of the set on each iteration, while the incremental solution changes an amortized average of two. So if set modification is expensive, the Gray code algorithm may be superior.
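
    As a concrete illustration of the two rules, here is a sketch of the parity-based Gray code generator (my own code); it changes exactly one element per step:

    ```python
    def combinations_gray(items):
        """Yield every subset in Gray code order, one element change per step."""
        mask, parity = 0, 0              # parity of the current selection size
        selection = set()
        yield selection
        for _ in range((1 << len(items)) - 1):
            if parity == 0:
                bit = 0                  # rule 1: flip the low-order bit
            else:
                low = mask & -mask       # isolate the lowest-order 1 bit
                bit = low.bit_length()   # rule 2: flip the bit to its left
            mask ^= 1 << bit
            selection ^= {items[bit]}
            parity ^= 1
            yield selection
    ```

    For items [1, 2, 3] this yields {}, {1}, {1,2}, {2}, {2,3}, {1,2,3}, {1,3}, {3}, matching the sequence in the question.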


    Afterthoughts

    1. You can apply the Gray code algorithm directly to a set without passing through a bit string, but its performance will depend on some details of the set implementation. The parity continues to alternate as above (it is also the size of the set modulo 2, so if your set has an O(1) size operation, you could use that instead of maintaining state). We assume that you have some O(1) function mapping each possible set element to its successor (in the universe, not the selection), similar to the way bits are mapped to set elements. Then the two possible algorithm actions are (see the sketch after this list):

      1. Even parity: if the smallest element of the universe is in the set, remove it; otherwise add it.

      2. Odd parity: if the successor of the smallest element in the set is also present, remove it; otherwise, add it.

    2. If you need to create a new set on each iteration, rather than modifying the current selection set, then of course you cannot do better than O(n), because the amortized average size of the selection is n/2, and creating a set cannot take less time than its size.
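
    Here is a sketch of the set-only variant from afterthought 1, assuming an ordered universe whose successor function is precomputed. (Note that Python's min() over a set is O(n), so this illustrates the rules rather than achieving the bound; a real O(1) implementation needs a set type that tracks its smallest element.)

    ```python
    def combinations_gray_sets(universe):
        """Yield every subset of an ordered universe, one element change per step."""
        succ = {universe[i]: universe[i + 1] for i in range(len(universe) - 1)}
        selection = set()
        yield selection
        for _ in range((1 << len(universe)) - 1):
            if len(selection) % 2 == 0:
                selection ^= {universe[0]}           # even parity: toggle the smallest
            else:
                selection ^= {succ[min(selection)]}  # odd parity: toggle min's successor
            yield selection
    ```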