Understanding Grossman & Zeitman's algorithm for computing the Ackermann function?

I've read the paper An inherently iterative computation of Ackermann's function, published by Grossman & Zeitman in which they present an algorithm which optimizes Ackermann's function.

We know that the Ackermann function produces the result in the subsequences A(m,n)

m=0: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,...
m=1: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,...
m=2: 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33,...
m=3: 5, 13, 29, 61, 125, 253, 509, 1021, 2045, 4093, 8189, 16381, 32765, 65533,...
m=4: 13, 65533,...

It's stated that the Next array is to keep track of where we are in each subsequence, and the Goal array is to keep track of where we need to reach before transferring the value just calculated to the next subsequence. And all it does is incrementing 1 to the value:

Ackermann(m, n)
    {next and goal are arrays indexed from 0 to m, initialized so that next[0] 
     through next[m] are 0, goal[0] through goal[m - l] are 1, and goal[m] is -1} 
    repeat
        value ← next[0] + 1 
        transferring ← true 
        current ← 0 
        while transferring do begin
           if next[current] = goal[current] then goal[current] ← value
                                            else transferring ← false
           next[current] ← next[current] + 1
           current ← current + 1 
           end while
    until next[m] = n + 1 
    return value {the value of A(m, n)}
    end Ackermann

I'm finding it hard to understand how does the two arrays indicate where we are/ where to move? I'm having trouble trying to pinpoint what exactly it means when we stop at next[m] = n + 1, why specifically this value? I've tried tracing the algorithm and I'm still lost at how any of it works. Would this algorithm count as a bottom-up implementation?

This is a java code that prints the value, current, and both arrays

import java.util.Arrays;

public class Ack {
    static int arrayAck(int m, int n) {
        //Next array to keep track of where we are in each subsequence, 
        //and Goal array to keep track of where we need to reach before transferring the value just calculated to the next subsequence.
        int[] next = new int[m+1];
        int[] goal = new int [m+1];
        Arrays.fill(next, 0);
        Arrays.fill(goal, 1);
        goal[m] = -1;
        
        int value = next[0] + 1;
        while(true) {
            value = next[0] + 1;
            boolean transferring = true;
            int current = 0;
            
            System.out.println("--");
            System.out.println("Next = " + Arrays.toString(next));
            System.out.println("Goal = " + Arrays.toString(goal));
            System.out.println("Current= " + current);
            System.out.println("Value = " + value);
            while(transferring) {
                if(next[current] == goal[current])
                    goal[current] = value;
                else
                    transferring = false;
                next[current]=next[current]+1;
                current += 1;
                System.out.println("Current= " + current);
                System.out.println("Next = " + Arrays.toString(next));
                System.out.println("Goal = " + Arrays.toString(goal));
            }
            
            if(next[m]==n+1)
                break;
            
        }
        
        return value;
    }
     
    public static void main(String[] args) {
        int m=2,n=2;
        System.out.println("Ackermann value for ("+m+","+n+") = " + arrayAck(m, n));
    }

}

Solution

I've read over the paper to get a sense of what algorithm they're proposing, and it's actually not that bad when you get the hang of it.

The basic idea is the following. We have a two-argument version of the Ackermann function defined as follows:

A(0, n) = n + 1
A(i, 0) = A(i - 1, 1)
A(i, n) = A(i - 1, A(i, n - 1))

The approach the authors suggest is essentially a space-optimized, bottom-up calculation of the Ackermann function. Rather than give the final version of their algorithm, let's see if we can instead derive it from the above rules. We'll imagine filling in a 2D table where each row corresponds to a different value of i and each column corresponds to a value of n. Following the convention from the paper, we'll place the row for i = 0 on top, like this:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0 
i=1
i=2
i=3

Our goal will be to fill in this table. For now, don't worry about the size of the table; we can determine that later.

The first of the three cases of the Ackermann function says that

Rule 1: A(0, n) = n + 1.

We can interpret that to mean

The top row of the table is the sequence of values 1, 2, 3, 4, 5, ...

We could fill that in now, but for reasons that will become clearer later, for now let's hold off on doing so.

The next rule is that

Rule 2: A(i, 0) = A(i - 1, 1).

We can interpret that to mean

The first entry in each successive row of the table is the second (index 1) entry of the row above it.

With these two rules in mind, let's start filling in table entries. We can fill in the slot for A(0, 0) using the first rule: A(0, 0) = 0 + 1 = 1.

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1
i=1
i=2
i=3

Now, let's fill in A(0, 1) = 1 + 1 = 2:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2
i=1
i=2
i=3

Because we've just filled in this table entry, we can use the next rule to fill in the slot for A(1, 0), which will be given by A(0, 1) = 2:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2
i=1  2
i=2
i=3

To make more progress in filling in the row for i = 1, we'll need to pull out the third case of the Ackermann function:

Rule 3: A(i, n) = A(i - 1, A(i, n - 1)).

This rule is dense, but it has a really nice intuition to it. To fill in table slot A(i, n), do the following:

Determine what's in table slot A(i, n - 1). Interpret this number as an index k.
Copy over the kth item in the row above you (that is, position A(i - 1, k) = A(i - 1, A(i, n - 1))).

In our case, suppose we want to fill in A(1, 1). Using the above procedure, we do the following:

Look at table slot A(1, 0) = 2.
Copy over index 2 from row 0 (that is, A(0, 2)).

However, looking at our table, we can see that we haven't yet computed A(0, 2). So let's do that. We compute A(0, 2) = 3 via Rule 1:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3
i=1  2
i=2
i=3

This in turn, by Rule 3, means that A(1, 1) = 3:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3
i=1  2    3
i=2
i=3

And now something interesting happens. We have just filled in A(1, 1), which by Rule 2 lets us fill in A(2, 0) by copying over A(1, 1) there:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3
i=1  2    3
i=2  3
i=3

Let's take stock of what just happened.

We computed A(0, 2) via Rule 1.
Computing another entry of Row 0 gave us a new entry of Row 1 via Rule 3.
Computing another entry of Row 1 gave us a new entry of Row 2 via Rule 2.

In other words, by computing another value in Row 0, we had a ripple effect that propagated downward, allowing us to compute more entries in Row 1, Row 2, etc. So that suggests an approach to take: generate another value in Row 0, which is easy to do using Rule 1, and from there see if we can "ripple" that value higher in the table using Rule 2 and Rule 3.

To see this in action, let's fill in A(0, 3) = 3+1 = 4 using Rule 1:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4
i=1  2    3
i=2  3
i=3

Now, look at Row 1. Rule 3 says that to fill in A(1, 3), we need to first compute A(1, 2) (done - it's 3) and then take column 3 of row 0. We just computed column 3 of row 0, which means that we should copy A(0, 3) into A(1, 2):

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4
i=1  2    3    4
i=2  3
i=3

Now, look at Row 2. We've filled in A(2, 0), so the next one to compute here is A(2, 1). To compute A(2, 1), we first compute A(2, 0) (done - it's 3), so we need to take the entry from column 3 of row 1. We haven't computed that yet, though, so nothing more happens. The ripple stops here.

Let's compute the next entry of Row 0: A(0, 4) = 4+1 = 5. That goes here:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4    5
i=1  2    3    4
i=2  3
i=3

Looking at Row 1, to fill in A(1, 3), we look at A(1, 2) (which is 4) and copy A(0, 4) over:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4    5
i=1  2    3    4    5
i=2  3
i=3

Looking at Row 2, to fill in A(2, 2), we look at A(2, 1) (which is 3) and copy A(1, 3) over:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4    5
i=1  2    3    4    5
i=2  3    5
i=3

Now, look at Row 3. Rule 2 says that A(3, 0) = A(2, 1), so we copy A(2, 1) over:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4    5
i=1  2    3    4    5
i=2  3    5
i=3  5

And the ripple is done how because that's all the rows of our table.

To see how this system evolves, let's run a few more ripples, one after the other:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4    5    6
i=1  2    3    4    5    6
i=2  3    5
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4    5    6    7
i=1  2    3    4    5    6    7
i=2  3    5    7
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4    5    6    7    8
i=1  2    3    4    5    6    7    8
i=2  3    5    7
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4    5    6    7    8    9
i=1  2    3    4    5    6    7    8    9
i=2  3    5    7    9
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4    5    6    7    8    9    10
i=1  2    3    4    5    6    7    8    9    10
i=2  3    5    7    9
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4    5    6    7    8    9    10   11
i=1  2    3    4    5    6    7    8    9    10   11
i=2  3    5    7    9   11
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4    5    6    7    8    9    10   11   12
i=1  2    3    4    5    6    7    8    9    10   11   12
i=2  3    5    7    9   11
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1    2    3    4    5    6    7    8    9    10   11   12   13
i=1  2    3    4    5    6    7    8    9    10   11   12   13
i=2  3    5    7    9   11   13
i=3  5   13

You can check these values against Figure 1 in the table in the paper and you'll see that this is a (partial) match of what's going on.

An important detail in the execution of this algorithm is that at each point in time, we only care about the very last item in each of the rows. That item tells us what index needs to get filled in the row below us before we copy the next value over. With that in mind, we could imagine running this algorithm again, but only storing the last item in each row. That might look like this:

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0  1
i=1
i=2
i=3

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0       2
i=1  2
i=2
i=3

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0            3
i=1       3
i=2  3
i=3

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0                 4
i=1            4
i=2  3
i=3

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0                      5
i=1                 5
i=2       5
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0                           6
i=1                      6
i=2       5
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0                                7
i=1                           7
i=2            7
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0                                     8
i=1                                8
i=2            7
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0                                          9
i=1                                     9
i=2                 9
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0                                               10
i=1                                          10
i=2                 9
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0                                                    11
i=1                                               11
i=2                      11
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0                                                         12
i=1                                                    12
i=2                      11
i=3  5

    n=0  n=1  n=2  n=3  n=4  n=5  n=6  n=7  n=8  n=9  n=10 n=11 n=12
i=0                                                              13
i=1                                                         13
i=2                           13
i=3       13

This is a big space savings, and it exposes something interesting about each row. For each row, we just need to store two values:

The number that's stored in that row.
The column index corresponding to where the rightmost number is.

What are the significances of these two values? This first number can be interpreted in two ways. First, it's an actual value of the Ackermann sequence. And second, for rows above the first, it's an indicator that says "what position do we need to be in the row below me in order for me to advance to the next column in my row?" This is value the paper stores in the the goal array.

The second of these numbers in turn can be thought of as saying "what position am I in?," which higher-level rows can then use to determine when to advance their goal values. This is the value the paper stores in the next array.

Rather than walk through the pseudocode from the paper, I'll instead rewrite the algorithm from scratch, ending up with something similar to but not exactly the same as what the paper has. The basic idea is to simulate the above process. We keep track of which column we're in in each of the rows of the table (I've called this positions to make its meaning clearer) and what value is stored in each row of the table (I've called this values).

To simplify the logic of handling the case where some rows have values in them while others don't (see the first few steps in the trace above, for example), and to unify together Rule 2 and Rule 3, I've adopted the following strategy. We'll extend the Ackermann function so that the value of A(i, -1) is defined in the following way:

A(0, -1) = 0
A(i, -1) = 1

We'll then pretend that each row begins in position -1, meaning "right before the first column." The intuition is that row 1, row 2, row 3, etc. are all waiting for an item to appear in position 1 in the row above them, whereas in row 0 the value just before the first should be 0 so that the sequence of values produced in row 0 is the series 0, 1, 2, 3, ... .

With that in mind, here's my implementation:

/**
 * File: Ackermann.java
 * Author: Keith Schwarz (htiek@cs.stanford.edu)
 *
 * An implementation of a variation of Grossman and Zeitman's fast algorithm
 * for computing the Ackermann function. This implementation uses O(i) space
 * to compute A(i, n), and takes O(A(i, n)) time. The only operation required
 * on integers is "increment an integer" and "copy an integer." While this is
 * implemented using vanilla ints, it could easily be extended to use BigInteger
 * types or the like.
 */
public class Ackermann {
    public static int ackermann(int i, int n) {
        /* Bounds-check. */
        if (i < 0 || n < 0) throw new RuntimeException("Invalid arguments.");
    
        /* Positions array stores what column we're in in each row of
         * the table. Initially this is all -1 to signify that we're just
         * before the first column.
         */
        int[] positions = new int[i + 1];
        for (int j = 0; j < positions.length; j++) {
            positions[j] = -1;
        }
        
        /* Values array stores the value currently in the filled column
         * in each row. The value in the zeroth row is set to zero because
         * our algorithm works by repeatedly incrementing its value and
         * we need it to be the case that A(0, 0) = 1. Since we start off
         * with "A(0, -1)" computed, we place a zero there.
         *
         * All the other values are set to 1. This corresponds to the rule
         * that A(i, 0) = A(i - 1, 1), meaning we're waiting for column 1
         * to come up in the level below us.
         */
        int[] values = new int[i + 1];
        for (int j = 1; j < values.length; j++) {
            values[j] = 1;
        }
        
        /* Run the algorithm until we compute A(i, n), which happens when
         * positions[i] == n.
         */
        while (positions[i] != n) {        
            /* Push the value in the bottom row forward and increment it to simulate
             * computing the next value of A(0, *).
             */
            values[0]++;
            positions[0]++;
            
            /* Now, do the ripple. We continue rippling upward while
             *
             * 1. There's a layer above us to ripple to, and
             * 2. The value above us equals our current position.
             */
            for (int j = 1; j <= i && positions[j - 1] == values[j]; j++) {
                values[j] = values[j - 1]; // Copy the value
                positions[j]++;            // Shift forward a column
            }
        }
        
        return values[i];
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: java Ackermann i n");
            return;
        }
        
        int i = Integer.parseInt(args[0]);
        int n = Integer.parseInt(args[1]);
        
        System.out.println("A(" + i + ", " + n + ") = " + ackermann(i, n));
    }
}

One last detail - I believe that Zeitman and Grossman's original runtime analysis of O(iA(i, n)) is correct but not tight. Specifically, this assumes that on each step of the algorithm we copy a value from one layer up to the next. However, that's not the case. The values in each row grow so quickly that on most steps we won't copy anything to any high row of the table. More specifically, note that Row 2 of the table consists of every other odd number, so only half the increment steps will copy something there, and the rows above that surely don't copy more than half the values of the layer below them. This means that, on average, each ripple will have a 100% chance of copying to Row 1, at most a 50% chance of copying to Row 2, at most a 25% chance of copying to Row 3, etc. Summing the number of copies made and using the fact that there's (at least) geometric decay here means that the "average" operation only copies to O(1) total rows. That means we're doing O(A(i, n)) increment steps, each of which (on average) copies to only O(1) total rows, so the total work done is O(A(i, n)). That's not a huge improvement on the runtime bound (the A(i, n) term is gigantic!), but it is a slightly better result.