cmathcatalan

C programming numbers and mountains


I created a C program that gives the nth catalan number.Everything is working so far. Here it is:

#include <stdio.h>

int catalan(int);

main()
{
    int number, catalannumber;

    printf("This is a program to find a given catalan number.\nIntroduce your desired number: ");
    scanf("%d", &number);

    while (number < 1)
    {
        printf("Number must be greater or equal to 1. Reintroduce: ");
        scanf("%d", &number);
    }

    catalannumber = catalan (number);

    printf("The %dth number corresponds to %d.\n", number, catalannumber);
}

int catalan(int n) {
  if (n == 1)
    return 1; 
  else
    return 2 * (2*n - 1) * catalan(n-1) / (n+1);
}

Then I found this "classical" mountain ranges problem, with all the possible mountain ranges pictured: http://www.maths.usyd.edu.au/u/kooc/catalan/cat3moun.pdf

Here's what the "mountain ranges" look like for small values of n enter image description here source

My goal was to be able to create a program that:

  1. once you give them the number, you get to choose the number of "mountain tops" ( <= number )

  2. the program would count how many different mountain ranges, (of the given number), have that amount of mountain tops.

By the pdf file, I know that:

  1. number n = 3 & "mountain tops" = 2 -> there are 3 (of the total 5) mountain ranges with 2 mountain tops (both different mountain "types").

  2. number n = 4 & "mountain tops" = 3 -> 6 different mountain ranges.

My question is, what is the best way to do this work? Is there any formula for it? I've thought about Pascal Triangle and Fibonacci series, but I didn't find any relation between. I wonder what could be a possible solution. Any help will be truly aprecciated. Thanks.


Solution

  • Let's look at the brute force approach first.

    Catalan number cn is the number of ways one can combine n upstrokes (╱) and n downstrokes (╲) without going below the horizon. By definition, cn = (n + 2)/2 · (n + 3)/3 · ... (n + n)/n.

    We can use unsigned 2n bit integers to describe each combination. (However, not all 2n bit unsigned integer values describe a valid combination.) If we consider a nonzero or set bit as an upstroke, and zero bit as a downstroke, a peak occurs whenever a downstroke follows an upstroke. (When a downstroke is followed by an upstroke, you have a valley.)

    (Note that not all unsigned 2n bit integer values describe a valid combination. To be valid, the very first bit must be an upstroke, and there must be exactly n upstrokes.)

    So, first write a function that calculates the number of peaks in one combination described by an unsigned integer. (Note that because each unsigned integer that describes a combination has exactly n bits set, we do not need to pass n explicitly, only the unsigned integer that describes the combination.) For example,

    unsigned int  peaks(unsigned int  description)
    {
        unsigned int  count = 0;
    
        while (description) {
            count += ((description & 3) == 1);
            description >>= 1;
        }
    
        return count;
    }
    

    Above, the combination is read starting from the least significant bit. (And because it must be set for the mountain range to possibly be above the horizon, no even value describes a combination.) The expression (description & 3) isolates the last two significant bits. The four possible cases correspond to double downslope, peak, valley, and double upslope, respectively, in order of increasing numerical value. The peak case corresponds to value 1 (01b in binary: an upstroke followed by a downstroke, reading the digits in order of increasing significance, right to left). In C, the value of a logical False is zero, and a logical True is 1, so in the above loop, we get the number of cases where a set bit is followed (in a more significant position) by a clear bit.

    Value  Mountains       n   Peaks
        1   ╱╲             1     1
        3   ╱╱╲╲           2     1
        5   ╱╲╱╲           2     2
        7   ╱╱╱╲╲╲         3     1
        9   ╱╲╲╱         Not a valid combination
       11   ╱╱╲╱╲╲         3     2
       13   ╱╲╱╱╲╲         3     2
       15   ╱╱╱╱╲╲╲╲       4     1
       17   ╱╲╲╲╱        Not a valid combination
       19   ╱╱╲╲╱╲         3     2
       21   ╱╲╱╲╱╲         3     3
       23   ╱╱╱╲╱╲╲╲       4     2
       25   ╱╲╲╱╱╲       Not a valid combination
       27   ╱╱╲╱╱╲╲╲       4     2
       29   ╱╲╱╱╱╲╲╲       4     2
       31   ╱╱╱╱╱╲╲╲╲╲     5     1
       33   ╱╲╲╲╲╱       Not a valid combination
       35   ╱╱╲╲╲╱       Not a valid combination
       37   ╱╲╱╲╲╱       Not a valid combination
       39   ╱╱╱╲╲╱╲╲       4     2
    

    Next, create a function that generates all unsigned integer values that describe a valid combination for a specific n, and count those that correspond to a specific number of peaks.

    One brute force way is to write the peak-counting function so that it returns 0 for all invalid combinations. For example:

    static unsigned int  peaks(unsigned int  description)
    {
        unsigned int  count = 0;
        int           height = 0;
    
        /* Description must start with an upstroke. */
        if (!(description & 1))
            return 0;
    
        while (description) {
            switch (description & 3) {
            case 0: /* Downslope; next is downslope. */
                if (--height < 0)
                    return 0;
                break;
    
            case 1: /* Upslope; next is downslope. */
                count++;
                height++;
                break;
    
            case 2: /* Downslope; next is upslope. */
                if (--height < 0)
                    return 0;
                break;
    
            default: /* 3: Upslope; next is upslope. */
                height++;
    
            }
    
            description >>= 1;
        }
    
        return count;
    }
    

    The n that a description corresponds to (if peak(description) > 0), is the number of bits set in the description. A nifty trick to count that is

    unsigned int  popcount(unsigned int  value)
    {
        unsigned int  count = 0;
        while (value) {
            value &= value - 1;
            count++;
        }
        return count;
    }
    

    With these two functions, you can solve the stated question for smallish n, by exploring all 2n-bit unsigned integers (from 0 to (1 << (2*n)) - 1, inclusive).


    For a better approach, let us inspect the number of peaks for each n:

    n Combs   Occurrences*Peaks
    0    1     1*0
    1    1     1*1
    2    2     1*1,  1*2
    3    5     1*1,  3*2,  1*3
    4   14     1*1,  6*2,  6*3,  1*4
    5   42     1*1, 10*2, 20*3, 10*4,  1*5
    6  132     1*1, 15*2, 50*3, 50*4, 15*5, 1*6
    

    In other words, n=6 has 132 valid combinations. Of these, there is one with one peak, 15 with two peaks, 50 with three peaks, 50 with four peaks, and one with six peaks.

    If we just form the integer sequences for the peak counts, we can express the above as

    1,
    1, 1,
    1, 3, 1
    1, 6, 6, 1,
    1, 10, 20, 10, 1,
    1, 15, 50, 50, 15, 1,
    

    and so on, continuing with 1, 21, 105, 175, 105, 21, 1 for n=7, and with 1, 28, 196, 490, 490, 196, 28, 1 for n=8, and so on.

    If we do an OEIS search for that sequence, we'll find out these are actually called Narayana numbers T(n,k), and the entire integer sequence is OEIS A001263.

    (Note: I didn't know this was so! All I knew I could use OEIS to find out if the sequence was known, and they usually are. In other words, I'm not just showing you the answer to this particular question here, but how I find -- quite effectively, if I may say so myself -- solutions to this kind of problem, starting from a brute force numerical approach.)

    So, the mathematical answer is that Narayana number T(n,k) tells you the number of different mountain ranges corresponding to Catalan number cn with exactly k peaks.

    If you implement the binomial coefficient as function binomial(n, k), then the answer is T(n, k) = binomial(n, k) * binomial(n, k - 1) / n.

    Note, however, that you can implement the T(n, k) more efficiently as a division between products of terms (i.e., using two arrays of terms, and eliminating common terms and products of terms using a greatest common divisor helper function, without precision issues due to term cancellation.)