c++runtimedynamic-programmingdecodingstring-decoding

Time optimize C++ function to find number of decoding possibilities


This is an interview practice problem from CodeFights. I have a solution that's working except for the fact that it takes too long to run for very large inputs.

Problem Description (from the link above)

A top secret message containing uppercase letters from 'A' to 'Z' has been encoded as numbers using the following mapping:

'A' -> 1
'B' -> 2
...
'Z' -> 26

You are an FBI agent and you need to determine the total number of ways that the message can be decoded.

Since the answer could be very large, take it modulo 10^9 + 7.

Example

For message = "123", the output should be

mapDecoding(message) = 3.

"123" can be decoded as "ABC" (1 2 3), "LC" (12 3) or "AW" (1 23), so the total number of ways is 3.

Input/Output

[time limit] 500ms (cpp)

[input] string message

A string containing only digits.

Guaranteed constraints:

0 ≤ message.length ≤ 105.

[output] integer

The total number of ways to decode the given message.

My Solution so far

We have to implement the solution in a function int mapDecoding(std::string message), so my entire solution is as follows:

/*0*/ void countValidPaths(int stIx, int endIx, std::string message, long *numPaths)
/*1*/ {
/*2*/    //check out-of-bounds error
/*3*/    if (endIx >= message.length())
/*4*/        return;
/*5*/    
/*6*/    int subNum = 0, curCharNum;
/*7*/    //convert substr to int
/*8*/    for (int i = stIx; i <= endIx; ++i)
/*9*/    {
/*10*/       curCharNum = message[i] - '0';
/*11*/       subNum = subNum * 10 + curCharNum;
/*12*/   }
/*13*/    
/*14*/   //check for leading 0 in two-digit number, which would not be valid
/*15*/   if (endIx > stIx && subNum < 10)
/*16*/       return;
/*17*/    
/*18*/   //if number is valid
/*19*/   if (subNum <= 26 && subNum >= 1)
/*20*/   {
/*21*/       //we've reached the end of the string with success, therefore return a 1
/*22*/       if (endIx == (message.length() - 1) )
/*23*/           ++(*numPaths);
/*24*/       //branch out into the next 1- and 2-digit combos
/*25*/       else if (endIx == stIx)
/*26*/       {
/*27*/           countValidPaths(stIx, endIx + 1, message, numPaths);
/*28*/           countValidPaths(stIx + 1, endIx + 1, message, numPaths);
/*29*/       }
/*30*/       //proceed to the next digit
/*31*/       else
/*32*/            countValidPaths(endIx + 1, endIx + 1, message, numPaths);
/*33*/   }
/*34*/ }    
/*35*/
/*36*/ int mapDecoding(std::string message) 
/*37*/ {
/*38*/    if (message == "")
/*39*/        return 1;
/*40*/    long numPaths = 0;
/*41*/    int modByThis = static_cast<int>(std::pow(10.0, 9.0) + 7);
/*42*/    countValidPaths(0, 0, message, &numPaths);
/*43*/    return static_cast<int> (numPaths % modByThis);
/*44*/ }

The Issue

I have passed 11/12 of CodeFight's initial test cases, e.g. mapDecoding("123") = 3 and mapDecoding("11115112112") = 104. However, the last test case has message = "1221112111122221211221221212212212111221222212122221222112122212121212221212122221211112212212211211", and my program takes too long to execute:

Expected_output:      782204094
My_program_output:    <empty due to timeout>

I wrote countValidPaths() as a recursive function, and its recursive calls are on lines 27, 28 and 32. I can see how such a large input would cause the code to take so long, but I'm racking my brain trying to figure out what more efficient solutions would cover all possible combinations.

Thus the million dollar question: what suggestions do you have to optimize my current program so that it runs in far less time?


Solution

  • A couple of suggestions. First this problem can probably be formulated as a Dynamic Programming problem. It has that smell to me. You are computing the same thing over and over again.

    The second is the insight that long contiguous sequences of "1"s and "2"s are a Fibonacci sequence in terms of the number of possibilities. Any other value terminates the sequence. So you can split the strings into runs of of ones and twos terminated by any other number. You will need special logic for a termination of zero since it does not also correspond to a character. So split the strings count, the length of each segment, look up the fibonacci number (which can be pre-computed) and multiply the values. So your example "11115112112" yields "11115" and "112112" and f(5) = 8 and f(6) = 13, 8*13 = 104.

    Your long string is a sequence of 1's and 2's that is 100 digits long. The following Java (Sorry, my C++ is rusty) program correctly computes its value by this method

    public class FibPaths {
    private static final int MAX_LEN = 105;
    private static final BigInteger MOD_CONST = new BigInteger("1000000007");
    private static BigInteger[] fibNum = new BigInteger[MAX_LEN];
    
    private static void computeFibNums() {
        fibNum[0] = new BigInteger("1");
        fibNum[1] = new BigInteger("1");
        for (int i = 2; i < MAX_LEN; i++) {
            fibNum[i] = fibNum[i-2].add(fibNum[i-1]);
        }
    }
    
    public static void main(String[] argv) {
        String x = "1221112111122221211221221212212212111221222212122221222112122212121212221212122221211112212212211211";
        computeFibNums();
        BigInteger val = fibNum[x.length()].mod(MOD_CONST);
        System.out.println("N=" + x.length() + " , val = " + val);
    }
    

    }