pythonmatlabprobabilityrainbowtable

Converting MATLAB code to Python: Python types and order of operations


This is a MATLAB function from the author of RainbowCrack:

function ret = calc_success_probability(N, t, m)
arr = zeros(1, t - 1);
arr(1) = m;
for i = 2 : t - 1
    arr(i) = N * (1 - (1 - 1 / N) ^ arr(i - 1));
end

exp = 0;
for i = 1 : t - 1
    exp = exp + arr(i);
end

ret = 1 - (1 - 1 / N) ^ exp;

It calculates the probability of success in finding a plaintext password given a rainbow table with keyspace N, a large unsigned integer, chain of length t, and number of chains m.

A sample run:

calc_success_probability(80603140212, 2400, 40000000)

Returns 0.6055.

I am having difficulty converting this into Python. In Python 3, there is no max integer anymore, so N isn't an issue. I think in the calculations I have to force everything to a large floating point number, but I'm not sure.

I also don't know the order of operations in MATLAB. I think the code is saying this:

Create array of size [1 .. 10] so ten elements Initialize every element of that array with zero

In zero-based indexing, I think this would be array[0 .. t-1], it looks like MATLAB uses 1 as the first (0'th) index.

Then second element of array (0-based indexing) initialized to m.

For each element in array, pos=1 (0-based indexing) to t-1:

array[pos] = N * (1 - (1 - 1/N) ** array[pos-1]

Where ** is the power operator. I think power is ^ in MATLAB, so N * (1 - (1-1/N) to the array[pos-1] power is like that above.

Then set an exponent. For each element in array 0 to t-1: exponent is exponent + 1

return probability = 1 - (1 - 1/N) power of exp;

My Python code looks like this, and doesn't work. I can't figure out why, but it could be that I don't understand MATLAB enough, or Python, both, or I'm reading the math wrong somehow and what's going on in MATLAB is not what I'm expecting, i.e. I have order of operations and/or types wrong to make it work and I'm missing something in those terms...

def calc_success_probability(N, t, m):

    comp_arr = []

    # array with indices 1 to t-1 in MATLAB, which is otherwise 0 to t-2???
    # range with 0, t is 0 to t excluding t, so t here is t-1, t-1 is up
    # to including t-2... sounds wrong...
    for i in range(0, t-1):
        # initialize array
        comp_arr.append(0)

    print("t = {0:d}, array size is {1:d}".format(t, len(comp_arr)))

    # zero'th element chain count
    comp_arr[0] = m

    for i in range(1, t-1):
        comp_arr[i] = N * (1 - (1 - 1 / N)) ** comp_arr[i-1]

    final_exp = 0
    for i in range(0, t-1):
        final_exp = final_exp + comp_arr[i]

    probability = (1 - (1 - 1 / N)) ** final_exp

    return probability

Solution

  • Watch your brackets! You have translated this:

    arr(i)      = N * (   1 - ( 1 - 1 / N )     ^  arr(i - 1)     );
    

    to this:

    comp_arr[i] = N * (   1 - ( 1 - 1 / N )   ) ** comp_arr[i-1]
    

    I've lined up everything so you can better see where it goes wrong. You've moved a bracket to the wrong location.

    It should be:

    comp_arr[i] = N * (   1 - ( 1 - 1 / N )     ** comp_arr[i-1]  )
    

    Similarly,

    ret = 1 - (1 - 1 / N) ^ exp;
    

    is not the same as

    probability = (1 - (1 - 1 / N)) ** final_exp
    

    This should be

    probability = 1 - (1 - 1 / N) ** final_exp