c++performancentl

Fast Loop with multiplication NTL


I'm programming using C++ and ntl library but using profile the code below is slow when the galois field $GF(2^q)$ have q>= 8.
How I will be able to accelerate this code without use parallel programming.

void modKeyGenPrs(list<mat_GF2E>& Prs, list<mat_GF2E> Lst, mat_GF2E L1, mat_GF2E L2)
{
    mat_GF2E L1_trans = transpose(L1);
    for (int i=0; i<m; i++)
    {
        mat_GF2E sum;
        sum.SetDims(n, n);
        list<mat_GF2E>::const_iterator i_lst;
        int j=0;
        for( i_lst = Lst.begin(); i_lst != Lst.end(); i_lst++)
        {
            sum = sum + (L2[i][j]*(L1_trans*(*i_lst)*L1));
            j = j + 1;
        }
        Prs.push_back(sum);
    }
}

Solution

  • I see two points:

    1. It looks like your lists have a constant size of n, so you can use a vector (as mentioned). This is also a good thing, as you can not run out of bound with the index j and the code gets more readable. But I dont think this will speed up the function much.
    2. You do a matrix multiplikation with 3 matrices and you do it n^2 times. But you get only n different products, so you can reuse this. This should looks as follows

    void modKeyGenPrs(list<mat_GF2E>& Prs, list<mat_GF2E> Lst, mat_GF2E L1, mat_GF2E L2)
    {
        // Precompute here
        mat_GF2E L1_trans = transpose(L1);
        Vec<mat_GF2E> prods;
        prods.SetLenght(n);
        for( i_lst = Lst.begin(); i_lst != Lst.end(); i_lst++)
        {
            prod[i_lst] = (L1_trans*(*i_lst)*L1);
        }
    
        // compute the rest here
        for (int i=0; i<m; i++)
        {
            mat_GF2E sum;
            sum.SetDims(n, n);
            list<mat_GF2E>::const_iterator i_lst;
            int j=0;
            for( i_lst = Lst.begin(); i_lst != Lst.end(); i_lst++)
            {
                sum = sum + (L2[i][j]*(prod[i_lst]);
                j = j + 1;
            }
            Prs.push_back(sum);
        }
    }
    

    Notice: I used the variable n, as you do in your code, but I dont know where it comes from. also the variable m.