algorithmgpufpu

Is matrix multiplication speed faster for sparse matrixes than dense matrixes?


Is matrix multiplication speed faster for sparse matrixes than dense matrixes? To give a simplified example, does "[[0,0],[0,0]] multiplies [[1,1],[1,1]]" faster than "[[256,256],[256,256]] multiplies [[1,1],[1,1]]" ?


Solution

  • The machine code algorithm to do a multiplication goes like this:

    int mul(int a,int b)
     {
        int result = 0;
        bit sign = sign(a) ^ sign(b);
        a = abs(a); b = abs(b);
    
        while (b != 0)
         {
            b = b>>1; // shift b right, bit0 into carry
            if (carrySet()) result += a;
            a = a<<1; // shift a left
            // note: checks for overflow being left out
         }
        return (sign==0 ? sum : -sum);
     }
    

    You'll easily see that the more bits are set in the right operand, the more computations are necessary to sum up the left operand. So, provided that your matrix multiplication boils down to machine code multiplications like this, a sparse matrix will multiply significantly faster than a dense matrix.

    The question I cannot answer here is if the FPU will do this in a more efficient manner. You'll want to read some specs here. But even if a FPU (or GPU) is doing some sort of tweaking, I doubt the basic multiplication grinding loop looks very much different (interested in comments about this.)