Tags: javascript, optimization, integer, bigint, sqrt

Programmatically solving Sam Loyd's Battle of Hastings puzzle - performance issues with BigInt


I'm having performance issues when checking whether an integer n is a perfect square (i.e. its square root is a whole number) using BigInt. With regular numbers below Number.MAX_SAFE_INTEGER the performance is reasonable, but switching to BigInt, even over the same number range, causes a huge performance hit.

The program solves the Battle of Hastings perfect-square riddle put forth by Sam Loyd: it iterates over the integers n (in this example, up to 7,000,000) looking for instances where y = sqrt(13n^2 + 1) is a whole number, i.e. where 13n^2 + 1 is a perfect square. For example, n = 180 qualifies, since 13 * 180^2 + 1 = 421,201 = 649^2. I'm interested in n itself, the side of one of the 13 original squares, which is what my code outputs (there's more than one solution).

Assuming y^2 < Number.MAX_SAFE_INTEGER, which is 2^53 - 1, this can be done without BigInt and runs in ~60ms on my machine:

const limit = 7_000_000;
var a = [];

console.time('regular int');

for (let n = 1; n < limit; n++) {
   if (Math.sqrt(Math.pow(n, 2) * 13 + 1) % 1 === 0)
      a.push(n);
}
console.log(a.join(', '));

console.timeEnd('regular int');

Being able to use BigInt would mean I could test numbers much higher than the inherent 2^53 - 1 limit of regular numbers, but BigInt seems inherently slower; unusably so. To test whether a BigInt is a perfect square I have to rely on third-party code, since Math.sqrt doesn't exist for BigInt and an integer square root only gives the floor of the root, so I then have to check whether squaring that root returns the original value. I adapted functions for this from the Node.js packages bigint-isqrt and bigint-is-perfect-square.

Thus, using BigInt with the same limit of 7,000,000 runs 35x slower:

var integerSQRT = function(value) {
    if (value < 2n)
        return value;

    // Small values: the Number-based Math.sqrt is exact enough.
    if (value < 16n)
        return BigInt(Math.sqrt(Number(value)) | 0);

    let x0, x1;
    if (value < 4503599627370496n) // < 2^52: seed the iteration from the float sqrt
        x1 = BigInt(Math.sqrt(Number(value)) | 0) - 3n;
    else {
        // Seed from the decimal length: a d-digit number has a root of roughly 10^(d/2).
        let vlen = value.toString().length;
        if (!(vlen & 1))
            x1 = 10n ** BigInt(vlen / 2);
        else
            x1 = 4n * 10n ** BigInt((vlen / 2) | 0);
    }

    // Newton's (Heron's) iteration; converges to floor(sqrt(value)).
    do {
        x0 = x1;
        x1 = ((value / x0) + x0) >> 1n;
    } while (x0 !== x1 && x0 !== (x1 - 1n));
    return x0;
}

function perfectSquare(n) {
    // Divide out factors of 4 (n is a perfect square iff n/4 is)
    while ((n & 3n) === 0n && n !== 0n) {
        n >>= 2n;
    }
    // What remains has no factor of 4, so if it is a perfect square it must be
    // an odd square, and every odd square is congruent to 1 modulo 8
    if ((n & 7n) !== 1n)
        return false;
    return n === integerSQRT(n) ** 2n;
}

const limit = 7_000_000;
var a = [];

console.time('big int');

for (let n = 1n; n < limit; n++) {
   if (perfectSquare(((n ** 2n) * 13n) + 1n))
      a.push(n);
}
console.log(a.join(', '));

console.timeEnd('big int');

Ideally I'm interested in doing this with a much higher limit than 7 million, but I'm unsure whether I can optimise the BigInt version any further. Any suggestions?


Solution

  • You may be pleased to learn that some recent improvements in V8 have sped up the BigInt implementation quite a bit; with a recent V8 build I'm seeing your BigInt version run only about 12x slower than the Number version.

    A remaining challenge is that BigInt-sqrt implementations are typically based on Newton iteration and hence need a starting estimate that is reasonably close to the final result, i.e. about half as wide as the input; that width is what log2(X) or bitLength(X) would give you. Until this proposal gets anywhere, the best available option is to convert the BigInt to a string and take that string's length, which is fairly expensive.
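
    For illustration, here is a minimal sketch of that starting-value estimate (mirroring what the question's integerSQRT already does; isqrtStart is just an illustrative name, not any library's API): a value with d decimal digits is less than 10^d, so its square root is less than 10^ceil(d/2), which is close enough to seed the Newton iteration.

      // Illustrative helper: rough starting value for Newton's method on sqrt(value).
      // A value with d decimal digits is < 10^d, so its square root is < 10^ceil(d/2).
      function isqrtStart(value) {
        const d = value.toString().length;        // the "fairly expensive" string conversion
        return 10n ** BigInt(Math.ceil(d / 2));   // upper bound on the square root
      }
      // e.g. isqrtStart(144n) === 100n, and the Newton iteration converges down to 12n.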

    To get faster right now, @Ouroborus' idea is great. I was curious how fast it would be, so I implemented it:

    (function betterAlgorithm() {
      const limit = 7_000_000n;
      var a = [];
    
      console.time('better algorithm');
    
      // Walk the candidate root m upward alongside n: y_squared grows monotonically,
      // so m only ever needs to move forward and no square root is computed at all.
      let m = 1n;
      let m_squared = 1n;
      for (let n = 1n; n < limit; n += 1n) {
        let y_squared = n * n * 13n + 1n;
        while (y_squared > m_squared) {
          m += 1n;
          m_squared = m * m;
        }
        if (y_squared === m_squared) {
          a.push(n);
        }
      }
      console.log(a.join(', '));
    
      console.timeEnd('better algorithm');
    })();
    

    As a particular short-term detail, this uses += 1n instead of ++, because as of today, V8 hasn't yet gotten around to optimizing ++ for BigInts. This difference should disappear eventually (hopefully soon).

    On my machine, this version takes only about 4x as much time as your original Number-based implementation.

    For larger numbers, I would expect some gains from replacing the multiplications with additions (based on the observation that the delta between consecutive square numbers grows linearly), but for small-ish upper limits that appears to be a bit slower. If you want to toy around with it, this snippet describes the idea:

      let m_squared = 1n;         // == 1*1
      let m_squared_delta = 3n;   // == 2*2 - 1*1
      let y_squared = 14n;        // == 1*1*13+1
      let y_squared_delta = 39n;  // == (2*2*13+1) - (1*1*13+1)
      for (let n = 1; n < limit; n++) {
        while (y_squared > m_squared) {
          m_squared += m_squared_delta;
          m_squared_delta += 2n;
        }
        if (y_squared === m_squared) {
          a.push(n);
        }
        y_squared += y_squared_delta;
        y_squared_delta += 26n;
      }
    

    The earliest where this could possibly pay off is when the results exceed 2n**64n; I wouldn't be surprised if it wasn't measurable before 2n**256n or so.