I'm having performance issues when trying to check whether integer n
is a perfect square (sqrt is a whole number) when using BigInt
. Using normal numbers below Number.MAX_SAFE_INTEGER
gives reasonable performance, but attempting to use BigInt
even with the same number range causes a huge performance hit.
The program solves the Battle of Hastings perfect square riddle put forth by Sam Loyd whereby my program iterates over the set of real numbers n
(in this example, up to 7,000,000) to find instances where variable y
is a whole number (perfect square). I'm interested in the original square root of one of the 13 perfect squares where this condition is satisfied, which is what my code generates (there's more than one).
Assuming y^2 < Number.MAX_SAFE_INTEGER
which is 2^53 – 1
, this can be done without BigInt
and runs in ~60ms on my machine:
const limit = 7_000_000;
var a = [];
console.time('regular int');
for (let n = 1; n < limit; n++) {
if (Math.sqrt(Math.pow(n, 2) * 13 + 1) % 1 === 0)
a.push(n);
}
console.log(a.join(', '));
console.timeEnd('regular int');
Being able to use BigInt
would mean I could test for numbers much higher than the inherent number variable limit 2^53 - 1
, but BigInt
seems inherently slower; unusably so. To test whether a BigInt
is a perfect square, I have to use a third party library as Math.sqrt
doesn't exist for BigInt
such that I can check if the root is perfect, as all sqrt
returns a floor value. I adapted functions for this from a NodeJS library, bigint-isqrt and bigint-is-perfect-square.
Thus, using BigInt
with the same limit of 7,000,000 runs 35x slower:
var integerSQRT = function(value) {
if (value < 2n)
return value;
if (value < 16n)
return BigInt(Math.sqrt(Number(value)) | 0);
let x0, x1;
if (value < 4503599627370496n)
x1 = BigInt(Math.sqrt(Number(value))|0) - 3n;
else {
let vlen = value.toString().length;
if (!(vlen & 1))
x1 = 10n ** (BigInt(vlen / 2));
else
x1 = 4n * 10n ** (BigInt((vlen / 2) | 0));
}
do {
x0 = x1;
x1 = ((value / x0) + x0) >> 1n;
} while ((x0 !== x1 && x0 !== (x1 - 1n)));
return x0;
}
function perfectSquare(n) {
// Divide n by 4 while divisible
while ((n & 3n) === 0n && n !== 0n) {
n >>= 2n;
}
// So, for now n is not divisible by 2
// The only possible residual modulo 8 for such n is 1
if ((n & 7n) !== 1n)
return false;
return n === integerSQRT(n) ** 2n;
}
const limit = 7_000_000;
var a = [];
console.time('big int');
for (let n = 1n; n < limit; n++) {
if (perfectSquare(((n ** 2n) * 13n) + 1n))
a.push(n);
}
console.log(a.join(', '));
console.timeEnd('big int');
Ideally I'm interested in doing this with a much higher limit than 7 million, but I'm unsure whether I can optimise the BigInt
version any further. Any suggestions?
You may be pleased to learn that some recent improvements on V8 have sped up the BigInt version quite a bit; with a recent V8 build I'm seeing your BigInt version being about 12x slower than the Number version.
A remaining challenge is that implementations of BigInt-sqrt
are typically based on Newton iteration and hence need an estimate for a starting value, which should be near the final result, so about half as wide as the input, which is given by log2(X)
or bitLength(X)
. Until this proposal gets anywhere, that can best be done by converting the BigInt to a string and taking that string's length, which is fairly expensive.
To get faster right now, @Ouroborus' idea is great. I was curious how fast it would be, so I implemented it:
(function betterAlgorithm() {
const limit = 7_000_000n;
var a = [];
console.time('better algorithm');
let m = 1n;
let m_squared = 1n;
for (let n = 1n; n < limit; n += 1n) {
let y_squared = n * n * 13n + 1n;
while (y_squared > m_squared) {
m += 1n;
m_squared = m * m;
}
if (y_squared === m_squared) {
a.push(n);
}
}
console.log(a.join(', '));
console.timeEnd('better algorithm');
})();
As a particular short-term detail, this uses += 1n
instead of ++
, because as of today, V8 hasn't yet gotten around to optimizing ++
for BigInts. This difference should disappear eventually (hopefully soon).
On my machine, this version takes only about 4x as much time as your original Number-based implementation.
For larger numbers, I would expect some gains from replacing the multiplications with additions (based on the observation that the delta between consecutive square numbers grows linearly), but for small-ish upper limits that appears to be a bit slower. If you want to toy around with it, this snippet describes the idea:
let m_squared = 1n; // == 1*1
let m_squared_delta = 3n; // == 2*2 - 1*1
let y_squared = 14n; // == 1*1*13+1
let y_squared_delta = 39n; // == 2*2*13+1 - 1*1*13+1
for (let n = 1; n < limit; n++) {
while (y_squared > m_squared) {
m_squared += m_squared_delta;
m_squared_delta += 2n;
}
if (y_squared === m_squared) {
a.push(n);
}
y_squared += y_squared_delta;
y_squared_delta += 26n;
}
The earliest where this could possibly pay off is when the results exceed 2n**64n
; I wouldn't be surprised if it wasn't measurable before 2n**256n
or so.