I've been trying to implement 2Sum in WGSL to increase floating-point precision, but it doesn't work. I wonder if there is some kind of optimization in WGSL that prevents the correct computation of 2Sum?
Code sample using Knuth's version:
struct float2 {
    x: f32,
    y: f32
}

fn twoSum(a: f32, b: f32) -> float2 {
    var x = a + b;      // rounded sum
    var bv = x - a;     // b as it was actually added
    var av = x - bv;    // a as it was actually added
    var br = b - bv;    // rounding error on b
    var ar = a - av;    // rounding error on a
    return float2(x, ar + br);  // (sum, error term)
}
twoSum(1e10, 1e-10).y should be 1e-10, but instead it is 0. Actually, I couldn't find any inputs for which twoSum(...).y != 0 with the code above.
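For reference, 1e10 is exactly representable in f32 and 1e-10 is far below one ULP at that magnitude, so under strict IEEE-754 evaluation the sum rounds to 1e10 and the error term should recover roughly 1e-10. A minimal compute-shader harness along these lines (the binding layout and entry-point name are just placeholders; it assumes the struct and function above are in the same module) can be used to read the result back on the CPU:
@group(0) @binding(0) var<storage, read_write> result: vec2<f32>;

@compute @workgroup_size(1)
fn main() {
    // Write both the rounded sum and the recovered error term to a buffer
    // so they can be inspected on the CPU side.
    let r = twoSum(1e10, 1e-10);
    result = vec2<f32>(r.x, r.y);  // y comes back as 0.0 here instead of 1e-10
}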
It works in other programming languages. For instance, in C++:
#include <iostream>
#include <limits.h>

int main()
{
    std::cout << sizeof(float) * CHAR_BIT << "\n";
    float a = 1e10;
    float b = 1e-10;
    float x = a + b;
    float bv = x - a;
    float av = x - bv;
    float br = b - bv;
    float ar = a - av;
    std::cout << ar + br;
    return 0;
}
It shows that the size of float is 32 bits, and prints 1e-10.
In Python:
def twoSum(a, b):
    x = a + b
    bv = x - a
    av = x - bv
    br = b - bv
    ar = a - av
    return x, ar + br
twoSum(1e10, 1e-10)
It correctly returns 1e-10 in the second element. Python uses float64 for numbers, but that's fine, since 1e10 + 1e-10 is not exactly representable in float64 either (the two magnitudes are about 66 binary orders apart, which is more than a 52-bit mantissa can span).
This issue is still being discussed in the spec. For now, there's no way to tell WebGPU/WGSL that you prefer precision over speed.
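As a rough illustration (a hypothetical rewrite, not the output of any particular compiler), an implementation that is permitted to treat f32 expressions as exact real-number arithmetic can legally simplify every step of twoSum away, which matches the 0 being observed:
fn twoSumSimplified(a: f32, b: f32) -> float2 {
    // bv = (a + b) - a   simplifies to b
    // av = (a + b) - b   simplifies to a
    // br = b - bv        simplifies to 0.0
    // ar = a - av        simplifies to 0.0
    return float2(a + b, 0.0);  // the error term vanishes
}
Error-free transformations like 2Sum depend on every operation being an individually correctly rounded IEEE operation, so any such rewriting breaks them.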