webgpuwgsl

Why the 2Sum algorithm in WGSL doesn't work?


I've been trying to create a version in WGSL of 2Sum for increasing floating point precision, but it doesn't work. I wonder if there is any kind of optimization in WGSL that prevents the correct computation of 2sum?

Code sample using Knuth's version:

struct float2 {
  x: f32,
  y: f32
}

fn twoSum(a: f32, b: f32) -> float2 {
  var x = a + b;
  var bv = x - a;
  var av = x - bv;
  var br = b - bv;
  var ar = a - av;
  return float2(x, ar+br);
}

twoSum(1e10, 1e-10).y should be 1e-10, but instead it is 0. Actually, I couldn't find any example with twoSum(...).y != 0 with the code above.

It works in other programming languages. For instance, in C++:

#include <iostream>
#include <limits.h>

int main()
{
    std::cout<<sizeof (float) * CHAR_BIT << "\n";
    float a = 1e10;
    float b = 1e-10;
    float x = a + b;
    float bv = x - a;
    float av = x - bv;
    float br = b - bv;
    float ar = a - av;
    std::cout<< ar + br;

    return 0;
}

It shows that the size of float is 32 bits, and prints 1e-10.

In Python:

def twoSum(a, b):
    x = a + b
    bv = x - a
    av = x - bv
    br = b - av
    ar = a - av
    return x, ar + br

twoSum(1e10, 1e-10)

It correctly returns 1e-10 in the second element. Python uses float64 for numbers, but that's fine since 1e-10 + 1e-10 is not representable in float64 as well.


Solution

  • This issue is still being discussed in the spec. For now, there's no way to tell WebGPU/WGSL that you prefer precision over speed.

    https://github.com/gpuweb/gpuweb/pull/2080

    https://github.com/gpuweb/gpuweb/issues/2076