c++ opengl perspective camera depth-testing

When and how does OpenGL calculate F_depth (the depth value)?


Meaning at this point the projection has already been done. This article gives us the projection matrix used by OpenGL, and the row that affects the z-coordinate of a point is:

[ 0 0 -(f+n)/(f-n) -2fn/(f-n) ]

Note that this matrix is constructed to map the ‘pyramidal’ frustum to a unit cube, meaning the z-coordinate has also been mapped to [0,1] after this matrix is applied.

Then, in the depth value precision chapter, the author tells us: "These z-values in view space can be any value between the frustum’s near and far plane and we need some way to transform them to [0,1]." My question is why this is needed at this point, when the z-coordinate had already been mapped by the projection matrix.

He also says that a linear depth buffer like F_depth = (z - near)/(far - near) is never used; for correct projection properties a non-linear depth equation is used:

F_depth = (1/z - 1/near)/(1/far - 1/near)

But, as we have seen, z is mapped within range using:

[ 0 0 -(f+n)/(f-n) -2fn/(f-n) ]

Which appears to be linear.

All these contradictory statements leave me confused about when the depth of a fragment is calculated and compared, and which equation is actually used to compute it. As I understand it, nothing more should be calculated for depth after the OpenGL projection matrix is applied, but after reading this I’m no longer sure. Any clarifications?


Solution

  • With a perspective projection, the depth is not linear, because of the perspective divide.

    When a vertex coordinate is transformed by the projection matrix, the clip space coordinate is computed. The clip space coordinate is a homogeneous coordinate. All geometry that lies outside the clip volume (that is, outside the viewing frustum) is then clipped. The clipping rule is:

    -w <=  x, y, z  <= w
    

    After that, the normalized device coordinates (NDC) are computed by dividing the x, y, z components by the w component (perspective divide). NDC are Cartesian coordinates, and normalized device space is a cube with the left, bottom, near corner at (-1, -1, -1) and the right, top, far corner at (1, 1, 1). All geometry inside the cube is projected onto the 2-dimensional viewport.
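
    The clipping rule and the perspective divide can be sketched in a few lines of C++. The types `Vec4` and `Vec3` and the function names are made up for this illustration and are not part of OpenGL; the GPU performs these steps itself after the vertex shader.

```cpp
// Hypothetical minimal vector types, for illustration only.
struct Vec4 { double x, y, z, w; };
struct Vec3 { double x, y, z; };

// Clip volume test: a clip-space point is kept if -w <= x, y, z <= w.
bool insideClipVolume(const Vec4& c) {
    return -c.w <= c.x && c.x <= c.w &&
           -c.w <= c.y && c.y <= c.w &&
           -c.w <= c.z && c.z <= c.w;
}

// Perspective divide: clip space -> normalized device coordinates.
// Points that pass the clip test with w > 0 never divide by zero here.
Vec3 perspectiveDivide(const Vec4& c) {
    return { c.x / c.w, c.y / c.w, c.z / c.w };
}
```

    Every point that passes `insideClipVolume` with a positive w ends up with all three NDC components in [-1, 1].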

    Note that after the homogeneous vertex coordinate is multiplied by the perspective projection matrix (clip space), the z component is "linear" in the eye space z, but it is not in the range [-1, 1]. After clipping and the perspective divide, the z coordinate is in the range [-1, 1] (NDC), but it is no longer "linear": the clip space w component equals the negated eye space z, so the division introduces a 1/z term.

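    The depth buffer can store values in the range [0, 1]. Hence the z component of the normalized device space has to be mapped from [-1.0, 1.0] to [0.0, 1.0]. This mapping is part of the viewport transformation and is controlled by glDepthRange, whose default range is [0, 1].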


    With a perspective projection, the projection matrix describes the mapping from 3D points in the world, as they are seen from a pinhole camera, to 2D points of the viewport.
    The eye space coordinates in the camera frustum (a truncated pyramid) are mapped to a cube (the normalized device coordinates).

    A perspective projection matrix can be defined by a frustum.
    The distances left, right, bottom and top are the distances from the center of the view to the side faces of the frustum, on the near plane. near and far specify the distances to the near and far plane of the frustum. In the listing that follows, each line is one column of the matrix (OpenGL's column-major memory layout).

    r = right, l = left, b = bottom, t = top, n = near, f = far
    
    x:    2*n/(r-l)      0              0                0
    y:    0              2*n/(t-b)      0                0
    z:    (r+l)/(r-l)    (t+b)/(t-b)    -(f+n)/(f-n)    -1
    t:    0              0              -2*f*n/(f-n)     0
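
    As a sketch, the same matrix can be built in C++ as a flat array in column-major order, which is the layout OpenGL expects when a matrix is uploaded without transposing. The function name `frustum` is chosen for this example only.

```cpp
#include <array>

// Column-major 4x4 matrix: element (row, col) is stored at m[col * 4 + row].
std::array<double, 16> frustum(double l, double r, double b, double t,
                               double n, double f) {
    return {
        2.0 * n / (r - l), 0.0,               0.0,                    0.0,  // x column
        0.0,               2.0 * n / (t - b), 0.0,                    0.0,  // y column
        (r + l) / (r - l), (t + b) / (t - b), -(f + n) / (f - n),    -1.0,  // z column
        0.0,               0.0,               -2.0 * f * n / (f - n), 0.0   // t column
    };
}
```

    The entries `m[10]` and `m[14]` are the two coefficients that produce the clip space z, and `m[11] = -1` is what makes the clip space w equal to the negated eye space z.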
    

    If the projection is symmetrical and the line of sight is the axis of symmetry of the frustum, the matrix can be simplified:

    a  = w / h
    ta = tan( fov_y / 2 );
    
    2 * n / (r-l) = 1 / (ta * a)
    2 * n / (t-b) = 1 / ta
    (r+l)/(r-l)   = 0
    (t+b)/(t-b)   = 0
    

    The symmetric perspective projection matrix is:

    x:    1/(ta*a)  0      0              0
    y:    0         1/ta   0              0
    z:    0         0     -(f+n)/(f-n)   -1
    t:    0         0     -2*f*n/(f-n)    0
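
    A corresponding sketch for the symmetric case, again in column-major order. The function name `perspective` and its parameter order are assumptions made for this example; `fovY` is the vertical field of view in radians and `aspect` is the viewport width divided by its height.

```cpp
#include <array>
#include <cmath>

// Symmetric perspective projection, column-major: m[col * 4 + row].
std::array<double, 16> perspective(double fovY, double aspect,
                                   double n, double f) {
    const double ta = std::tan(fovY / 2.0);
    return {
        1.0 / (ta * aspect), 0.0,      0.0,                    0.0,  // x column
        0.0,                 1.0 / ta, 0.0,                    0.0,  // y column
        0.0,                 0.0,      -(f + n) / (f - n),    -1.0,  // z column
        0.0,                 0.0,      -2.0 * f * n / (f - n), 0.0   // t column
    };
}
```

    The z and t columns are the same as in the general frustum matrix, so the depth computation does not depend on the field of view or the aspect ratio.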
    

    See also

    What exactly are eye space coordinates?

    How to render depth linearly in modern OpenGL with gl_FragCoord.z in fragment shader?