javascript 3d webgl projection gl-matrix

How do I get the 3D coordinates of a clicked point, and of the current pixel in a fragment shader, in WebGL?


I draw a unit sphere of objects (stars, sprites, etc.), all of which have 3D coordinates on the "surface" of that sphere. The camera is at the center of the sphere, and I use a perspective projection to show the resulting sky chart. Here's a view from the center of the sphere (how my app will be shown to a user): [image]

And from the side (for you to better understand what I mean by sphere of objects): [image]

I have read numerous articles on unprojections and ray casting, but they eventually

Since I have no models and no camera movement other than rotations, I have super simple matrix code:

const projectionMatrix = mat4.create()

mat4.perspective(projectionMatrix,
  degreesToRad(fov),
  viewport.x / viewport.y,
  0,
  100)

const panningMatrix = mat4.create()
// in place of this comment, rotations are applied to it as the user pans the map
mat4.invert(panningMatrix, panningMatrix)

const groundViewMatrix = mat4.create()
mat4.multiply(groundViewMatrix, panningMatrix, groundViewMatrix)

// some-shader.vert; u_modelViewMatrix is groundViewMatrix in this case
gl_Position = u_projectionMatrix * u_modelViewMatrix * a_position;

So, how do I get the 3D coordinates on this unit sphere:

  1. on JavaScript side for a point I click on (so that I can transform them to RA/Dec and display to the user)?
  2. in a fragment shader for the currently drawn pixel (I want to draw a quad to shade the sky in a realistic way; for this I want to calculate the pixel's angular distance from the Sun, the pixel's height above the horizon, etc., all in the shader code)?

Solution

  • To approach this, we want to reverse the transform chain: [image: OpenGL Transformation Pipeline]

    So the steps are:

    Transform pixel coordinates back to NDC (normalized device coordinates)

    Normalized device coordinate space is a cube spanning -1 to 1 on each axis, so to map the screen pixel coordinates into that cube we can simply do this:

    ndcX = (screenX / screenWidth) * 2 - 1
    ndcY = (screenY / screenHeight) * 2 - 1
    ndcZ = ???
    

    In matrix form this would be expressed as:

    [image: reverse viewport transform matrix]

    Note: WebGL's Y pixel coordinates are inverted compared to the browser's, so if you're using the click event's clientY/offsetY property you need to flip it.
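
As a minimal sketch of this step, assuming the click position comes from a mouse event's offsetX/offsetY and the canvas's CSS size is passed in, the conversion including the Y flip could look like the following. The name screenToNdc is hypothetical and not part of any library.

```javascript
// Hypothetical helper: converts a click position (CSS pixels, origin at the
// top-left corner of the canvas) to normalized device coordinates in [-1, 1].
function screenToNdc(offsetX, offsetY, width, height) {
  const ndcX = (offsetX / width) * 2 - 1;
  // Browser Y grows downwards while NDC Y grows upwards, so flip the axis.
  const ndcY = 1 - (offsetY / height) * 2;
  return [ndcX, ndcY];
}
```

In a click handler this might be called as screenToNdc(e.offsetX, e.offsetY, canvas.clientWidth, canvas.clientHeight), assuming the listener is attached to the canvas itself.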

    Unproject NDC coordinates to camera space coordinates

    With the previous step we calculated the NDC coordinates on a 2D plane, with unknown depth within NDC space. To take the point back to 3D, note that a 2D point is the collapsed representation (a projection) of a ray in 3D space, and representing that ray requires at least two points on it (a line segment). For the commonly used projection matrices (like the ones generated by gl-matrix's perspective and ortho) we can simply choose the near and far extents of the volume, giving us two coordinate sets:

    ndc1 = [ndcX, ndcY, -1]
    ndc2 = [ndcX, ndcY,  1]
    

    Transforming those using the inverse of the projection matrix will give us the points on the near and far clipping planes in camera space.

    near = inverseProjectionMatrix * ndc1
    far = inverseProjectionMatrix * ndc2
    

    Note: You want to use gl-matrix's vec3.transformMat4 here, which also performs the perspective divide (division by w).
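
To illustrate why the divide by w matters, here is a dependency-free sketch that mirrors what vec3.transformMat4 does for a column-major 4x4 matrix (the layout gl-matrix uses). The names transformPoint and unprojectToCameraSpace are hypothetical, and the inverse projection matrix is assumed to be computed elsewhere, for example with mat4.invert. Note that a perspective matrix built with a near plane of 0, as in the question, is singular and cannot be inverted, so this assumes a small positive near value.

```javascript
// Mirrors the behavior of gl-matrix's vec3.transformMat4: treats p as
// (x, y, z, 1), multiplies by the column-major 4x4 matrix m, then divides by w.
function transformPoint(m, p) {
  const [x, y, z] = p;
  const w = (m[3] * x + m[7] * y + m[11] * z + m[15]) || 1.0;
  return [
    (m[0] * x + m[4] * y + m[8] * z + m[12]) / w,
    (m[1] * x + m[5] * y + m[9] * z + m[13]) / w,
    (m[2] * x + m[6] * y + m[10] * z + m[14]) / w,
  ];
}

// Hypothetical helper: returns the camera-space points on the near and far
// clipping planes that correspond to the given NDC x/y.
function unprojectToCameraSpace(ndcX, ndcY, inverseProjectionMatrix) {
  return {
    near: transformPoint(inverseProjectionMatrix, [ndcX, ndcY, -1]),
    far: transformPoint(inverseProjectionMatrix, [ndcX, ndcY, 1]),
  };
}
```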

    Go from camera space to world space

    To undo the camera transforms we simply transform our two points using the inverse view matrix.

    worldNear = inverseViewMatrix * near
    worldFar = inverseViewMatrix * far
    

    Note: Again, you want to use gl-matrix's vec3.transformMat4 here.
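
Since the question's camera only rotates, one possible shortcut is to use the fact that the inverse of a pure rotation matrix is its transpose. The sketch below assumes a column-major 4x4 view matrix with no translation or scale. The names invertRotation and rotatePoint are hypothetical; in real code mat4.invert handles the general case.

```javascript
// Transposes the rotation part of a column-major 4x4 matrix. For a pure
// rotation (no translation or scale) the transpose equals the inverse.
function invertRotation(m) {
  return [
    m[0], m[4], m[8], 0,
    m[1], m[5], m[9], 0,
    m[2], m[6], m[10], 0,
    0, 0, 0, 1,
  ];
}

// Applies the rotation part of a column-major 4x4 matrix to a 3D point.
function rotatePoint(m, [x, y, z]) {
  return [
    m[0] * x + m[4] * y + m[8] * z,
    m[1] * x + m[5] * y + m[9] * z,
    m[2] * x + m[6] * y + m[10] * z,
  ];
}
```

With these, a camera-space point would be taken to world space as rotatePoint(invertRotation(viewMatrix), near), and likewise for far.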

    Go from line segment to ray

    To get the point on the unit sphere we just convert our line segment to a ray representation, which is commonly expressed as a ray origin and a normalized ray direction, the latter being the point on the unit sphere you're looking for. To do so we subtract worldNear from worldFar and normalize the resulting vector.

    rayOrigin = worldNear
    rayDirection = normalize(worldFar - worldNear)
    

    Note that if you specify your camera position in world space, or you don't translate the scene in any way (camera position being 0,0,0), you only need to do all of this for far, since rayOrigin will simply be cameraPosition.
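
Putting the steps together for the question's setup (camera at the origin, so only the far point is needed), a dependency-free sketch might look like this. The function name ndcToSphereDirection is hypothetical, both inverse matrices are assumed to be column-major 4x4 arrays computed elsewhere (for example with mat4.invert), and the projection is assumed to use a positive near plane so that it is invertible.

```javascript
// Hypothetical helper: maps NDC x/y to the unit-length world-space direction
// of the clicked point, assuming the camera sits at the world origin.
function ndcToSphereDirection(ndcX, ndcY, inverseProjectionMatrix, inverseViewMatrix) {
  // Point transform with perspective divide (column-major, as in gl-matrix).
  const transformPoint = (m, [x, y, z]) => {
    const w = (m[3] * x + m[7] * y + m[11] * z + m[15]) || 1.0;
    return [
      (m[0] * x + m[4] * y + m[8] * z + m[12]) / w,
      (m[1] * x + m[5] * y + m[9] * z + m[13]) / w,
      (m[2] * x + m[6] * y + m[10] * z + m[14]) / w,
    ];
  };
  // Far-plane point in camera space, then in world space.
  const far = transformPoint(inverseProjectionMatrix, [ndcX, ndcY, 1]);
  const worldFar = transformPoint(inverseViewMatrix, far);
  // The ray starts at the origin, so the direction is normalize(worldFar).
  const length = Math.hypot(worldFar[0], worldFar[1], worldFar[2]);
  return worldFar.map((component) => component / length);
}
```

The returned vector is the point on the unit sphere under the cursor; converting it to RA/Dec depends on the axis convention used for the star coordinates, which the question does not specify.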