
Derivation of (glu)lookAt


I am trying to learn (modern) OpenGL and I am thoroughly confused about the various transformations...

The viewing matrix has me confused, so I need some clarification.

Here's what I have understood about the (conventional) pipeline.

  1. Vertices are specified in world space and are scaled, translated, rotated, etc. to the required positions using the modelling matrix.
  2. (Here's where I start to get confused.) We can (optionally) position a virtual camera at the required location using a "lookAt" function (gluLookAt). I followed the derivation of the matrix here: http://www.youtube.com/watch?v=s9FhcvjM7Hk. I understood it up to the point where the professor calculates the "look-at" vector. He says that the look-at vector = eye - center. This is where I begin to get lost. My first instinct is that the vector should be center - eye. Suppose the center vector is supplied as (0,0,0) and the eye vector is (0,0,5). To look at the object, the camera should point towards center - eye = (0,0,-5). However, the professor states that we want to move center - eye to the -z direction (what does that mean?). Therefore, eye - center will give the look-at direction. I am confused about this. He further adds that in OpenGL there is a camera at the origin looking at (0,0,-1). This I do not understand at all. I do understand that the viewing transformation is nothing but applying the inverse transformation to the objects. I experimented a little and found that when I drew a triangle with a z-value of 1 (and absolutely no modelview/projection transforms), it was still drawn on the screen. I would not expect this, since the camera is at the origin.

Now, to sum up...

Any explanations/pointers?


Solution

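  • When you render a triangle without any modelview or projection transform, the vertices' coordinates are interpreted directly as normalized device coordinates: x runs from -1 (left edge of the viewport) to 1 (right edge), y from -1 (bottom) to 1 (top), and z from -1 (near plane) to 1 (far plane).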

    That's why your simple example renders a visible triangle at the far plane.
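As a minimal sketch of that point, the snippet below checks whether an untransformed vertex survives clipping and which depth value it receives. The function names are hypothetical, and the depth mapping assumes the default glDepthRange(0, 1); it is an illustration, not the driver's actual code path.

```python
def inside_clip_volume(x, y, z, w=1.0):
    """True if the clip-space point survives clipping (-w <= x, y, z <= w)."""
    return all(-w <= c <= w for c in (x, y, z))


def window_depth(z_ndc, near=0.0, far=1.0):
    """Map NDC z in [-1, 1] to window depth, assuming glDepthRange(near, far)."""
    return near + (far - near) * (z_ndc + 1.0) / 2.0


# A triangle vertex at z = 1 with no modelview/projection transform applied.
print(inside_clip_volume(0.0, 0.5, 1.0))   # on the boundary, so it is kept
print(window_depth(1.0))                   # largest depth value: the far plane
print(window_depth(-1.0))                  # smallest depth value: the near plane
```

A vertex with z slightly larger than 1 would fail the same check and be clipped.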

    Now let's come to the view transformation. The transformation is constructed from four vectors: the image of (1, 0, 0), the image of (0, 1, 0), the image of (0, 0, 1), and a translation vector (the camera position). These four vectors describe where the camera sits in the world. Since the view transformation is the inverse of that placement, the resulting matrix has to be inverted.

    You are right that the view direction is center - eye. However, that is not what we need for the matrix. We need the image of (0, 0, 1). OpenGL programs usually use a right-handed coordinate system, in which the camera looks along the negative z-axis. So center - eye is actually the image of (0, 0, -1). The image of (0, 0, 1) is then just eye - center (normalized). That's what you need.
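The construction described above can be sketched in plain Python as follows. This is an illustrative reimplementation under the right-handed convention, not gluLookAt's actual source; the helper names (look_at, transform_point, and the small vector functions) are assumptions made for the example.

```python
import math


def sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def normalize(v):
    length = math.sqrt(dot(v, v))
    return [c / length for c in v]


def look_at(eye, center, up):
    """Return a 4x4 view matrix (list of rows) for a right-handed camera."""
    z_axis = normalize(sub(eye, center))      # image of (0, 0, 1): eye - center
    x_axis = normalize(cross(up, z_axis))     # image of (1, 0, 0)
    y_axis = cross(z_axis, x_axis)            # image of (0, 1, 0)
    # The camera-to-world matrix has x_axis, y_axis, z_axis and eye as columns.
    # Its inverse (the view matrix) uses the axes as rows and a translation
    # of -dot(axis, eye), because the rotation part is orthonormal.
    return [x_axis + [-dot(x_axis, eye)],
            y_axis + [-dot(y_axis, eye)],
            z_axis + [-dot(z_axis, eye)],
            [0.0, 0.0, 0.0, 1.0]]


def transform_point(m, p):
    """Apply a 4x4 matrix to a 3D point (w = 1) and return the xyz part."""
    v = list(p) + [1.0]
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(3)]


view = look_at(eye=[0.0, 0.0, 5.0], center=[0.0, 0.0, 0.0], up=[0.0, 1.0, 0.0])
print(transform_point(view, [0.0, 0.0, 0.0]))  # the center, in eye space
print(transform_point(view, [0.0, 0.0, 5.0]))  # the eye, in eye space
```

With the eye at (0,0,5) and the center at (0,0,0) from the question, the center ends up on the negative z-axis in eye space and the eye ends up at the origin, which is exactly the "camera at the origin looking down -z" convention.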

    With this definition you will also need an appropriate projection transformation. Otherwise you will only see things behind the camera (because that is where the z-coordinate is positive and, hence, where points get a positive depth value). The projection transformation is responsible for turning negative z-coordinates into positive depth values.
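To illustrate how a projection turns negative eye-space z into positive depth, here is a small sketch using the depth rows of the standard OpenGL-style perspective matrix (the form produced by glFrustum and gluPerspective). The function names and the near/far values are assumptions for the example, and the final mapping assumes the default glDepthRange(0, 1).

```python
def ndc_depth(z_eye, near, far):
    """NDC z of an eye-space point under a standard OpenGL-style perspective matrix.

    Only the third and fourth rows of the matrix affect depth:
        z_clip = -(far + near) / (far - near) * z_eye - 2 * far * near / (far - near)
        w_clip = -z_eye
    """
    z_clip = -(far + near) / (far - near) * z_eye - 2.0 * far * near / (far - near)
    w_clip = -z_eye
    return z_clip / w_clip


def window_depth(z_ndc):
    """Map NDC z in [-1, 1] to depth in [0, 1], assuming the default glDepthRange(0, 1)."""
    return (z_ndc + 1.0) / 2.0


near, far = 1.0, 10.0
for z_eye in (-1.0, -5.0, -10.0):          # points in front of the camera
    print(z_eye, window_depth(ndc_depth(z_eye, near, far)))
```

Points at z = -near land on the near plane, points at z = -far land on the far plane, and everything in between gets a depth value that grows with distance from the camera.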