# Screen Coordinates to World Coordinates

I want to convert from screen coordinates to world coordinates in `OpenGL`. I am using `glm` for that purpose (I am also using `glfw`).

This is my code:

```
static void mouse_callback(GLFWwindow* window, int button, int action, int mods)
{
    if (button == GLFW_MOUSE_BUTTON_LEFT) {
        if (GLFW_PRESS == action) {
            int height = 768, width = 1024;
            double xpos, ypos, zpos;
            glfwGetCursorPos(window, &xpos, &ypos);
            glReadPixels(xpos, ypos, 1, 1, GL_DEPTH_COMPONENT, GL_FLOAT, &zpos);
            glm::mat4 m_projection = glm::perspective(glm::radians(45.0f), (float)(1024/768), 0.1f, 1000.0f);
            glm::vec3 win(xpos, height - ypos, zpos);
            glm::vec4 viewport(0.0f, 0.0f, (float)width, (float)height);
            glm::vec3 world = glm::unProject(win, mesh.getView() * mesh.getTransform(), m_projection, viewport);
            std::cout << "screen " << xpos << " " << ypos << " " << zpos << std::endl;
            std::cout << "world " << world.x << " " << world.y << " " << world.z << std::endl;
        }
    }
}
```

Now, I have two problems. The first is that the world vector I get from `glm::unProject` has very small x, y and z values. If I use these values to translate the mesh, the mesh only moves slightly and doesn't follow the mouse pointer.

The second problem is that, as stated in the glm docs (https://glm.g-truc.net/0.9.8/api/a00169.html#ga82a558de3ce42cbeed0f6ec292a4e1b3), the result is returned in object coordinates. So in order to convert screen to world coordinates I should use a transform matrix from one mesh, but what happens if I have many meshes and I want to convert from screen to world coordinates? What model matrix should I multiply by the camera view matrix to form the ModelView matrix?

Solution

There are a couple of issues with this sequence:

`glfwGetCursorPos(window, &xpos, &ypos); glReadPixels(xpos, ypos, 1, 1, GL_DEPTH_COMPONENT, GL_FLOAT, &zpos); [...] glm::vec3 win(xpos,height - ypos, zpos);`

- Window space origin. `glReadPixels` is a GL function, and as such adheres to GL's conventions, with the origin being the lower-left pixel. While you flip to that convention for your `win` variable, you still use the wrong origin for reading the depth buffer.

  Furthermore, your flipping is wrong. Since `ypos` should be in `[0,height-1]`, the correct formula is `height-1 - ypos`, so you are also *off by one* here. (We will see later that that isn't exactly true either.)

- "Screen Coordinates" vs. Pixel Coordinates. Your code assumes that the coordinates you get back from GLFW are in pixels. This is not the case. GLFW uses the concept of "virtual screen coordinates" which don't necessarily map to pixels:

  > Pixels and screen coordinates may map 1:1 on your machine, but they won't on every other machine, for example on a Mac with a Retina display. The ratio between screen coordinates and pixels may also change at run-time depending on which monitor the window is currently considered to be on.

  GLFW generally provides two sizes for a window: `glfwGetWindowSize` returns the size in said virtual screen coordinates, while `glfwGetFramebufferSize` returns the actual size in pixels, which is the one relevant for OpenGL. So basically, you must query both sizes, and can then scale the mouse coordinates from screen coordinates to the actual pixels you need.

- Sub-pixel position. While `glReadPixels` addresses a specific pixel with integer coordinates, the whole transformation math works with floating point and can represent arbitrary sub-pixel positions. GL's window space is defined so that integer coordinates represent the *corners* of the pixels; the pixel centers lie at half-integer coordinates. Your `win` variable will represent the lower-left corner of said pixel, but the more useful convention is to use the pixel center, so you'd better add an offset of `(0.5f, 0.5f, 0.0f)` to `win`, assuming you point to the pixel center. (We can do a bit better if the virtual screen coordinates have a higher resolution than our pixels, which means we already get a sub-pixel position for the mouse cursor, but the math won't change, because we still have to switch to GL's convention, where integer means border instead of integer means center.) Note that since we now consider a space going from `[0,w)` in `x` and `[0,h)` in `y`, this also affects the first point. If you click at pixel `(0,0)`, it will have the center `(0.5, 0.5)`, and the `y` flipping should be `h-y`, so `h-0.5` (which should be rounded down towards `h-1` when accessing the framebuffer pixel).

To put it all together, you could do (conceptually):

```
int screen_w, screen_h, pixel_w, pixel_h;
double xpos, ypos;
glfwGetWindowSize(window, &screen_w, &screen_h); // better use the callback and cache the values
glfwGetFramebufferSize(window, &pixel_w, &pixel_h); // better use the callback and cache the values
glfwGetCursorPos(window, &xpos, &ypos);
glm::vec2 screen_pos = glm::vec2(xpos, ypos);
glm::vec2 pixel_pos = screen_pos * glm::vec2(pixel_w, pixel_h) / glm::vec2(screen_w, screen_h); // note: not necessarily integer
pixel_pos = pixel_pos + glm::vec2(0.5f, 0.5f); // shift to GL's center convention
glm::vec3 win = glm::vec3(pixel_pos.x, pixel_h - pixel_pos.y, 0.0f); // flip y to GL's lower-left origin
glReadPixels((GLint)win.x, (GLint)win.y, ..., &win.z); // the cast truncates to the pixel index; depth goes into win.z (a float)
// ... unproject win
```
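The coordinate mapping in that snippet can also be isolated into a small pure function, which makes it easy to check without a GL context. The following is only a sketch: `Vec2`, `cursorToGLWindow` and `pixelIndex` are hypothetical names made up for illustration, and the sizes are assumed to be the values that `glfwGetWindowSize` and `glfwGetFramebufferSize` would report.

```cpp
#include <cmath>

// Hypothetical plain 2D vector so the sketch needs neither GLM nor GLFW.
struct Vec2 {
    double x;
    double y;
};

// Maps a cursor position in GLFW screen coordinates (origin top-left) to
// GL window coordinates (origin bottom-left, pixel centers at half-integers).
// windowSize is assumed to come from glfwGetWindowSize,
// framebufferSize from glfwGetFramebufferSize.
Vec2 cursorToGLWindow(Vec2 cursor, Vec2 windowSize, Vec2 framebufferSize) {
    // Scale from virtual screen coordinates to pixels, then shift by half a
    // pixel so that a cursor on pixel (0,0) refers to that pixel's center.
    double px = cursor.x * framebufferSize.x / windowSize.x + 0.5;
    double py = cursor.y * framebufferSize.y / windowSize.y + 0.5;
    // Flip y: GLFW counts from the top edge, GL from the bottom edge.
    return Vec2{px, framebufferSize.y - py};
}

// Integer pixel index to pass to glReadPixels for a window coordinate.
int pixelIndex(double windowCoord) {
    return static_cast<int>(std::floor(windowCoord));
}
```

With a 1024x768 window whose framebuffer has the same size, a cursor position of `(0, 0)` maps to `(0.5, 767.5)`, so the depth would be read from pixel `(0, 767)`.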

> What model matrix should I multiply by the camera view matrix to form the ModelView matrix?

None. The basic coordinate transformation pipeline is

```
object space -> {MODEL} -> World Space -> {VIEW} -> Eye Space -> {PROJ} -> Clip Space -> {perspective divide} -> NDC -> {Viewport/DepthRange} -> Window Space
```

There is no model matrix influencing the way from world space to window space, hence inverting that path does not depend on any model matrix either.

> that as said in the glm docs (https://glm.g-truc.net/0.9.8/api/a00169.html#ga82a558de3ce42cbeed0f6ec292a4e1b3) the result is returned in object coordinates.

The math doesn't care about which spaces you transform between. The documentation mentions object space, and the function uses an argument named `modelView`, but the function works the same whatever matrix you put there; the matrix only determines which space the result ends up in. Putting just `view` there will be fine and gives you world coordinates.
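To make this concrete, here is a hand-rolled sketch of the unproject math, kept free of GLM so that it compiles on its own. It is not GLM's source: every name in it (`Mat4`, `Vec4`, `Viewport`, `mul`, `inverse`, `project`, `unproject`, and so on) is hypothetical, and the matrices are stored row-major, unlike GLM. It assumes the usual OpenGL conventions, with NDC in `[-1,1]` and depth in `[0,1]`.

```cpp
#include <array>
#include <cmath>
#include <utility>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<std::array<double, 4>, 4>;  // row-major: m[row][col]

struct Viewport { double x, y, w, h; };

Mat4 identity() {
    Mat4 m{};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0;
    return m;
}

Mat4 mul(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k) r[i][j] += a[i][k] * b[k][j];
    return r;
}

Vec4 mul(const Mat4& a, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int k = 0; k < 4; ++k) r[i] += a[i][k] * v[k];
    return r;
}

// Gauss-Jordan inverse with partial pivoting; assumes m is non-singular.
Mat4 inverse(Mat4 m) {
    Mat4 inv = identity();
    for (int c = 0; c < 4; ++c) {
        int p = c;
        for (int r = c + 1; r < 4; ++r)
            if (std::fabs(m[r][c]) > std::fabs(m[p][c])) p = r;
        std::swap(m[c], m[p]);
        std::swap(inv[c], inv[p]);
        const double d = m[c][c];
        for (int j = 0; j < 4; ++j) { m[c][j] /= d; inv[c][j] /= d; }
        for (int r = 0; r < 4; ++r) {
            if (r == c) continue;
            const double f = m[r][c];
            for (int j = 0; j < 4; ++j) {
                m[r][j] -= f * m[c][j];
                inv[r][j] -= f * inv[c][j];
            }
        }
    }
    return inv;
}

// OpenGL-style perspective projection (right-handed, NDC depth in [-1,1]).
Mat4 perspective(double fovyRad, double aspect, double zNear, double zFar) {
    const double f = 1.0 / std::tan(fovyRad / 2.0);
    Mat4 m{};
    m[0][0] = f / aspect;
    m[1][1] = f;
    m[2][2] = (zFar + zNear) / (zNear - zFar);
    m[2][3] = 2.0 * zFar * zNear / (zNear - zFar);
    m[3][2] = -1.0;
    return m;
}

Mat4 translate(double x, double y, double z) {
    Mat4 m = identity();
    m[0][3] = x; m[1][3] = y; m[2][3] = z;
    return m;
}

// Forward path: input space -> clip -> NDC -> window space.
Vec4 project(const Vec4& p, const Mat4& modelView, const Mat4& proj, const Viewport& vp) {
    const Vec4 clip = mul(proj, mul(modelView, p));
    const double nx = clip[0] / clip[3], ny = clip[1] / clip[3], nz = clip[2] / clip[3];
    return {vp.x + (nx * 0.5 + 0.5) * vp.w, vp.y + (ny * 0.5 + 0.5) * vp.h, nz * 0.5 + 0.5, 1.0};
}

// Inverse path. The result lives in whatever space modelView maps *from*:
// pass only the view matrix and you get world coordinates.
Vec4 unproject(const Vec4& win, const Mat4& modelView, const Mat4& proj, const Viewport& vp) {
    const Vec4 ndc = {(win[0] - vp.x) / vp.w * 2.0 - 1.0,
                      (win[1] - vp.y) / vp.h * 2.0 - 1.0,
                      win[2] * 2.0 - 1.0, 1.0};
    const Vec4 obj = mul(inverse(mul(proj, modelView)), ndc);
    const double w = obj[3];
    return {obj[0] / w, obj[1] / w, obj[2] / w, 1.0};
}
```

Calling `unproject(win, view, proj, vp)` with only a view matrix returns the world-space point directly, with no model matrix involved anywhere.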

> So in order to convert screen to world coordinates I should use a transform matrix from one mesh.

Well, you could even do that. You could use any model matrix of any object, as long as the matrix isn't singular, and as long as you use the same matrix for the unproject as you later use for going from object space to world space. You can even make up a random matrix, if you make sure it is regular. (Well, there might be numerical issues if the matrix is ill-conditioned.) The key thing here is that when you specify `V*M` and `P` as the matrices for `glm::unProject`, it will internally calculate `(V*M)^-1 * P^-1 * ndc_pos`, which is `M^-1 * V^-1 * P^-1 * ndc_pos`. If you transform the result back from object space to world space, you multiply that by `M` again, resulting in `M * M^-1 * V^-1 * P^-1 * ndc_pos`, which is of course just `V^-1 * P^-1 * ndc_pos`, which you would have gotten directly if you hadn't put `M` into the unproject in the first place. You just added more computational work and introduced more potential for numerical issues.
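Written out as a derivation, with $p_{\text{ndc}}$ as the homogeneous NDC position (the symbol names are chosen here only for illustration), the cancellation looks like this:

$$
\begin{aligned}
p_{\text{object}} &= (V M)^{-1} P^{-1}\, p_{\text{ndc}} = M^{-1} V^{-1} P^{-1}\, p_{\text{ndc}} \\
p_{\text{world}} &= M\, p_{\text{object}} = M M^{-1} V^{-1} P^{-1}\, p_{\text{ndc}} = V^{-1} P^{-1}\, p_{\text{ndc}}
\end{aligned}
$$

In both cases the homogeneous result still has to be divided by its $w$ component. That division does not affect the cancellation as long as $M$ is an affine transform (the usual case for model matrices), because such a matrix leaves $w$ unchanged.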
