encoding character-encoding scancodes low-level-io

How does a program actually receive character input? From scan code to the final raw input bits


So, my question is simple: how does a program receive the raw input bits after the user inputs a character key (from a physical keyboard or any other way)?

I mean, I know how character encoding works once the input has been received by the program as raw bits, but I'm not clear on how that bit sequence appears in the first place.

I've been reading a little, but this turned out to be a tough search for my Google-fu. It seems the OS receives a scan code from the input device (usually a keyboard), maps it to a character using character mappings and keyboard layouts, and then passes the resulting bit sequence to the program. Am I right? If so, the only missing parts for me are these:

  1. How do keyboard layouts define which character a scan code corresponds to? Using Unicode code points? An OS-specific internal table?

  2. Secondly, does a program define, at compile time, which character encoding it expects its input (from the OS) to be in? Does it at all?
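To make the pipeline in question 1 concrete, here is a minimal sketch of how a layout table could map hardware scan codes to Unicode code points, which are then handed to the program as encoded bytes (UTF-8 here). The scan-code values for 'A', 'S', 'D' follow PS/2 scan code set 1; the layout tables themselves are illustrative toys, not real OS keymaps.

```python
# Hypothetical layout tables: scan code -> Unicode code point.
# 0x1E, 0x1F, 0x20 are the PS/2 set-1 make codes for the physical
# keys labelled A, S, D on a US keyboard.
QWERTY_LAYOUT = {0x1E: ord('a'), 0x1F: ord('s'), 0x20: ord('d')}
AZERTY_LAYOUT = {0x1E: ord('q'), 0x1F: ord('s'), 0x20: ord('d')}  # same key, different character

def key_press_to_bytes(scan_code: int, layout: dict, shift: bool = False) -> bytes:
    """Map one key press to the bytes a program would eventually read."""
    code_point = layout[scan_code]          # layout decides the character
    char = chr(code_point).upper() if shift else chr(code_point)
    return char.encode("utf-8")             # encoding decides the raw bits

# The same physical key (scan code 0x1E) yields different characters
# depending on the active layout:
print(key_press_to_bytes(0x1E, QWERTY_LAYOUT))        # b'a'
print(key_press_to_bytes(0x1E, AZERTY_LAYOUT, True))  # b'Q'
```

The point of the sketch: the scan code identifies a physical key position, the layout table turns that position into a character, and only the final encoding step produces the bit sequence the program sees.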


Solution

  • There is a very good low-level description of how to read from a keyboard as part of an assembler course. It covers the whole pipeline from the keystroke to the CPU. You can find the whole course here and the chapter about keyboard input here.
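As a concrete end point of that pipeline: on Linux, a program with sufficient privileges can read raw key events from `/dev/input/eventN` as `struct input_event` records before any layout or encoding is applied. The sketch below decodes one such record from bytes; it assumes a 64-bit system (two `long`s for the timestamp, then `u16 type`, `u16 code`, `s32 value`) and uses a simulated buffer instead of a real device so it runs anywhere.

```python
import struct

# struct input_event layout on a 64-bit Linux system:
# long tv_sec, long tv_usec, __u16 type, __u16 code, __s32 value
EVENT_FORMAT = "llHHi"

# Constants from linux/input-event-codes.h:
EV_KEY = 0x01   # event type for key presses/releases
KEY_A = 30      # kernel keycode for the A key
PRESS = 1       # value 1 = key down, 0 = key up

def decode_event(raw: bytes) -> dict:
    """Decode one raw input_event record into its fields."""
    sec, usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
    return {"type": ev_type, "code": code, "value": value}

# Simulated record of an A-key press (no real device required):
raw = struct.pack(EVENT_FORMAT, 0, 0, EV_KEY, KEY_A, PRESS)
print(decode_event(raw))  # {'type': 1, 'code': 30, 'value': 1}
```

Note that what arrives here is still a keycode, not a character: translating it into text is done further up the stack (by the terminal layer or the windowing system's keymap), which is exactly the layout-and-encoding step the question asks about.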