unicode fonts

How does a Unicode character get mapped to a glyph in a font?


Each character in Unicode has a code point. What is the analogous term for a character in a font?

I have never understood the step where decoded text has to be mapped to a font (or to several fonts, with modern font substitution).

For example, suppose a text editor has decoded a file from its character encoding and the text contains the Greek letter alpha, α (U+03B1). How exactly does the app choose a particular glyph in a font? Most apps have a preferred font; let's say it's Courier. And what happens with a rarer Unicode character like the heart ♥ (U+2665) that isn't in the default font? How does the app know the font doesn't contain that character?

Does a font contain meta info about what symbols it has?

If 2 fonts both have the symbol alpha, do they necessarily share the same “code point”? Or is it dependent on the type of font such as Type1, Type3, TrueType, OpenType? ...

Thanks for any pointers or references.


Solution

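  • A "code point" is universal: it identifies an abstract character, or part of one, independently of any font. There are several ways to encode a code point into bytes, e.g., UTF-8, UTF-16, or (for code points in the Basic Multilingual Plane) UCS-2. The font side is separate. The analogous term in a font is a glyph, identified by a glyph index that is specific to that font, and each font maps code points to glyphs differently. For example, one font might represent Α (\u0391, Greek capital alpha) and A (\u0041, Latin capital A) with the same glyph while another may not.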

    TrueType fonts consist of many tables. The two that matter most for this question are the table of shapes the font can draw (glyf, the glyphs) and the table that maps code points to those glyphs (cmap).

    The operating system uses the cmap table to convert characters into glyph indexes, substituting a default glyph, usually a box like □, for any that have no matching entry.

    Unfortunately there are multiple ways to construct a font file and different encodings of the same mappings in those tables, so the actual process of doing the mapping, and doing it efficiently so that text drawing is fast, ends up being extremely complex, and not easily explained in detail.

    Apple's developer documentation has a pretty good section on the details of TrueType fonts:

    TrueType Reference Manual

    Specifically:

    Glyph table

    Character map

    I also recommend a Windows application called BabelMap, which gives you a lot of interesting information about fonts. Specifically look at Tools/Unicode Summary, Fonts/Font Analysis Utility, and Fonts/Font Information, where you can extract the entire glyph mapping table to the clipboard.
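To make the lookup described above concrete, here is a minimal Python sketch of the idea. It is a toy model, not how any particular operating system implements it: the font names (ToyMono, ToySymbols), their glyph indexes, and the function map_to_glyphs are all invented for illustration, and the fallback order is simply assumed. The one detail taken from the TrueType format is that glyph index 0 is the "missing glyph".

```python
# Toy model of cmap lookup with font fallback.
# The fonts and glyph indexes are invented; a real font stores this mapping
# in its binary 'cmap' table.

NOTDEF = 0  # glyph index 0 is the "missing glyph" in TrueType fonts

# Hypothetical fonts: each maps Unicode code points to its OWN glyph indexes.
FONTS = {
    # Latin A (U+0041) and Greek Alpha (U+0391) share one glyph in this font.
    "ToyMono": {0x0041: 36, 0x0391: 36, 0x03B1: 112},
    "ToySymbols": {0x2665: 7},
}


def map_to_glyphs(text, preferred, fallbacks=()):
    """Return one (font name, glyph index) pair per code point in text."""
    result = []
    for ch in text:
        code_point = ord(ch)
        # Try the preferred font first, then each fallback font in order.
        for name in (preferred, *fallbacks):
            glyph = FONTS[name].get(code_point)
            if glyph is not None:
                result.append((name, glyph))
                break
        else:
            # No font has an entry: draw the preferred font's missing glyph.
            result.append((preferred, NOTDEF))
    return result


# Step 1: decode bytes into code points (here, UTF-8 bytes for alpha + heart).
raw = b"\xce\xb1\xe2\x99\xa5"
text = raw.decode("utf-8")

# Step 2: map each code point to a glyph, font by font.
glyphs = map_to_glyphs(text, "ToyMono", fallbacks=("ToySymbols",))
print(glyphs)
```

In this sketch the app "knows" a font lacks a character simply because the font's mapping has no entry for that code point. To inspect a real font rather than a toy dictionary, a library such as fontTools can read the actual table; its `TTFont.getBestCmap()` returns a dictionary from code points to glyph names.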