I want to render Unicode characters in an application, and I have a rough idea of how to do that for standard Latin characters with FreeType. However, for other languages that have different layouts and shaping, I'm not sure how to go from a set of characters in a UTF-8 encoded string to:
1. Picking a suitable font to display the characters
2. Picking the right layout for the characters (LTR, RTL, TTB)
Is this data contained in the Unicode characters themselves (I'm not sure how else applications like web browsers would figure out how to render text)?
For a given Unicode character, how can I determine points 1 and 2? FreeType has some great documentation and talks quite a bit about using different layouts, but I didn't see how you would go about extracting that information from the characters themselves.
I also took a quick look at HarfBuzz but couldn't really find any documentation. There's an example floating around that shows how to set up and use HarfBuzz to lay out some languages, with FreeType rendering the glyphs, but the example explicitly passes layout, font and language information to HarfBuzz.
What do you do when you don't know those things in advance?
This is for a mobile application, and ideally the libs/solutions used would have a permissive license.
The Unicode code point only encodes the character
itself; it gives no information about the font to use,
the layout, or anything else. To get information
concerning layout, etc., Unicode provides a number of data files,
such as UnicodeData.txt, which you can download and use. As
for the fonts, each font should provide descriptor files of some
sort, with things like the width, height and depth of each
character; these files can also be used to determine which
characters the font supports.
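
As a rough illustration of both points, here is a minimal Python sketch. It assumes Python's standard `unicodedata` module (which is built from the Unicode Character Database, including UnicodeData.txt) as the source of the bidirectional class, and it guesses the base direction with a simplified "first strong character" heuristic rather than the full Unicode Bidirectional Algorithm. The `coverage` table, the font names in it and the `pick_font` helper are hypothetical; in a real application the set of supported code points would come from each font itself (with FreeType, for example, `FT_Get_Char_Index` returns 0 for a character the font has no glyph for).

```python
import unicodedata


def base_direction(text, default="LTR"):
    """Guess the base direction from the first strongly directional character.

    Simplified heuristic: bidi class "L" means left-to-right, "R" and "AL"
    mean right-to-left. Digits, spaces and punctuation are skipped.
    """
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            return "LTR"
        if bidi in ("R", "AL"):
            return "RTL"
    return default


def pick_font(text, coverage):
    """Return the first font whose code point set covers the whole text.

    `coverage` maps a font name to the set of code points it supports.
    Whitespace is ignored; None is returned if no single font is enough.
    """
    needed = {ord(ch) for ch in text if not ch.isspace()}
    for name, codepoints in coverage.items():
        if needed <= codepoints:
            return name
    return None


# Hypothetical coverage table, for illustration only.
coverage = {
    "LatinFont": set(range(0x20, 0x7F)),       # printable ASCII
    "HebrewFont": set(range(0x05D0, 0x05EB)),  # Hebrew letters alef..tav
}

sample = "\u05e9\u05dc\u05d5\u05dd"  # the Hebrew word "shalom"
print(base_direction(sample), pick_font(sample, coverage))
```

Text that mixes scripts would normally be split into runs first, with a direction and a font chosen per run; the sketch above treats the whole string as a single run.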