java fonts awt jfreechart fontmetrics

Why does a Latin-characters-only Java font claim to support Asian characters, even though it does not?


When rendering a chart with JFreeChart, I noticed a layout problem when the chart's category labels included Japanese characters. Although the text was rendered with the correct glyphs, it was positioned in the wrong location, presumably because the font metrics were wrong.

The chart was originally configured to use the Source Sans Pro Regular font for that text, which supports only Latin character sets. The obvious solution is to bundle an actual Japanese .TTF font and ask JFreeChart to use it. This works fine, in that the output text uses the correct glyphs and it is also laid out correctly.

My questions

Edited to clarify:

Details

I understand that java.awt.font.TextLayout is smart, and that when trying to lay out text, it first asks the underlying fonts whether they can actually render the supplied characters. If not, it presumably swaps in a different font that knows how to render those characters. But that is not happening here, based on my debugging pretty far into the JRE classes: TextLayout#singleFont always returns a non-null value for the font, and the constructor proceeds through its fastInit() path.

One very curious note is that the Source Sans Pro font somehow gets coerced into telling the caller that it does know how to render Japanese characters after the JRE performs a transformation on the font.

For example:

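import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.io.File;
import java.io.FileInputStream;
import java.text.AttributedCharacterIterator;
import java.text.AttributedString;
import java.util.Map;

// We load our font here (download from the first link above in the question)
File fontFile = new File("/tmp/source-sans-pro.regular.ttf");
Font font = Font.createFont(Font.TRUETYPE_FONT, new FileInputStream(fontFile));
GraphicsEnvironment.getLocalGraphicsEnvironment().registerFont(font);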

// Here is some Japanese text that we want to display
String str = "クローズ";

// Should say that the font cannot display any of these characters (return code = 0)

System.out.println("Font " + font.getName() + " can display up to: " + font.canDisplayUpTo(str));

// But after doing this magic manipulation, the font claims that it can display the
// entire string (return code = -1)

AttributedString as = new AttributedString(str, font.getAttributes());
Map<AttributedCharacterIterator.Attribute,Object> attributes = as.getIterator().getAttributes();
Font newFont = Font.getFont(attributes);

// Eeek, -1!    
System.out.println("Font " + newFont.getName() + " can display up to: " + newFont.canDisplayUpTo(str));

The output of this is:

Font Source Sans Pro can display up to: 0
Font Source Sans Pro can display up to: -1

Note that the three lines of "magic manipulation" above are not my own doing. We pass the true source font object to JFreeChart, but the JRE munges it when drawing the glyphs, and those three lines replicate that munging. They are the functional equivalent of what happens in the following sequence of calls:

  1. org.jfree.text.TextUtilities#drawRotatedString
  2. sun.java2d.SunGraphics2D#drawString
  3. java.awt.font.TextLayout#(constructor)
  4. java.awt.font.TextLayout#singleFont

When we call Font.getFont() in the last line of the "magic" manipulation, we still get a Source Sans Pro font back, but its underlying font2D field is different from the original font's, and this single font now claims that it knows how to render the entire string. Why? It appears that Java is giving us back some sort of "frankenfont" that knows how to render all kinds of glyphs, even though it only understands the metrics for the glyphs that are supplied in the underlying source font.

A more complete example showing the JFreeChart rendering example is here, based off one of the JFreeChart examples: https://gist.github.com/sdudley/b710fd384e495e7f1439 The output from this example is shown below.

[Image: example with the Source Sans Pro font (laid out incorrectly)]

[Image: example with the IPA Japanese font (laid out correctly)]


Solution

  • I finally figured it out. There were a number of underlying causes, and the investigation was further hindered by an added dose of cross-platform variability.

    JFreeChart Renders Text in the Wrong Location Because It Uses a Different Font Object

    The layout problem occurred because JFreeChart was inadvertently calculating the metrics for the layout using a different Font object than the one AWT actually uses to render the text. (For reference, JFreeChart's calculation happens in org.jfree.text.TextUtilities#getTextBounds.)

    The different Font object is a result of the implicit "magic manipulation" mentioned in the question, which is performed inside java.awt.font.TextLayout#singleFont.

    Those three lines of magic manipulation can be condensed to just this:

    font = Font.getFont(font.getAttributes());
    

    In English, this asks the font manager to give us a new Font object based on the "attributes" (name, family, point size, etc) of the supplied font. Under certain circumstances, the Font it gives back to you will be different from the Font you originally started with.

    To correct the metrics (and thus fix the layout), run the one-liner above on your own Font object before setting the font on any JFreeChart objects.

    After doing this, the layout worked fine for me, as did the Japanese characters. It should fix the layout for you too, although it may not show the Japanese characters correctly for you. Read below about native fonts to understand why.
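A minimal sketch of that fix, assuming a hypothetical helper class named FontNormalizer (not part of JFreeChart or the JRE). It only wraps the one-liner so that the same resolved Font can be handed to every chart object:

```java
import java.awt.Font;

// Hypothetical helper: resolves a Font through its own attributes, the same
// way the one-liner above does, so metrics and rendering use one Font object.
final class FontNormalizer {

    private FontNormalizer() {
    }

    /** Returns the Font that Font.getFont() resolves from the given font's attributes. */
    static Font normalize(Font font) {
        return Font.getFont(font.getAttributes());
    }

    public static void main(String[] args) {
        Font original = new Font("Dialog", Font.PLAIN, 12);
        Font normalized = normalize(original);
        // getName() and getSize() only echo the requested attributes.
        System.out.println(normalized.getName() + " " + normalized.getSize());
    }
}
```

The normalized font would then be the one passed to the chart, for example via a category axis's setTickLabelFont(normalized). Which JFreeChart setters apply depends on where the Japanese text appears in your chart.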

    The Mac OS X Font Manager Prefers to Return Native Fonts Even If You Feed it a Physical TTF File

    The layout of the text was fixed by the above change...but why does this happen? Under what circumstances would the FontManager actually give us back a different type of Font object than the one we provided?

    There are many reasons, but at least on Mac OS X, the one relevant to this problem is that the font manager seems to prefer to return native fonts whenever possible.

    In other words, suppose you create a new font from a physical TTF font named "Foobar" using Font.createFont(), and then call Font.getFont() with attributes derived from your "Foobar" physical font. So long as OS X already has a Foobar font installed, the font manager will give you back a CFont object rather than the TrueTypeFont object you were expecting. This seems to hold true even if you register the font through GraphicsEnvironment.getLocalGraphicsEnvironment().registerFont().

    In my case, this threw a red herring into the investigation: I already had the "Source Sans" font installed on my Mac, which meant that I was getting different results from people who did not.

    Mac OS X Native Fonts Always Support Asian Characters

    The crux of the matter is that Mac OS X CFont objects always support Asian character sets. I am unclear on the exact mechanism that allows this, but I suspect that it's some sort of fallback font feature of OS X itself and not of Java. In either case, a CFont always claims to (and is truly able to) render Asian characters with the correct glyphs.

    This makes clear the mechanism that allowed the original problem to occur:

    You Will Get Different Results Depending on Whether You Registered the Font and Whether You Have The Font Installed in Your OS

    If you call Font.getFont() with the attributes from a created TTF font, you will get one of three different results, depending on whether the font is registered and whether you have the same font installed natively:

    In hindsight, none of this is entirely surprising. Leading to:

    I Was Inadvertently Using the Wrong Font

    In the production app, I was creating a font, but I initially forgot to register it with the GraphicsEnvironment. If you haven't registered a font when you perform the magic manipulation above, Font.getFont() doesn't know how to retrieve it and you get a backup font instead. Oops.
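To avoid repeating that mistake, creation and registration can be kept in one place. The following is a sketch only, and FontLoader is a hypothetical name. It relies on registerFont() returning a boolean. The Javadoc lists a name conflict with an existing system or logical font as one reason for a false result, so the sketch reports that case instead of ignoring it:

```java
import java.awt.Font;
import java.awt.FontFormatException;
import java.awt.GraphicsEnvironment;
import java.io.File;
import java.io.IOException;

// Hypothetical helper: creates a font from a TTF file and registers it
// immediately, so the registration step cannot be forgotten.
final class FontLoader {

    private FontLoader() {
    }

    static Font loadAndRegister(File ttfFile) throws IOException, FontFormatException {
        Font font = Font.createFont(Font.TRUETYPE_FONT, ttfFile);
        boolean registered =
                GraphicsEnvironment.getLocalGraphicsEnvironment().registerFont(font);
        if (!registered) {
            // Not fatal, but later lookups by attributes may resolve to a
            // different font than the one loaded from the file.
            System.err.println("Warning: font was not registered: " + font.getFontName());
        }
        return font;
    }
}
```

Usage would look like `Font font = FontLoader.loadAndRegister(new File("/tmp/source-sans-pro.regular.ttf"));`, using the same path as in the question.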

    On Windows, Mac and Linux, this backup font generally seems to be Dialog, which is a logical (composite) font that supports Asian characters. At least in Java 7u72, the Dialog font defaults to the following fonts for Western alphabets:

    This mistake was actually a good thing for our Asian users, because it meant that their character sets rendered as expected with the logical font...although the Western users were not getting the character sets that we wanted.

    Since it had been rendering in the wrong fonts and we needed to fix the Japanese layout anyway, I decided that I would be better off trying to standardize on one single common font for future releases (and thus come closer to trashgod's suggestions).

    Additionally, the app has font rendering quality requirements that may not always permit the use of certain fonts, so a reasonable decision seemed to be to try to configure the app to use Lucida Sans, which is the one physical font that is included by Oracle in all copies of Java. But...

    Lucida Sans Doesn't Play Well with Asian Characters on All Platforms

    The decision to try using Lucida Sans seemed reasonable...but I quickly found out that there are platform differences in how Lucida Sans is handled. On Linux and Windows, if you ask for a copy of the "Lucida Sans" font, you get a physical TrueTypeFont object. But that font doesn't support Asian characters.

    The same problem holds true on Mac OS X if you request "Lucida Sans"...but if you ask for the slightly different name "LucidaSans" (note the lack of space), then you get a CFont object that supports Lucida Sans as well as Asian characters, so you can have your cake and eat it too.

    On other platforms, requesting "LucidaSans" yields a copy of the standard Dialog font because there is no such font and Java is returning its default. On Linux, you are somewhat lucky here because Dialog actually defaults to Lucida Sans for Western text (and it also uses a decent fallback font for Asian characters).

    This gives us a path to get (almost) the same physical font on all platforms, and which also supports Asian characters, by requesting fonts with these names:

    I've pored over fontconfig.properties on Windows and I could not find a font sequence that defaulted to Lucida Sans, so it looks like our Windows users will be stuck with Arial...but at least it's not that visually different from Lucida Sans, and the Windows font rendering quality is reasonable.
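The per-platform choice could be sketched as follows. This is an assumption-laden illustration, not the exact configuration used: it takes the description above at face value, requesting "LucidaSans" on Mac OS X and the logical "Dialog" font everywhere else. PlatformFontNames is a hypothetical class name.

```java
import java.util.Locale;

// Hypothetical helper: picks the font name to request on each platform,
// assuming the platform behavior described in the text above.
final class PlatformFontNames {

    private PlatformFontNames() {
    }

    /** Maps an os.name value to the font name to request. */
    static String fontNameFor(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.contains("mac")) {
            // "LucidaSans" (no space) is described as yielding a native CFont on OS X.
            return "LucidaSans";
        }
        // Elsewhere, request the logical Dialog font and accept its platform mapping.
        return "Dialog";
    }

    public static void main(String[] args) {
        System.out.println(fontNameFor(System.getProperty("os.name")));
    }
}
```

The returned name would then be passed to `new Font(name, Font.PLAIN, size)`. What Dialog maps to on a given machine still depends on that JRE's font configuration.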

    Where Did Everything End Up?

    In sum, we're now pretty much just using platform fonts. (I am sure that @trashgod is having a good chuckle right now!) Both Mac and Linux servers get Lucida Sans, Windows gets Arial, the rendering quality is good, and everyone is happy!