java javafx webview lang

The problem with Georgian language in javafx.scene.web.WebView


I have a problem displaying Georgian text in javafx.scene.web.WebView: the characters are rendered as squares. Here is example code:

    public void start(Stage stage) throws IOException {
        FXMLLoader fxmlLoader = new FXMLLoader(HelloApplication.class.getResource("hello-view.fxml"));
        Scene scene = new Scene(fxmlLoader.load(), 320, 240);
        WebView webView = (WebView) scene.lookup("#web");
        WebEngine engine = webView.getEngine();
        String html = """
            გამარჯობა, ეს არის ტესტი.
            """;
        engine.loadContent(html);


        stage.setTitle("Hello!");
        stage.setScene(scene);
        stage.show();
    }

and the result:

Screenshot showing the Georgian text rendered as squares.

I use this dependency:

    <dependency>
        <groupId>org.openjfx</groupId>
        <artifactId>javafx-web</artifactId>
        <version>23.0.2</version>
    </dependency>

Solution

  • Reproducing Problem

    Skip to the end of the answer if you want to see a solution/workaround to the problem.

    I can reproduce the problem using Java 24.0.1 and JavaFX 24.0.1 on Windows 10 with the following code:

    package com.example;
    
    import javafx.application.Application;
    import javafx.scene.Scene;
    import javafx.scene.web.WebView;
    import javafx.stage.Stage;
    
    public class Main extends Application {
    
      private static final String HTML_PAGE =
          """
              <!DOCTYPE html>
              
              <html>
                <head>
                  <title>Unicode Rendering Test</title>
                  <meta charset="UTF-8"/>
                </head>
                <body>
                  <h1>Welcome</h1>
                  <p>გამარჯობა, ეს არის ტესტი.</p>
                </body>
              </html>
              """;
    
      @Override
      public void start(Stage primaryStage) {
        var webView = new WebView();
        webView.getEngine().loadContent(HTML_PAGE);
    
        primaryStage.setScene(new Scene(webView, 500, 300));
        primaryStage.show();
      }
    
      public static void main(String[] args) {
        launch(Main.class);
      }
    }
    

    Using a well-formed HTML document and specifying the encoding helps isolate the problem. It now seems pretty clear that JavaFX's WebView is failing to use a font that can display Georgian characters.
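
    To rule out the encoding side in code as well, one option is to check that the Java string really contains Georgian code points before it is handed to the WebView. The following is only a minimal sketch: the class name ScriptCheck and the method containsGeorgian are hypothetical names chosen for illustration, and the check relies solely on Character.UnicodeScript from the JDK. The Georgian letters are written as Unicode escapes so the result does not depend on the source file encoding.

```java
public final class ScriptCheck {

  private ScriptCheck() {}

  /** Returns true if at least one code point in the text belongs to the Georgian script. */
  public static boolean containsGeorgian(String text) {
    return text.codePoints()
        .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.GEORGIAN);
  }

  public static void main(String[] args) {
    // U+10D2, U+10D0, U+10DB: the first three letters of the greeting in the test page.
    String georgian = "\u10D2\u10D0\u10DB";

    System.out.println(containsGeorgian(georgian));
    System.out.println(containsGeorgian("Welcome"));
  }
}
```

    If a check like this reports Georgian code points and the WebView still shows squares, the text itself is intact and the missing piece is a font with the right glyphs.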

    And here's the Maven POM I used to run the example:

    <?xml version="1.0" encoding="UTF-8"?>
    
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
    
        <groupId>com.example</groupId>
        <artifactId>example</artifactId>
        <version>0.1.0</version>
    
        <properties>
            <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
            <maven.compiler.release>24</maven.compiler.release>
            <javafx.version>24.0.1</javafx.version>
        </properties>
    
        <build>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <version>3.14.0</version>
                </plugin>
    
                <plugin>
                    <groupId>org.openjfx</groupId>
                    <artifactId>javafx-maven-plugin</artifactId>
                    <version>0.0.8</version>
                    <configuration>
                        <mainClass>com.example.Main</mainClass>
                        <options>
                            <option>--enable-native-access=javafx.graphics,javafx.media,javafx.web</option>
                            <option>--sun-misc-unsafe-memory-access=allow</option>
                        </options>
                    </configuration>
                </plugin>
            </plugins>
        </build>
    
        <dependencies>
            <dependency>
                <groupId>org.openjfx</groupId>
                <artifactId>javafx-web</artifactId>
                <version>${javafx.version}</version>
            </dependency>
        </dependencies>
    </project>
    

    Output of mvn javafx:run:

    Screenshot of application demonstrating problem.

    Other Browsers

    Loading the same HTML in other browsers on Windows 10 displays the Georgian characters correctly.

    The answer by Basil Bourque shows it working on macOS with Safari. This is particularly interesting because Safari's web engine is WebKit, which is the same web engine used by JavaFX (though I don't know whether the two implementations are exactly the same).

    A Bug

    As noted by Basil Bourque and VGR in the comments (and in Basil's answer), this seems to be a bug in JavaFX's WebView. At the very least, it is a questionable implementation choice by JavaFX.

    Not a “simple font issue”. Selection of a fallback font for those characters should be automatic. HTML content is not responsible for finding fonts to render the contained characters. Adding a font name within your HTML content is a workaround for a bug.

    The HTML/CSS author can guide the runtime through a list of preferred fonts (a “font stack”) that may or may not exist on the runtime machine. If none of those suggested fonts provide a glyph for a particular character, the web browser and host OS should find some other font with the needed glyphs. Only if there is no font at all available at runtime offering those glyphs should we see boxes/questions/blanks.

    – Basil Bourque

    It’s pretty clear that this has nothing to do with encodings. Basil is mostly correct in that WebKit is not doing its due diligence; its default stylesheet should search all available fonts in order to maximize Unicode coverage. I don’t think the HTML or CSS standards require doing that, so I don’t know if it’s a bug as much as a poor renderer design. Specifying font-family is a workaround which should not be necessary.

    – VGR

    There's also this old bug: JDK-8087702 – Georgian font is not rendered. That bug report says it occurs on Windows 8 and was created back in 2014, which means this has been a problem since JavaFX 8.

    I don't know if the bug is in JavaFX itself or the underlying WebKit engine.

    Platform specific

    This erroneous behavior would appear to be platform-specific. The Georgian characters are not displayed correctly by JavaFX's WebView on Windows, but Basil's answer shows them being displayed correctly on macOS using essentially the same code.

    Font Fallback

    As I understand it, the web engine should automatically pick a font that can display the characters on a web page. And this should happen for each individual character. At least, this seems to be the case when using font-family. From the documentation at https://developer.mozilla.org/en-US/docs/Web/CSS/font-family :

    Values are separated by commas to indicate that they are alternatives. The browser will select the first font in the list that is installed or that can be downloaded using a @font-face at-rule.

    [...]

    You should always include at least one generic family name in a font-family list, since there's no guarantee that any given font is available. This lets the browser select an acceptable fallback font when necessary.

    The font-family property specifies a list of fonts, from highest priority to lowest. Font selection does not stop at the first font in the list that is on the user's system. Rather, font selection is done one character at a time, so that if an available font does not have a glyph for a needed character, the latter fonts are tried. When a font is only available in some styles, variants, or sizes, those properties may also influence which font family is chosen.

    But I could not find much information (which doesn't mean it doesn't exist) about what happens if no font-family is defined by the webpage. From what I could gather, either web browsers should have a default stylesheet that defines font-family or web browsers will typically fall back to a user-defined or system-wide default font (whatever font that may be). Maybe it's a combination of the two. Either way, it may be that the default font chosen by JavaFX's WebView does not have glyphs for Georgian characters.
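
    The per-character selection described in the quoted documentation can be modelled in a few lines. This is a toy sketch only, not how WebKit or JavaFX actually implement font selection: the ToyFont record, the two sample fonts, and the "(missing glyph)" marker are all hypothetical and exist purely to illustrate the idea of walking the font stack for each character.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;

public final class FontFallbackModel {

  private FontFallbackModel() {}

  /** A toy font: a name plus a test for which code points it has glyphs for. */
  public record ToyFont(String name, IntPredicate hasGlyph) {}

  /** Hypothetical font that only covers basic Latin and nearby blocks. */
  public static final ToyFont LATIN_ONLY = new ToyFont("LatinOnly", cp -> cp < 0x250);

  /** Hypothetical font that covers the Georgian script. */
  public static final ToyFont GEORGIAN_CAPABLE = new ToyFont(
      "GeorgianCapable",
      cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.GEORGIAN);

  /**
   * For each code point, picks the first font in the stack that has a glyph for it.
   * Code points covered by no font map to "(missing glyph)", which is where a
   * renderer would typically draw a square.
   */
  public static Map<String, String> resolve(String text, List<ToyFont> fontStack) {
    Map<String, String> result = new LinkedHashMap<>();
    text.codePoints().forEach(cp -> {
      String chosen = fontStack.stream()
          .filter(font -> font.hasGlyph().test(cp))
          .map(ToyFont::name)
          .findFirst()
          .orElse("(missing glyph)");
      result.put(Character.toString(cp), chosen);
    });
    return result;
  }

  public static void main(String[] args) {
    String text = "a\u10D2"; // Latin "a" followed by the Georgian letter U+10D2

    // Only a Latin font in the stack: the Georgian letter has no glyph.
    System.out.println(resolve(text, List.of(LATIN_ONLY)));

    // With a Georgian-capable font later in the stack, fallback finds it.
    System.out.println(resolve(text, List.of(LATIN_ONLY, GEORGIAN_CAPABLE)));
  }
}
```

    In this model, the squares seen in the WebView correspond to the first case: every font that was actually consulted lacks the glyph, and nothing further is tried.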

    However, as noted by Basil, modern operating systems like Windows, macOS, and Linux should have their own font fallback mechanism. For Windows, this would seem to be provided by DirectWrite.

    The DirectWrite font system provides services for dealing with font enumeration, font fallback [emphasis added], and font caching, which are all needed by applications for handling fonts.

    JavaFX's WebView should be hooking into this mechanism when the webpage does not apply a specific font, or when it specifies only a generic font family. Note that even if you have:

    font-family: sans-serif;
    

    which names only a generic font family, JavaFX's WebView still fails to pick an appropriate font to display each character.

    There seem to be a few bugs and enhancement requests about font fallback in JavaFX that may be related to the problem with WebView. Those reports seem to be more about javafx.scene.text.Font, but the issues may be bleeding over into WebView.


    Solution (Workaround)

    One solution is to explicitly define a font that can display Georgian characters. This was suggested in a comment by VGR.

    Does it work if you change the HTML string to this? <span style="font-family: 'Segoe UI';">გამარჯობა, ეს არის ტესტი.</span>
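
    If the text comes from elsewhere as a plain string, as in the question, the inline variant from the comment could be applied with a small helper. This is a sketch under assumptions: FontWrap and withFont are hypothetical names, the font family is assumed to be a trusted value chosen by the developer, and 'Segoe UI' is only expected to exist on Windows.

```java
public final class FontWrap {

  private FontWrap() {}

  /** Escapes the characters that are significant in HTML text content. */
  static String escapeHtml(String text) {
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
  }

  /**
   * Wraps plain text in a span that requests the given font family,
   * keeping sans-serif as a generic fallback at the end of the list.
   * The font family is assumed to be a trusted name and is not escaped.
   */
  public static String withFont(String text, String fontFamily) {
    return "<span style=\"font-family: '" + fontFamily + "', sans-serif;\">"
        + escapeHtml(text)
        + "</span>";
  }

  public static void main(String[] args) {
    // U+10D2, U+10D0, U+10DB: the first three letters of the greeting in the question.
    String html = withFont("\u10D2\u10D0\u10DB", "Segoe UI");
    System.out.println(html);

    // In the JavaFX application the result would be passed on like this:
    //   engine.loadContent(html);
  }
}
```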

    You can implement this inline like in VGR's comment, in an embedded stylesheet, or in an external stylesheet. Here's an example using an embedded stylesheet (note the only change from code shown earlier is the <style> element in the HTML document string):

    package com.example;
    
    import javafx.application.Application;
    import javafx.scene.Scene;
    import javafx.scene.web.WebView;
    import javafx.stage.Stage;
    
    public class Main extends Application {
    
      private static final String HTML_PAGE =
          """
              <!DOCTYPE html>
              
              <html>
                <head>
                  <title>Unicode Rendering Test</title>
                  <meta charset="UTF-8"/>
              
                  <style>
                  body {
                    font-family: 'Segoe UI', sans-serif;
                  }
                  </style>
              
                </head>
                <body>
                  <h1>Welcome</h1>
                  <p>გამარჯობა, ეს არის ტესტი.</p>
                </body>
              </html>
              """;
    
      @Override
      public void start(Stage primaryStage) {
        var webView = new WebView();
        webView.getEngine().loadContent(HTML_PAGE);
    
        primaryStage.setScene(new Scene(webView, 500, 300));
        primaryStage.show();
      }
    
      public static void main(String[] args) {
        launch(Main.class);
      }
    }
    

    The Maven POM file used to run the above is unchanged from the one shown earlier.

    Output of mvn javafx:run:

    Screenshot of application running which shows the Georgian text being rendered.

    Note that you can also define the font-family in an external stylesheet and set its location on the WebEngine.userStyleSheetLocation property. This applies the fix to all webpages loaded by that WebEngine. Keep in mind, though, that if the webpage itself defines a font-family then that value will override the user stylesheet.

    It may be that WebEngine.userStyleSheetLocation is intended to play the role of the default stylesheet that each browser should have, which would make it the responsibility of the application developer to supply one. However, the fact that defining the font family as something like sans-serif doesn't work is (likely) still a bug.
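
    As a rough illustration of the user stylesheet approach, the CSS can be packed into a data: URL so that no separate file is needed. This is a sketch, not a verified recipe: UserStyleSheets and toDataUrl are hypothetical names, and it assumes that WebEngine.setUserStyleSheetLocation accepts a base64-encoded data: URL of this form (the property is documented as taking a local URL such as data:, file:, or jar:). The JavaFX call itself is shown only as a comment so the helper stays runnable without JavaFX on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public final class UserStyleSheets {

  private UserStyleSheets() {}

  /** Encodes a CSS string as a base64 data: URL with a text/css media type. */
  public static String toDataUrl(String css) {
    String encoded = Base64.getEncoder()
        .encodeToString(css.getBytes(StandardCharsets.UTF_8));
    return "data:text/css;charset=utf-8;base64," + encoded;
  }

  public static void main(String[] args) {
    // Same rule as the embedded stylesheet in the workaround above.
    String css = "body { font-family: 'Segoe UI', sans-serif; }";
    System.out.println(toDataUrl(css));

    // In the JavaFX application this would be applied once per engine:
    //   webView.getEngine().setUserStyleSheetLocation(UserStyleSheets.toDataUrl(css));
  }
}
```

    A file: URL pointing at a stylesheet on disk would serve the same purpose; the data: URL just keeps the workaround in one place.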