I'm having an encoding issue with JavaFX's WebView. When loading a UTF-8 encoded file, special characters are displayed incorrectly (e.g. â€™ is displayed instead of ’). Here's an SSCCE:
WebViewTest.java
import javafx.application.Application;
import javafx.scene.Scene;
import javafx.scene.web.WebView;
import javafx.stage.Stage;

public class WebViewTest extends Application {
    public static void main(String[] args) {
        Application.launch(args);
    }

    @Override
    public void start(Stage stage) {
        WebView webView = new WebView();
        webView.getEngine().load(getClass().getResource("/test.html").toExternalForm());
        Scene scene = new Scene(webView, 500, 500);
        stage.setScene(scene);
        stage.setTitle("WebView Test");
        stage.show();
    }
}
test.html
<!DOCTYPE html>
<html>
<body>
<p>RIGHT SINGLE QUOTATION MARK: ’</p>
</body>
</html>
Output of file -bi test.html
src:$ file -bi test.html
text/plain; charset=utf-8
The same thing happens on Windows with Java 17 and the latest JavaFX (I used Linux and Java 8 for this demonstration).
I've tried:
Declaring the charset in the HTML: <meta charset="UTF-8"> (works, but I'm making an editor program, so I don't have control over the HTML)
Using the JVM argument -Dfile.encoding=UTF-8 (doesn't work)
Setting the charset using reflection (doesn't work, and throws an exception in newer Java versions):
System.setProperty("file.encoding","UTF-8");
Field charset = Charset.class.getDeclaredField("defaultCharset");
charset.setAccessible(true);
charset.set(null,null);
Declaring the charset after the page loads using the DOM API (doesn't work):
webView.getEngine().getLoadWorker().stateProperty().addListener((o, oldState, newState) -> {
    if (newState == Worker.State.SUCCEEDED) {
        Document document = webView.getEngine().getDocument();
        Element meta = document.createElement("meta");
        meta.setAttribute("charset", "UTF-8");
        document.getElementsByTagName("html").item(0).appendChild(meta);
    }
});
Using WebEngine.loadContent(String) instead of load(String) (wouldn't work; relative links would be broken)
It appears that WebView ignores file encodings and uses ISO-8859-1 unless a charset is specified in the HTML.
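The garbling is what one would expect when UTF-8 bytes are decoded with a single-byte Western charset. The following is a minimal sketch of that effect, not a description of WebView's internals. It assumes windows-1252 as the fallback, since browser engines commonly treat the ISO-8859-1 label as windows-1252. The class name MojibakeDemo and the method misdecode are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original program.
class MojibakeDemo {

    // Encodes the text as UTF-8, then decodes those bytes with the wrong charset.
    static String misdecode(String text, Charset wrongCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, wrongCharset);
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 stands in for the engine's fallback encoding.
        Charset fallback = Charset.forName("windows-1252");
        String garbled = misdecode("\u2019", fallback);
        // Print code points rather than glyphs so the console encoding does not matter.
        garbled.codePoints().forEach(cp -> System.out.printf("U+%04X%n", cp));
    }
}
```

The single character U+2019 occupies three bytes in UTF-8 (E2 80 99), so it comes back as three characters: U+00E2, U+20AC and U+2122, which render as â€™.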
WebView determines the encoding from either the HTML file or the HTTP header, as per the W3C specification.
As you already noted in your question, you can declare the character encoding in the head element within the HTML document and the WebView will pick it up:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
...
But you also note in your question that you don't have control over the input HTML files or whether they include the necessary meta element for declaring the charset.
Alternatively, you can serve the file over HTTP and have the response specify its encoding with an appropriate header:
Content-Type: text/html; charset=UTF-8
If you do that, the WebView will decode the HTML file content as UTF-8, even if the file itself does not include a charset declaration.
Here is an example:
import com.sun.net.httpserver.*;
import javafx.application.Application;
import javafx.scene.Scene;
import javafx.scene.web.WebView;
import javafx.stage.Stage;

import java.io.*;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

public class WebViewTest extends Application {
    private static final String TEST_HTML = "test.html";

    private HttpServer server;

    public static void main(String[] args) {
        Application.launch(args);
    }

    @Override
    public void init() throws Exception {
        server = HttpServer.create(new InetSocketAddress(8000), 0);
        server.createContext("/", new MyHandler());
        server.setExecutor(null); // creates a default executor
        server.start();
    }

    @Override
    public void start(Stage stage) {
        WebView webView = new WebView();
        webView.getEngine().load("http://localhost:8000/" + TEST_HTML);
        Scene scene = new Scene(webView, 500, 500);
        stage.setScene(scene);
        stage.setTitle("WebView Test");
        stage.show();
    }

    @Override
    public void stop() throws Exception {
        server.stop(0);
    }

    static class MyHandler implements HttpHandler {
        @Override
        public void handle(HttpExchange httpExchange) {
            try {
                // Strip the leading slash so the resource lookup is relative to this class, not the root.
                String path = httpExchange.getRequestURI().getPath().substring(1);
                String testString = resourceAsString(path);
                System.out.println("testString = " + testString);
                if (testString != null) {
                    // Encode once, so the declared length matches the bytes actually sent.
                    byte[] body = testString.getBytes(StandardCharsets.UTF_8);
                    httpExchange.getResponseHeaders().put("Content-Type", List.of("text/html; charset=UTF-8"));
                    httpExchange.sendResponseHeaders(200, body.length);
                    try (OutputStream os = httpExchange.getResponseBody()) {
                        os.write(body);
                    }
                } else {
                    System.out.println("Unable to find resource: " + path);
                    httpExchange.sendResponseHeaders(404, -1); // -1: no response body
                }
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                httpExchange.close();
            }
        }

        private String resourceAsString(String fileName) throws IOException {
            try (InputStream is = WebViewTest.class.getResourceAsStream(fileName)) {
                if (is == null) return null;
                // Decode explicitly as UTF-8 instead of relying on the platform default charset.
                try (InputStreamReader isr = new InputStreamReader(is, StandardCharsets.UTF_8);
                     BufferedReader reader = new BufferedReader(isr)) {
                    return reader.lines().collect(Collectors.joining(System.lineSeparator()));
                }
            }
        }
    }
}
For this example to work, place the HTML test file from your question in the same location as your compiled WebViewTest.class, so that it can be loaded from there as a resource.
To run the example as a modular app, add the following to your module-info.java (in addition to your javafx module requirements and any other app requirements you need):
requires jdk.httpserver;
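Since the server only has to label the bytes as UTF-8, it does not need to decode them at all. The sketch below is a hedged variation on the example above, not a drop-in replacement. It serves files from a directory rather than from the classpath, which may suit an editor that opens arbitrary user files. It passes the raw bytes through untouched, binds to the loopback address only, and lets the OS choose a free port instead of hard-coding 8000. The names Utf8FileServer, start and urlPrefix are hypothetical, and it assumes Java 9 or later and that every served HTML file really is UTF-8.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: class and method names are illustrative, not an existing API.
class Utf8FileServer {

    // Starts a loopback-only server that serves the files under `root`.
    static HttpServer start(Path root) throws IOException {
        Path base = root.toAbsolutePath().normalize();
        // Port 0 lets the OS pick a free port, avoiding clashes on a fixed one.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> handle(exchange, base));
        server.start();
        return server;
    }

    // URL prefix for the started server; append a file name relative to the root.
    static String urlPrefix(HttpServer server) {
        return "http://127.0.0.1:" + server.getAddress().getPort() + "/";
    }

    private static void handle(HttpExchange exchange, Path base) throws IOException {
        try {
            String name = exchange.getRequestURI().getPath().substring(1);
            Path file = base.resolve(name).normalize();
            // Reject paths that escape the served directory or are not regular files.
            if (!file.startsWith(base) || !Files.isRegularFile(file)) {
                exchange.sendResponseHeaders(404, -1); // -1: no response body
                return;
            }
            byte[] body = Files.readAllBytes(file); // raw bytes, never decoded
            if (name.endsWith(".html") || name.endsWith(".htm")) {
                // Assumption: every HTML file being served is UTF-8 encoded.
                exchange.getResponseHeaders().set("Content-Type", "text/html; charset=UTF-8");
            }
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        } finally {
            exchange.close();
        }
    }
}
```

With this in place, init() could call Utf8FileServer.start(someDirectory) and start(Stage) could pass Utf8FileServer.urlPrefix(server) + "test.html" to webView.getEngine().load(...). Relative links then resolve against the same server. Non-HTML files are sent without a Content-Type here; a fuller version would set one per file type.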