My goal is to display the special letters in a message as regular text after using StringEscapeUtils.escapeHtml4.
Text example:
<html>
<body>
<p>éô</p>
</body>
</html>
I expect all the HTML tags to be escaped, but not the text, which here is: éô
Code example:
String original = "<html><head><\\head><body>éô";
System.out.println("original: " + original);
String translated = StringEscapeUtils.escapeHtml4(original);
System.out.println("translated: " + translated);
Output:
original: <html><head><\head><body>éô
translated: &lt;html&gt;&lt;head&gt;&lt;\head&gt;&lt;body&gt;&eacute;&ocirc;
I expect to get:
&lt;html&gt;&lt;head&gt;&lt;\head&gt;&lt;body&gt;éô
I think I found the solution, which is mentioned here: Escape HTML in Languages with Accented Letters
It creates a custom escaping translator that uses only two lookup translators:
public static final CharSequenceTranslator ESCAPE_HTML4_CUSTOM =
    new AggregateTranslator(
        new LookupTranslator(EntityArrays.BASIC_ESCAPE()),
        new LookupTranslator(EntityArrays.HTML40_EXTENDED_ESCAPE())
    );
The original StringEscapeUtils.escapeHtml4 method uses three, including ISO8859_1_ESCAPE, which is the one that escapes accented letters such as é and ô:
public static final CharSequenceTranslator ESCAPE_HTML4 =
    new AggregateTranslator(
        new LookupTranslator(EntityArrays.BASIC_ESCAPE()),
        new LookupTranslator(EntityArrays.ISO8859_1_ESCAPE()),
        new LookupTranslator(EntityArrays.HTML40_EXTENDED_ESCAPE())
    );
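To check the idea without the library, here is a minimal dependency-free sketch of what escaping with only the basic table amounts to. BasicHtmlEscaper and its escape method are hypothetical names of my own, not part of Commons Lang, and the sketch covers only the four characters in EntityArrays.BASIC_ESCAPE() (it does not reproduce HTML40_EXTENDED_ESCAPE):

```java
// Hypothetical helper, not part of Commons Lang: escapes only the four
// characters covered by EntityArrays.BASIC_ESCAPE() and leaves the rest as-is.
final class BasicHtmlEscaper {

    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                default:   out.append(c); // accented letters pass through unchanged
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String original = "<html><head><\\head><body>éô";
        System.out.println("translated: " + escape(original));
    }
}
```

With this, the tags in the example string come back as &lt;...&gt; while éô stays untouched, which is the behavior I am after from ESCAPE_HTML4_CUSTOM.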