java character-encoding chardet

Java chardet that detects iso-8859-2


Is there a Java version of the Python chardet that detects ISO-8859-2? I've tried Mozilla's universalchardet and jchardet, and neither worked: both guessed windows-1252, whereas the Python chardet that ships with Linux detected it correctly.


Solution

  • I had good experience with IBM's ICU4J for charset detection, including ISO-8859-2 (http://site.icu-project.org/). It consistently gave the best (most accurate) results for the files we were using in our tests. I didn't come across a Java version of the Python chardet while doing the research.
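
For anyone who wants to try ICU4J on their own files, here is a minimal sketch of how its detector can be called. It assumes the ICU4J jar is on the classpath; `CharsetGuess` and `rankCharsets` are hypothetical names made up for this example, and only `CharsetDetector` and `CharsetMatch` come from ICU4J. It prints every candidate rather than just the top one, so you can see how ISO-8859-2 ranks against windows-1252 for your input.

```java
import com.ibm.icu.text.CharsetDetector;
import com.ibm.icu.text.CharsetMatch;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class for illustration; not part of ICU4J.
class CharsetGuess {

    // Returns the candidates ICU4J reports for the given bytes,
    // or an empty array if the detector reports none.
    static CharsetMatch[] rankCharsets(byte[] data) {
        CharsetDetector detector = new CharsetDetector();
        detector.setText(data);
        CharsetMatch[] matches = detector.detectAll();
        return matches == null ? new CharsetMatch[0] : matches;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the first argument is the path of the file to inspect.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        for (CharsetMatch m : rankCharsets(data)) {
            System.out.println(m.getName()
                    + " confidence=" + m.getConfidence()
                    + " language=" + m.getLanguage());
        }
    }
}
```

The detection is statistical, so results on very short inputs can be unreliable; feeding it a whole file, as above, tends to give a more meaningful ranking.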