
WordNet Database Directory Relative Path in .jar


To use JAWS, the WordNet database directory must be specified within the application, as described here: http://lyle.smu.edu/~tspell/jaws/.

System.setProperty("wordnet.database.dir", "resources/WordNet-3.0/dict/");

This works well when run from an IDE, but not once the application is packaged as a .jar file. I therefore need the path of the WordNet directory inside the .jar file. I have already tried various approaches; the latest is:

String url = this.getClass().getClassLoader()
            .getResource("resources/WordNet-3.0/dict/index.sense").toExternalForm();
url = url.substring(0, url.length() - "index.sense".length()); // strip the file name, keep the trailing slash

When run from the .jar file, this yields:

jar:file:/C:/file/path/jar.jar!/resources/WordNet-3.0/dict/

I thought this would be the solution, but the directory, or rather index.sense, still cannot be found when the .jar is run. index.sense is supposed to be loaded through:

WordNetDatabase database = WordNetDatabase.getFileInstance();

This call uses the specified directory. I found a couple of related threads, but none of their suggestions worked. For example:

  • How to get the path of a running JAR file?
  • How to get a path to a resource in a Java JAR file
  • Loading a file relative to the executing jar file


Solution

  • You cannot use a path that points into a JAR file. The WordNet API does not use a virtual file system, so it cannot look inside the resource URL you are trying to generate. It uses a File and a FileReader wrapped in a BufferedReader, none of which can open a name such as file:///some/dir/to/file.jar!something/in/a/jar.

    What you can do instead is have your program first locate the correct JAR, using a method like the one you show above for locating resources in some JAR XYZ, then unzip the relevant contents of XYZ into a temporary cache directory, and finally point the wordnet.database.dir system property at that cache directory.

    The best way to unzip a JAR from within Java depends on which JDK version you are using, but there are plenty of examples available.

    An alternative is to modify the JAWS library so that it loads resources from streams instead of files when the wordnet.database.dir property is given in a format such as classpath:relative/path/in/jars, and otherwise treats the value as a file path. Find every place in the library that reads a file and replace the stream creation with a common utility function that returns a BufferedReader wrapping the appropriate type of input stream.
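
Both approaches can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from JAWS: the class WordNetResources and its methods extractDir, runningJar and openReader are hypothetical names, the code assumes Java 7 or later (java.nio.file), and it assumes the dictionary files can be read as UTF-8 text. The classpath: prefix is the convention suggested above, not something JAWS understands out of the box.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper class; none of these names come from JAWS.
final class WordNetResources {

    /** Locates the JAR (or class directory) this class was loaded from. */
    static Path runningJar() throws Exception {
        return Paths.get(WordNetResources.class.getProtectionDomain()
                .getCodeSource().getLocation().toURI());
    }

    /**
     * Approach 1: copies every file whose entry name starts with entryPrefix
     * (for example "resources/WordNet-3.0/dict/") out of the JAR into a new
     * temporary directory and returns that directory.
     */
    static Path extractDir(Path jarPath, String entryPrefix) throws IOException {
        Path target = Files.createTempDirectory("wordnet-dict");
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                if (entry.isDirectory() || !name.startsWith(entryPrefix)) {
                    continue; // not part of the dictionary directory
                }
                Path out = target.resolve(name.substring(entryPrefix.length())).normalize();
                if (!out.startsWith(target)) {
                    throw new IOException("Entry escapes target directory: " + name);
                }
                Files.createDirectories(out.getParent());
                try (InputStream in = jar.getInputStream(entry)) {
                    Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
        return target;
    }

    /**
     * Approach 2: the kind of common utility a patched library could call.
     * "classpath:some/path" is read through the class loader; anything else
     * is treated as an ordinary file path.
     */
    static BufferedReader openReader(String location) throws IOException {
        final String prefix = "classpath:";
        if (location.startsWith(prefix)) {
            String resource = location.substring(prefix.length());
            InputStream in = WordNetResources.class.getClassLoader()
                    .getResourceAsStream(resource);
            if (in == null) {
                throw new FileNotFoundException("Not on classpath: " + resource);
            }
            return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        }
        return Files.newBufferedReader(Paths.get(location), StandardCharsets.UTF_8);
    }
}
```

With the first approach, the application would call something like extractDir(runningJar(), "resources/WordNet-3.0/dict/") at startup, pass the returned directory's toString() to System.setProperty("wordnet.database.dir", ...), and only then call WordNetDatabase.getFileInstance(). The sketch does not delete the temporary directory or reuse it between runs; a real cache would probably want to do both.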