
OpenNLP in Android getting FileNotFoundException when trying to initialize posModel


I am using OpenNLP in my project. I would like to use the dictionary lemmatizer, but I'm unable to pass in the POS model correctly. I have a SummaryActivity class that calls this function on the click of a button:

private String summaryTool(String documentText) throws Exception {
        InputStream sentInput = getAssets().open("en_sent.bin");
        InputStream tokenInput = getAssets().open("en_token.bin");
        InputStream lemmaInput = getAssets().open("en_lemmatizer.dict");
        FileInputStream posInput = getApplicationContext().openFileInput("en_pos_maxent.bin");

        preProcessor = new PreProcessor(sentInput, tokenInput, lemmaInput, posInput);
        grapher = new Grapher();

        sentInput.close();
        tokenInput.close();
        lemmaInput.close();
        posInput.close();

And here is my PreProcessor class constructor, where the posModel is initialized:

 public PreProcessor(InputStream sentenceModel, InputStream tokenizerModel, InputStream lemmaModel, InputStream pos) throws IOException {
        SentenceModel sentModel = new SentenceModel(sentenceModel);
        sentenceDetector = new SentenceDetectorME(sentModel);
        TokenizerModel tokenModel = new TokenizerModel(tokenizerModel);
        tokenizer = new TokenizerME(tokenModel);
        // line with exception
        POSModel posModel = new POSModel(pos);
        posTagger = new POSTaggerME(posModel);
        lemmatizer = new DictionaryLemmatizer(lemmaModel);
    }

This is what my src folder looks like; I have put the en_pos_maxent.bin file in multiple places while trying to load it.

I have tried using getAssets().open("en_pos_maxent.bin"); but I receive an invalid stream format exception:

W/System.err: opennlp.tools.util.InvalidFormatException: The profile data stream has an invalid format!
        at opennlp.tools.dictionary.serializer.DictionaryEntryPersistor.create(DictionaryEntryPersistor.java:224)
        at opennlp.tools.postag.POSDictionary.create(POSDictionary.java:228)
        at opennlp.tools.postag.POSTaggerFactory$POSDictionarySerializer.create(POSTaggerFactory.java:296)
        at opennlp.tools.postag.POSTaggerFactory$POSDictionarySerializer.create(POSTaggerFactory.java:293)
        at opennlp.tools.util.model.BaseModel.finishLoadingArtifacts(BaseModel.java:312)
        at opennlp.tools.util.model.BaseModel.loadModel(BaseModel.java:242)
        at opennlp.tools.util.model.BaseModel.<init>(BaseModel.java:176)
        at opennlp.tools.postag.POSModel.<init>(POSModel.java:97)
        at com.mtah.tools.PreProcessor.<init>(PreProcessor.java:46)
        at com.mtah.summerizer.SummaryActivity.summaryTool(SummaryActivity.java:57)
        at com.mtah.summerizer.SummaryActivity.onCreate(SummaryActivity.java:36)
        at android.app.Activity.performCreate(Activity.java:7820)
        at android.app.Activity.performCreate(Activity.java:7809)
        at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1318)
        at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:3362)
        at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:3526)
        at android.app.servertransaction.LaunchActivityItem.execute(LaunchActivityItem.java:83)
        at android.app.servertransaction.TransactionExecutor.executeCallbacks(TransactionExecutor.java:135)
        at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:95)
        at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2122)
        at android.os.Handler.dispatchMessage(Handler.java:107)
        at android.os.Looper.loop(Looper.java:214)
        at android.app.ActivityThread.main(ActivityThread.java:7695)
W/System.err:     at java.lang.reflect.Method.invoke(Native Method)
        at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:516)
        at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:950)
    Caused by: org.xml.sax.SAXException: Can't create default XMLReader; is system property org.xml.sax.driver set?
        at org.xml.sax.helpers.XMLReaderFactory.createXMLReader(XMLReaderFactory.java:160)
        at opennlp.tools.dictionary.serializer.DictionaryEntryPersistor.create(DictionaryEntryPersistor.java:219)
        ... 25 more

I have also tried the solution from this post, but I get an exception: java.lang.NullPointerException: Attempt to invoke virtual method 'java.lang.String java.util.Properties.getProperty(java.lang.String)' on a null object reference

My question is: how do I open the "en_pos_maxent.bin" file with FileInputStream correctly? It seems OpenNLP's POSModel only accepts an InputStream of type FileInputStream. Sorry if I have left out any info; I do not post often, so please let me know and I will include it. Any help would be appreciated.


Solution

  • This answer from this post worked for me.

    // Set the SAX driver that the stack trace reports as missing, before loading the model.
    System.setProperty("org.xml.sax.driver", "org.xmlpull.v1.sax2.Driver");
    try {
        AssetFileDescriptor fileDescriptor =
            getApplicationContext().getAssets().openFd("en_pos_maxent.bin");
        FileInputStream inputStream = fileDescriptor.createInputStream();
        POSModel posModel = new POSModel(inputStream);
        posTaggerME = new POSTaggerME(posModel);
    } catch (Exception e) {
        // Handle exception
    }
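
The "Caused by" line in the stack trace asks whether the system property org.xml.sax.driver is set, so the property has to be in place before the POSModel is constructed. A minimal sketch of one way to keep that in a single spot is below. The class SaxDriverConfig and the method ensureSaxDriver are hypothetical names made up for illustration; they are not part of OpenNLP or Android. The sketch assumes you only want to set the property when nothing else has set it already.

```java
/**
 * Hypothetical helper (not part of OpenNLP or Android): makes sure the
 * "org.xml.sax.driver" system property has a value before any model is loaded.
 */
final class SaxDriverConfig {
    // Property name taken from the SAXException message in the stack trace.
    static final String SAX_DRIVER_KEY = "org.xml.sax.driver";

    private SaxDriverConfig() {
        // Utility class, no instances.
    }

    /**
     * Sets the SAX driver property to driverClassName if it is missing or empty.
     * Returns the value that is in effect after the call.
     */
    static String ensureSaxDriver(String driverClassName) {
        String current = System.getProperty(SAX_DRIVER_KEY);
        if (current == null || current.isEmpty()) {
            System.setProperty(SAX_DRIVER_KEY, driverClassName);
            return driverClassName;
        }
        // Leave an existing setting alone so other code is not surprised.
        return current;
    }
}
```

With this in place, something like SaxDriverConfig.ensureSaxDriver("org.xmlpull.v1.sax2.Driver") would be called once, for example at the start of onCreate, before the PreProcessor is created.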