python subprocess mallet

How to fix this error in Mallet: "returned non-zero exit status 1"?


Please help me with the following error. I have tried many things to fix it, without success. The code:

import gensim
from tqdm import tqdm

MALLET_PATH = './Mallet/bin/mallet'

def topic_model_coherence_generator(corpus, texts, dictionary, start_topic_count=2, end_topic_count=10, step=1,cpus=1):
    models = []
    coherence_scores = []
    for topic_nums in tqdm(range(start_topic_count, end_topic_count + 1, step)):
        mallet_lda_model = gensim.models.wrappers.LdaMallet(mallet_path=MALLET_PATH, corpus=corpus, num_topics=topic_nums, id2word=dictionary, iterations=500, workers=cpus)

        cv_coherence_model_mallet_lda = gensim.models.CoherenceModel(model=mallet_lda_model, corpus=corpus, texts=texts, dictionary=dictionary, coherence='c_v')
        coherence_score = cv_coherence_model_mallet_lda.get_coherence()
        coherence_scores.append(coherence_score)
        models.append(mallet_lda_model)
    return models, coherence_scores

lda_models, coherence_scores = topic_model_coherence_generator(corpus=bow_corpus, texts=norm_corpus_bigrams, dictionary=dictionary, start_topic_count=2, end_topic_count=30, step=1, cpus=16)

The error:

0%|          | 0/29 [00:00<?, ?it/s]'.' is not recognized as an internal or external command,
operable program or batch file.
subprocess.CalledProcessError: Command './Mallet/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input C:\Users\yaiza\AppData\Local\Temp\b44add_corpus.txt --output C:\Users\yaiza\AppData\Local\Temp\b44add_corpus.mallet' returned non-zero exit status 1.

If I set MALLET_PATH = 'C:/Program files/Mallet/bin/mallet' or MALLET_PATH = r'C:/Program files/Mallet/bin/mallet', I get the same error, but the first line changes to:

0%|          | 0/29 [00:00<?, ?it/s]'C:/Program files' is not recognized as an internal or external command,
operable program or batch file.

Thank you


Solution

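  • Make sure you have installed the Java Development Kit (JDK). MALLET runs on Java, so the launcher cannot work without it.

    Credit goes to another answer.

    After installing the JDK and restarting the computer so the changes take effect, the following code for LDA Mallet worked like a charm. Note that it also sets MALLET_HOME and points mallet_path at mallet.bat in a folder whose path contains no spaces: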

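    import os
    from gensim.models.wrappers import LdaMallet  # wrappers exist only in gensim < 4.0

    # Tell MALLET where it is installed.
    os.environ.update({'MALLET_HOME': r'C:/mallet/mallet-2.0.8/'})
    # On Windows, point at the .bat launcher, in a path without spaces.
    mallet_path = r'C:/mallet/mallet-2.0.8/bin/mallet.bat'

    # corpus_bow, n_topics and dct are your bag-of-words corpus, topic count and Dictionary.
    lda_mallet = LdaMallet(
        mallet_path,
        corpus=corpus_bow,
        num_topics=n_topics,
        id2word=dct,
    )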
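Before calling LdaMallet, it can help to check the things this answer depends on: Java on PATH, a launcher file that actually exists, a path without spaces, and MALLET_HOME. The sketch below is only an illustration. `check_mallet_setup` is a hypothetical helper, not part of gensim or MALLET, and the space check assumes the command reaches the shell unquoted, as the error messages in the question suggest.

```python
import os
import shutil


def check_mallet_setup(mallet_path, mallet_home=None):
    """Hypothetical helper: return a list of problems; an empty list means none were found."""
    problems = []

    # MALLET is a Java program, so a `java` executable has to be on PATH.
    if shutil.which("java") is None:
        problems.append("java was not found on PATH (is the JDK installed?)")

    # The launcher must exist; on Windows this is normally bin/mallet.bat.
    if not os.path.isfile(mallet_path):
        problems.append(f"no file at mallet_path: {mallet_path}")

    # Assumption: an unquoted path with a space may be split by the shell.
    if " " in mallet_path:
        problems.append(f"mallet_path contains a space: {mallet_path}")

    # The answer sets MALLET_HOME, so check that it points at a real directory.
    home = mallet_home or os.environ.get("MALLET_HOME")
    if not home:
        problems.append("MALLET_HOME is not set")
    elif not os.path.isdir(home):
        problems.append(f"MALLET_HOME is not a directory: {home}")

    return problems
```

For example, printing each item of `check_mallet_setup(r'C:/mallet/mallet-2.0.8/bin/mallet.bat')` before building the model shows which prerequisite is missing, instead of only the generic "non-zero exit status 1".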