I am attempting to use Gensim's Mallet wrapper. When I run the following code:
import os
import gensim
os.environ.update({
    'MALLET_HOME':
        r"C:\Users\me\OneDrive - My Company\Documents\Projects\Current\mallet-2.0.8"
})
lda_mallet = gensim.models.wrappers.LdaMallet(
    r"C:\Users\me\OneDrive - My Company\Documents\Projects\Current\mallet-2.0.8\bin\mallet",
    corpus=corpus,
    num_topics=10,
    id2word=id_dict)
I get the following errors:
'C:\Users\me\OneDrive' is not recognized as an internal or external command,
operable program or batch file.
subprocess.CalledProcessError: Command 'C:\Users\me\OneDrive - My Company\Documents\Projects\Current\mallet-2.0.8\bin\mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input C:\Users\me\AppData\Local\Temp\17fe21_corpus.txt --output C:\Users\me\AppData\Local\Temp\17fe21_corpus.mallet' returned non-zero exit status 1.
After exhaustive online searches, I have found many proposed solutions that unfortunately do not resolve my issue.
The first error message cuts the path off at the first space, so I believe the spaces are causing the problem.
Unfortunately, my company requires that I use this directory and I cannot change the name. Is there a way to "escape" the spaces in order to run my code?
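Well, that's easy: the fault is in the LdaMallet class, not in your code. Judging by the error, it builds the whole command as one unquoted string and hands it to the Windows shell, which takes everything up to the first space ('C:\Users\me\OneDrive') as the program name. That is badly written software, so report it as a bug to its creators.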
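In the meantime, one workaround that may be worth trying is to wrap the path in double quotes yourself before handing it to the wrapper. The sketch below assumes the wrapper pastes the path unchanged at the start of a shell command string, which is what the traceback suggests. `quote_if_spaces` is a hypothetical helper, not part of gensim, and I have not tested this against LdaMallet.

```python
def quote_if_spaces(path: str) -> str:
    """Wrap a path in double quotes if it contains a space and is not already quoted.

    Hypothetical helper for illustration only; it is not part of gensim.
    """
    already_quoted = path.startswith('"') and path.endswith('"')
    if " " in path and not already_quoted:
        return f'"{path}"'
    return path


mallet_path = (
    r"C:\Users\me\OneDrive - My Company\Documents\Projects\Current"
    r"\mallet-2.0.8\bin\mallet"
)
quoted_path = quote_if_spaces(mallet_path)
print(quoted_path)

# Assumed usage (untested): pass the quoted path instead of the raw one.
# lda_mallet = gensim.models.wrappers.LdaMallet(
#     quoted_path, corpus=corpus, num_topics=10, id2word=id_dict)
```

If quoting does not help, the other common escape hatches are the Windows 8.3 short form of the directory name, or a junction from a space-free location to the required folder. Both leave the company-mandated directory untouched.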