I am using pyLDAvis to visualise the results of an LDA model trained with Mallet.
Before I can do that, I need to convert the model with the gensim wrapper:
model = gensim.models.wrappers.ldamallet.malletmodel2ldamodel(model_list[8])
When I print the topics found, they are numbered 0-10.
However, when I visualise the topics with pyLDAvis, the topic numbering does not match the printed topics.
Example:
(5,
'0.042*"euro" + 0.030*"smartpho" + 0.022*"camera" + 0.020*"display" + '
'0.018*"model" + 0.016*"picture" + 0.012*"price" + 0.010*"android"')
As you can see, this topic is about smartphones.
However, when I visualise the model with pyLDAvis, Topic 5 is not about smartphones but about something else (cars, for example). The smartphone topic is no longer Topic 5 but Topic 1.
Is this a known bug, or is this the normal behaviour? Can somebody help?
By default, pyLDAvis sorts the topics by topic proportion. To keep the original order, pass sort_topics=False to pyLDAvis.prepare(). Note that the pyLDAvis topic numbers will still be off by one (i.e., Topic 1 in pyLDAvis will be Topic 0 from gensim).
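As a rough sketch of how that might look with a gensim model: the helper names prepare_unsorted and vis_to_gensim_id below are made up for illustration, and I am assuming you go through the gensim helper module of pyLDAvis (called pyLDAvis.gensim in older releases and pyLDAvis.gensim_models in newer ones), which forwards sort_topics to pyLDAvis.prepare().

```python
def prepare_unsorted(model, corpus, dictionary):
    """Build the pyLDAvis data while keeping gensim's topic order.

    `model` is the converted gensim LDA model, `corpus` the bag-of-words
    corpus and `dictionary` the gensim Dictionary used for training.
    """
    # Imported lazily so the helper below works without pyLDAvis installed.
    # Assumption: one of these two module names exists in your version.
    try:
        import pyLDAvis.gensim_models as gensimvis  # newer releases
    except ImportError:
        import pyLDAvis.gensim as gensimvis  # older releases
    # sort_topics=False stops pyLDAvis from reordering by topic proportion.
    return gensimvis.prepare(model, corpus, dictionary, sort_topics=False)


def vis_to_gensim_id(vis_topic_number):
    """Map a 1-based pyLDAvis topic number to a 0-based gensim topic id.

    Hypothetical helper; only valid when sort_topics=False was used.
    """
    if vis_topic_number < 1:
        raise ValueError("pyLDAvis topic numbers start at 1")
    return vis_topic_number - 1
```

With that in place, the smartphone topic printed as 5 by gensim should show up as Topic 6 in the visualisation, since vis_to_gensim_id(6) gives 5.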
There is a similar question here: Is there any way to match Gensim LDA output with topics in pyLDAvis graph?
And an associated issue on the pyLDAvis repo: https://github.com/bmabey/pyLDAvis/issues/127