I am using pyLDAvis with gensim.models.LdaMulticore for topic modeling, and I have 10 topics in total. When I visualize the results with pyLDAvis, there is a slider called lambda with this explanation: "Slide to adjust relevance metric". I would like to extract the list of words for each topic separately at lambda = 0.1, but I cannot find anything in the documentation about setting lambda when extracting keywords.
I am using these lines:
import pyLDAvis.gensim_models
LDAvis_prepared = pyLDAvis.gensim_models.prepare(lda_model, corpus, id2word, lambda_step=0.1)
LDAvis_prepared.topic_info
And these are the results:
       Term         Freq        Total Category  logprob  loglift
321      ra  2336.000000  2336.000000  Default  30.0000  30.0000
146     may  1741.000000  1741.000000  Default  29.0000  29.0000
66   doctor  1310.000000  1310.000000  Default  28.0000  28.0000
First, these results do not match what I see in the visualization with lambda set to 0.1. Second, the results are not separated by topic.
You may want to read this GitHub page: https://nicharuc.github.io/topic_modeling/
According to this example, your code could go like this:
import pandas as pd

lambd = 0.6  # a specific relevance metric value; use 0.1 to match the slider setting in the question
all_topics = {}
num_topics = lda_model.num_topics
num_terms = 10
for i in range(1, num_topics + 1):  # indices are 1-based, not 0-based
    # keep only the rows that belong to topic i
    topic = LDAvis_prepared.topic_info[LDAvis_prepared.topic_info.Category == 'Topic' + str(i)].copy()
    # relevance = lambda * logprob + (1 - lambda) * loglift
    topic['relevance'] = topic['loglift'] * (1 - lambd) + topic['logprob'] * lambd
    all_topics['Topic ' + str(i)] = topic.sort_values(by='relevance', ascending=False).Term[:num_terms].values
pd.DataFrame(all_topics).T
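To see why the term ranking changes with lambda, here is a minimal sketch of the relevance formula used above, relevance = lambda * logprob + (1 - lambda) * loglift, where logprob is log p(term | topic) and loglift is log(p(term | topic) / p(term)). It does not call pyLDAvis at all: the probabilities are made up for illustration, and the names `relevance`, `rank_terms` and `toy_topic` are hypothetical helpers, not part of any library.

```python
import math


def relevance(p_term_given_topic, p_term, lambd):
    """Relevance of a term for a topic at a given lambda (0 <= lambd <= 1)."""
    logprob = math.log(p_term_given_topic)           # log p(term | topic)
    loglift = math.log(p_term_given_topic / p_term)  # log of the lift
    return lambd * logprob + (1 - lambd) * loglift


# Made-up numbers: term -> (p(term | topic), p(term) over the whole corpus)
toy_topic = {
    "may": (0.05, 0.04),         # frequent in the topic, but frequent everywhere
    "doctor": (0.03, 0.01),
    "arthritis": (0.01, 0.001),  # rare overall, but concentrated in this topic
}


def rank_terms(topic, lambd):
    """Return the terms sorted from most to least relevant."""
    return sorted(
        topic,
        key=lambda term: relevance(topic[term][0], topic[term][1], lambd),
        reverse=True,
    )


print(rank_terms(toy_topic, 1.0))  # ranks by probability within the topic only
print(rank_terms(toy_topic, 0.1))  # ranks mostly by lift
```

With lambda close to 1 the common words come first, while with lambda = 0.1 the words that are specific to the topic move to the top. That is the same effect as moving the slider in the visualization.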