nlp gensim lda

Cannot see DEBUG logs with "number of documents converged" info when running Gensim's LDA, as suggested for choosing iterations and passes


The official Gensim tutorial describes how to choose the number of iterations and passes:

I suggest the following way to choose iterations and passes. First, enable logging (as described in many Gensim tutorials), and set eval_every = 1 in LdaModel. When training the model look for a line in the log that looks something like this:

2016-06-21 15:40:06,753 - gensim.models.ldamodel - DEBUG - 68/1566 documents converged within 400 iterations

I have never seen anything like this line in my LDA logs, though. These are my logs on Pastebin. I followed the official tutorial.

I'm enabling logging like this:

logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO,
                            filename='content_based_algorithms/training_logs/lda/logs.log')

I even tried to define the callbacks explicitly:

import gensim
from gensim.models.callbacks import ConvergenceMetric, PerplexityMetric

perplexity_logger = PerplexityMetric(corpus=corpus, logger='shell')
convergence_logger = ConvergenceMetric(logger='shell')

lda_model = gensim.models.LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics, passes=passes, alpha=alpha, eta=eta, update_every=1, eval_every=1, callbacks=[convergence_logger, perplexity_logger])

I've tested this on Windows in the PyCharm IDE and on Ubuntu by running the Python script from the command line.

Possible duplicate of the post "Gensim LDA logging not displaying".


Solution

  • The line

    DEBUG - 68/1566 documents converged within 400 iterations

    is logged at DEBUG level, so a configuration with level=logging.INFO filters it out. It can be written to the log file by changing the logging level to DEBUG. In your case that would be something like this:

    logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s',
                       level=logging.DEBUG,
                       filename='content_based_algorithms/training_logs/lda/logs.log')
    

    Now the line will appear in the log file.
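The effect of the level setting can be checked without Gensim at all. The following is a minimal sketch using only the standard library: it assumes a logger named gensim.models.ldamodel (the name shown in the tutorial's log line) and emits two stand-in messages by hand. The helper name capture and the INFO message text are made up for illustration; only the DEBUG message is copied from the question.

```python
import io
import logging


def capture(level):
    """Return what a logger with the given level actually writes out."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter('%(levelname)s : %(message)s'))
    # Stand-in for the logger name seen in the tutorial's log line;
    # Gensim itself is not imported or run here.
    logger = logging.getLogger('gensim.models.ldamodel')
    logger.setLevel(level)
    logger.addHandler(handler)
    try:
        logger.info('placeholder INFO message')  # hypothetical message
        logger.debug('68/1566 documents converged within 400 iterations')
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


info_output = capture(logging.INFO)    # DEBUG record is dropped
debug_output = capture(logging.DEBUG)  # DEBUG record is kept
print(info_output)
print(debug_output)
```

With the level at INFO only the placeholder INFO message is written; with DEBUG both messages are, which matches the behaviour described in the question and the fix above.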