pythonpython-3.xnlpldatopic-modeling

Topic modeling on short texts Python


I want to do topic modeling on short texts. I did some research on LDA and found that it doesn't go well with short texts. What methods would be better and do they have Python implementations?


Solution

  • You can try Short Text Topic Modelling (refer to this https://www.groundai.com/project/sttm-a-tool-for-short-text-topic-modeling/1) (code available at https://github.com/qiang2100/STTM) . It combine state-of-the-art algorithms and traditional topics modelling for long text which can conveniently be used for short text.

    For more specialised libraries, try lda2vec-tf, which combines word vectors with LDA topic vectors. It is branched from the original lda2vec and improved upon and gives better results than the original library.