python nlp text-classification text-analysis

What features are good for sentence classification, apart from a vector representation like bag-of-words?


I am trying to determine whether a given sentence is a "question request", a "call for action", etc. I am using supervised multilabel classification for this.

What would be a good set of features to use? I am currently using bag-of-words with trigrams, modal verbs, question words, etc., but the results are not that good.

Input example: "Can you get this today? I need following items."
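For reference, the hand-crafted cues mentioned above (modal verbs, question words) could be computed roughly as in the sketch below. The word lists, the naive sentence splitter, and the function names `split_sentences` and `cue_features` are illustrative assumptions, not part of any library; the second-person cue is an extra guess at what might help with requests.

```python
import re

# Illustrative word lists (assumptions; extend them for real data).
MODAL_VERBS = {"can", "could", "may", "might", "must",
               "shall", "should", "will", "would"}
QUESTION_WORDS = {"what", "when", "where", "which", "who",
                  "whom", "whose", "why", "how"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def cue_features(sentence):
    """Return a dict of simple surface cues for one sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    first = tokens[0] if tokens else ""
    return {
        "ends_with_question_mark": sentence.rstrip().endswith("?"),
        "starts_with_modal": first in MODAL_VERBS,
        "starts_with_question_word": first in QUESTION_WORDS,
        "has_modal": any(t in MODAL_VERBS for t in tokens),
        # Hypothetical extra cue: requests are often addressed to "you".
        "has_second_person": "you" in tokens,
        "num_tokens": len(tokens),
    }


example = "Can you get this today? I need following items."
for sentence in split_sentences(example):
    print(sentence, cue_features(sentence))
```

Computing the cues per sentence rather than per message keeps the question ("Can you get this today?") separate from the statement that follows it, which matters when each sentence can carry different labels.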


Solution

  • word2vec (https://code.google.com/p/word2vec/) embeddings are probably a good feature.

    The Illinois Wikifier can also be very helpful: http://cogcomp.cs.illinois.edu/demo/wikify/?id=25

    Also take a look at features used for Dataless classification: http://cogcomp.cs.illinois.edu/page/project_view/6
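As a rough sketch of how word vectors could become a sentence-level feature, one common option is to average the vectors of the words in the sentence and feed the result to the classifier alongside the other features. The example below uses a tiny made-up table, `TOY_VECTORS`, in place of real word2vec output; the values, the 3-dimensional size, and the function name `sentence_vector` are assumptions for illustration only.

```python
import re

# Made-up 3-dimensional vectors for illustration. Real word2vec vectors
# would come from a trained model and usually have 100+ dimensions.
TOY_VECTORS = {
    "can": [0.9, 0.1, 0.0],
    "you": [0.7, 0.2, 0.1],
    "get": [0.1, 0.8, 0.1],
    "need": [0.2, 0.7, 0.3],
    "items": [0.0, 0.3, 0.9],
}
DIM = 3


def sentence_vector(sentence, vectors=TOY_VECTORS, dim=DIM):
    """Average the vectors of all known words; zeros if none are known."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    # zip(*known) groups the values of each dimension together.
    return [sum(column) / len(known) for column in zip(*known)]


print(sentence_vector("Can you get this today?"))
print(sentence_vector("I need following items."))
```

Averaging discards word order, so it may work best in combination with order-sensitive features such as the n-grams already in use.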