machine-learning weka text-classification feature-selection

How can I apply feature reduction methods in Weka?


  1. How can I apply feature reduction methods such as LSI in Weka for text classification?

  2. Can feature reduction methods such as LSI improve classification accuracy?


Solution

    1. Take a look at the FilteredClassifier class or at AttributeSelectedClassifier. With FilteredClassifier you can apply a feature reduction method such as Principal Component Analysis (PCA) as a filter. Here is a video showing how to filter your dataset with PCA, so that you can try different classifiers on the reduced dataset.

    2. It can help, but there is no guarantee. If you remove redundant features, or transform the features in some way (as SVD or PCA do), the classification task can become simpler. In any case, a large number of features usually leads to the curse of dimensionality, and attribute selection is one way to avoid it.
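To make the point about redundant features concrete, here is a minimal plain-Java sketch. It does not use Weka's API and it is neither LSI nor PCA: it only drops columns of a small term-count matrix that are constant or exact duplicates of an earlier column. The class name `RedundantFeatureFilter`, its methods, and the toy data are hypothetical, chosen for illustration. In Weka itself you would configure this kind of step through the classes mentioned above rather than write it by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RedundantFeatureFilter {

    /** Returns indices of columns that are neither constant nor duplicates of an earlier kept column. */
    public static int[] selectColumns(double[][] data) {
        if (data.length == 0) {
            return new int[0];
        }
        int numCols = data[0].length;
        List<Integer> kept = new ArrayList<>();
        for (int c = 0; c < numCols; c++) {
            if (isConstant(data, c)) {
                continue; // a constant column carries no information for a classifier
            }
            boolean duplicate = false;
            for (int k : kept) {
                if (sameColumn(data, c, k)) {
                    duplicate = true; // identical to a column we already keep
                    break;
                }
            }
            if (!duplicate) {
                kept.add(c);
            }
        }
        int[] out = new int[kept.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = kept.get(i);
        }
        return out;
    }

    /** True if every row has the same value in the given column. */
    static boolean isConstant(double[][] data, int col) {
        for (double[] row : data) {
            if (row[col] != data[0][col]) {
                return false;
            }
        }
        return true;
    }

    /** True if columns a and b hold identical values in every row. */
    static boolean sameColumn(double[][] data, int a, int b) {
        for (double[] row : data) {
            if (row[a] != row[b]) {
                return false;
            }
        }
        return true;
    }

    /** Copies only the selected columns into a new, narrower matrix. */
    public static double[][] project(double[][] data, int[] cols) {
        double[][] out = new double[data.length][cols.length];
        for (int r = 0; r < data.length; r++) {
            for (int j = 0; j < cols.length; j++) {
                out[r][j] = data[r][cols[j]];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Rows are documents, columns are term counts (toy data).
        // Column 0 is constant; column 3 duplicates column 1.
        double[][] docs = {
            {1, 2, 0, 2},
            {1, 0, 3, 0},
            {1, 1, 1, 1},
        };
        int[] kept = selectColumns(docs);
        System.out.println(Arrays.toString(kept)); // prints [1, 2]
        System.out.println(Arrays.deepToString(project(docs, kept)));
    }
}
```

On the toy matrix, four attributes shrink to two without losing any information a classifier could use. Real text data rarely has exact duplicates, which is why transformations such as PCA, which combine correlated attributes, are usually more effective.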