
Text mining: how to build a term-document matrix


I am trying to load a CSV file and convert it to a term-document matrix.

Here is part of my code:

myCorpus<-read.csv('alert-sample-data-4-mining.csv', head=TRUE)
TermDocumentMatrix(myCorpus, control=list(wordLengths=c(1,Inf)))

But I get this error message: Error in UseMethod("TermDocumentMatrix", x) : no applicable method for 'TermDocumentMatrix' applied to an object of class "data.frame"


Solution

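  • Two things are missing here: you're not loading the tm library, and you're not creating a corpus. read.csv() returns a data.frame, and TermDocumentMatrix() has no method for that class, which is exactly what the error says. Try something like this (assuming your text data is in a column called "text" in the CSV file):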

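    library(tm)
    # Read the CSV into a data frame; keep text as character, not factor
    myCorpus <- read.csv("alert-sample-data-4-mining.csv", stringsAsFactors = FALSE)
    # Build a corpus from the text column, one document per row
    corpus <- Corpus(VectorSource(myCorpus$text))
    # Same control option as in the question: keep terms of any length
    TermDocumentMatrix(corpus, control = list(wordLengths = c(1, Inf)))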
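
If you want to try the same steps without the CSV file, here is a minimal, self-contained sketch. The `alerts` data frame and its three sample rows are made up for illustration and only stand in for whatever read.csv() would return; the column name "text" is the same assumption as above.

```r
library(tm)

# Hypothetical stand-in for the data frame read.csv() would return.
# The column name "text" is an assumption; check names(your_data) first.
alerts <- data.frame(
  text = c("disk usage high", "cpu usage high", "disk failure"),
  stringsAsFactors = FALSE
)

# Wrap the character vector in a source, then build a corpus from it.
# Each element of the vector becomes one document.
corpus <- VCorpus(VectorSource(alerts$text))

# Build the term-document matrix; wordLengths = c(1, Inf) keeps
# terms of any length, as in the question.
tdm <- TermDocumentMatrix(corpus, control = list(wordLengths = c(1, Inf)))

# Look at the result: terms are rows, documents are columns.
inspect(tdm)

# Convert to an ordinary matrix if you need to index it directly.
m <- as.matrix(tdm)
```

The key point is that TermDocumentMatrix() is only ever given a corpus, never the data frame itself. If your text column has a different name, replace `alerts$text` with that column.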