Tags: r, tm, corpus, term-document-matrix

TermDocumentMatrix Error after Cleaning Corpus


I want to pass my corpus to the tm function TermDocumentMatrix(), but it fails with the error: Error in UseMethod("meta", x): no applicable method for 'meta' applied to an object of class "character".

To begin with, I have a data frame named "auth" that looks like this:

Author Messages
014588 Hi; How are you
123341 Hello; Fine u?
857635 The weather is fine; It looks Sunny; There are some clouds

The Author column is self-explanatory, and Messages holds everything written by that author, with individual messages separated by semicolons. The code that transforms the data frame into a corpus and cleans it looks like this:

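library(tm)    # VectorSource, VCorpus, tm_map, TermDocumentMatrix
library(qdap)  # replace_abbreviation, bracketX

auth_text <- auth$Messages  # column name as in the table above; R is case-sensitive
auth_text2 <- replace_abbreviation(auth_text)
auth_source <- VectorSource(auth_text2)
auth_corp <- VCorpus(auth_source)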

clean_corpus <- function(corpus) {
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, PlainTextDocument)
  corpus <- tm_map(corpus, removeWords, new_stop)
  corpus <- tm_map(corpus, stripWhitespace)
  corpus <- tm_map(corpus, bracketX)
  
  return(corpus)
}

clean_corp <- clean_corpus(auth_corp)

After cleaning, the corpus is passed to:

corp_tdm <- TermDocumentMatrix(clean_corp)

Running this command produces the error described above, and I can no longer view the corpus either. Could anyone help me out with this?
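One way to narrow down an error like this is to check whether each element of the corpus is still a tm document after a tm_map() step. The sketch below is only an illustration under assumptions: the two-document corpus is made up, toupper stands in for any plain character function, and is_doc is a hypothetical helper, not part of tm.

```r
library(tm)

# Made-up two-document corpus, used only for this check
corp <- VCorpus(VectorSource(c("Hi (there)", "Hello")))

# Plain character function passed directly: tm_map() stores its return value
unwrapped <- tm_map(corp, toupper)

# Same function wrapped so that only the document content is modified
wrapped <- tm_map(corp, content_transformer(toupper))

# Hypothetical helper: TRUE for each element that is still a tm text document
is_doc <- function(corpus) {
  vapply(content(corpus), inherits, logical(1), what = "TextDocument")
}

is_doc(unwrapped)
is_doc(wrapped)
```

Running the helper after each cleaning step should show which step first turns the documents into bare character vectors.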


Solution

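  • Removing corpus <- tm_map(corpus, bracketX) did the job, and the code now works correctly. The likely cause is that bracketX is not a tm transformation: tm_map() stores whatever the function returns, so each document became a bare character vector, and TermDocumentMatrix() could not call meta() on it.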
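If the bracket removal is still wanted, a possible alternative to dropping the step is to wrap the function in content_transformer(), the same way tolower is wrapped in the question. This is a minimal sketch under assumptions, not the asker's actual code: strip_brackets is a hypothetical base-R stand-in for qdap's bracketX (so the example needs only tm), new_stop is not shown in the question and is replaced here by tm's English stop word list, and the data frame is rebuilt from the sample rows.

```r
library(tm)

# Sample data modelled on the question's "auth" data frame
auth <- data.frame(
  Author   = c("014588", "123341", "857635"),
  Messages = c("Hi; How are you",
               "Hello; Fine u?",
               "The weather is fine; It looks Sunny; There are some clouds"),
  stringsAsFactors = FALSE
)

# Assumption: new_stop is undefined in the question, so use tm's English list
new_stop <- stopwords("en")

# Hypothetical stand-in for qdap::bracketX: drops text inside round brackets
strip_brackets <- function(x) gsub("\\([^)]*\\)", "", x)

clean_corpus <- function(corpus) {
  # Wrapped, so each element stays a PlainTextDocument with its metadata
  corpus <- tm_map(corpus, content_transformer(strip_brackets))
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removeWords, new_stop)
  corpus <- tm_map(corpus, stripWhitespace)
  corpus
}

auth_corp  <- VCorpus(VectorSource(auth$Messages))
clean_corp <- clean_corpus(auth_corp)
corp_tdm   <- TermDocumentMatrix(clean_corp)
inspect(corp_tdm)
```

The bracket step runs first here because removePunctuation would otherwise delete the brackets before they can be matched.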