r tidy quanteda qdap tidytext

How to Cast a Dataframe into a DTM


I'd like to cast my table into a DTM and maintain the metadata.

Sample Data

Each row should be a document. However, cast_dtm() needs a count variable: before the data can be cast, it has to be in the "Document, Term, Count" format, with one row per document-term pair.

How do I convert my data into the "Document, Term, Count" dataframe? From there, it's easy to cast into a DTM, and then do what I need.


Solution

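  • Try this:

    library(tm)
    # Pass the text column, not the whole data frame, so each row becomes one document
    myCorpus <- Corpus(VectorSource(df$column))
    dtm <- DocumentTermMatrix(myCorpus)
    

    I've used this code for a text-mining project. Note that VectorSource() should get the column that holds the text (df$column here, with column standing in for your own column name) rather than the whole data frame df.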
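
  • Another option is the tidytext route, which builds the "Document, Term, Count" table the question asks about before casting. This is only a minimal sketch: the data frame and the column names doc_id, author, and text are made up for illustration, since the original sample data isn't shown. It assumes dplyr, tidytext, and tm are installed (cast_dtm() returns a tm DocumentTermMatrix).

```r
library(dplyr)
library(tidytext)

# Hypothetical sample data: one row per document, plus one metadata column
df <- tibble(
  doc_id = c("d1", "d2"),
  author = c("ann", "bob"),
  text   = c("the cat sat", "the dog sat down")
)

# Split the text into words, then count them per document.
# The result has one row per (document, term) pair with a count column n.
counts <- df %>%
  unnest_tokens(word, text) %>%
  count(doc_id, word, name = "n")

# Cast the "Document, Term, Count" table into a DocumentTermMatrix
dtm <- counts %>%
  cast_dtm(document = doc_id, term = word, value = n)

# The DTM itself only stores counts, so keep the metadata in its own table
# and join it back by doc_id when you need it
meta <- df %>% select(doc_id, author)
```

    The document names in the DTM are the doc_id values, so meta can be matched against rownames(dtm) later on.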