I'm currently learning my way around R and I'm troubled by the following problem:
I've got a data frame that is built up like this:
word freq1 freq2
tree 10 20
this 2 3
that 4 5
...
It shows the frequency with which each word is used in text 1 (freq1) and text 2 (freq2). Is it possible to transform this into a term-document matrix? I need a term-document matrix in order to apply the following function:
par(mfrow=c(1,1))
comparison.cloud(tdm, random.order=FALSE, colors =
c("indianred3","lightsteelblue3"),
title.size=2.5, max.words=400)
from https://rpubs.com/brandonkopp/creating-word-clouds-in-r
Thanks :)
EDIT: After reshaping your data:
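library(reshape2)   # acast()
library(tidyr)      # gather()
library(tm)
library(dplyr)      # %>%
library(wordcloud)  # comparison.cloud()
# Go to long format (word, Origin, Freq), then cast to a word-by-text matrix.
df2<-df %>%
gather("Origin","Freq",c(2,3)) %>%
acast(word~Origin,fill=0,value.var = "Freq")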
comparison.cloud(df2, random.order=FALSE, colors = c("indianred3","lightsteelblue3"),
max.words=400)
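For a data frame that is already one row per word, the same matrix can also be built in base R without reshaping. This is only a minimal sketch, assuming the columns are named word, freq1 and freq2 as in the question; the data below is retyped from the question's sample rows, and the comparison.cloud() call is left commented out so the snippet runs without the wordcloud package.

```r
# Rebuild the example data from the question (three words, two texts).
df <- data.frame(
  word  = c("tree", "this", "that"),
  freq1 = c(10, 2, 4),
  freq2 = c(20, 3, 5),
  stringsAsFactors = FALSE
)

# Keep only the frequency columns and use the words as row names.
tdm <- as.matrix(df[, c("freq1", "freq2")])
rownames(tdm) <- df$word

# tdm is now a numeric matrix: rows are terms, columns are documents.
print(tdm)

# With the wordcloud package installed, the matrix can be passed on directly:
# wordcloud::comparison.cloud(tdm, random.order = FALSE,
#                             colors = c("indianred3", "lightsteelblue3"),
#                             max.words = 400)
```

comparison.cloud() only needs a numeric matrix with terms as row names and one column per document, so a tm TermDocumentMatrix object is not strictly required here.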
Original answer: There is something wrong with your data as it stands. Here is a basic workflow leading up to either a wordcloud or comparison cloud.
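library(tm)
library(tidyr)      # gather()
library(dplyr)      # %>%
library(wordcloud)
df<-read.table(text="word freq1 freq2
Tree 10 20
This 2 3
That 4 5",header=TRUE)
df$word<-as.character(df$word)
# With no arguments, gather() stacks every column into key/value pairs.
df1<-df %>%
gather()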
corpus_my<-Corpus(VectorSource(df1))
tdm<-as.matrix(TermDocumentMatrix(corpus_my))
comparison.cloud(tdm, random.order=FALSE, colors = c("indianred3","lightsteelblue3"),
max.words=400)
This gives a cloud that is not what you expect. I would suggest restructuring your data first, as shown in the EDIT above.