r, plot, word-cloud

How to plot a word cloud based on multiple columns?


How can I make a word cloud plot based on the values of two columns? I have a data frame as follows:

Name <- c("Jon", "Bill", "Maria", "Ben", "Tina", "Vikram", "Ramesh", "Luther")
Age <- c(23, 41, 32, 58, 26, 41, 32, 58)
Pval <- c(0.01, 0.06, 0.001, 0.002, 0.025, 0.05, 0.01, 0.0002)
df <- data.frame(Name, Age, Pval)

I want to make a word cloud of df$Name sized by the values in df$Age and df$Pval. I used the following code:

library("tm")
library("SnowballC")
library("wordcloud")
library("wordcloud2")
library("RColorBrewer")
set.seed(1234)
wordcloud(words = df$Name, freq = df$Age, min.freq = 1,
          max.words=10, random.order=FALSE, rot.per=0.35, 
          colors=brewer.pal(8, "Dark2"))


Here Luther and Ben are the same size because they have the same Age, but I need Luther to be slightly bigger than Ben because he has the lower Pval.


Solution

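  • A quick workaround: break ties in Age by adding a bonus of at most 1, based on the rank of Pval within each age group, and use the result as the frequency: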

    library("dplyr")
    library("scales")
    library("wordcloud")
    library("RColorBrewer")
    
    Name <- c("Jon", "Bill", "Maria", "Ben", "Tina", "Vikram", "Ramesh", "Luther")
    Age <- c(23, 41, 32, 58, 26, 41, 32, 58)
    Pval <- c(0.01, 0.06, 0.001, 0.002, 0.025, 0.05, 0.01, 0.0002)
    df <- data.frame(Name, Age, Pval)
    
    df <- df %>%
      group_by(Age) %>%
      # Rank p-values within each age group (1 = smallest p-value)
      mutate(pval_rank = rank(Pval)) %>%
      # Rescale the ranks to [0, 1] so the bonus never exceeds 1
      mutate(weight = scales::rescale(pval_rank / max(pval_rank), to = c(0, 1))) %>%
      # Invert the scaled rank so that a smaller p-value gives a larger bonus
      mutate(weight = Age + (1 - weight)) %>%
      ungroup()
    # Resulting bonus: 0.5 when nobody else shares the age (rescale() maps a
    # constant input to the midpoint of `to`); within a shared age, 1 for the
    # smallest p-value down to 0 for the largest. This also works when more
    # than two people share an age.
    
    set.seed(1234)
    wordcloud(words = df$Name, freq = df$weight, min.freq = 1,
          max.words=10, random.order=FALSE, rot.per=0.35, 
          colors=brewer.pal(8, "Dark2"))
    

    Fun and interesting question btw
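
The same tie-breaking idea can also be sketched in base R, without dplyr or scales. This is only an illustration under a few assumptions: `tie_bonus` is a hypothetical helper name (not part of any package), and the bonus formula differs slightly from the one above so that it stays strictly between 0 and 1. That way a person with the best p-value at age 57 cannot end up with the same weight as a person aged 58.

```r
# Base-R sketch of the tie-breaking idea; tie_bonus() is a hypothetical helper.
Name <- c("Jon", "Bill", "Maria", "Ben", "Tina", "Vikram", "Ramesh", "Luther")
Age  <- c(23, 41, 32, 58, 26, 41, 32, 58)
Pval <- c(0.01, 0.06, 0.001, 0.002, 0.025, 0.05, 0.01, 0.0002)
df <- data.frame(Name, Age, Pval)

# Bonus strictly between 0 and 1: larger for smaller p-values within a group.
tie_bonus <- function(p) {
  r <- rank(p)                            # 1 = smallest p-value in the group
  (length(p) - r + 1) / (length(p) + 1)   # e.g. 2/3 and 1/3 for a group of two
}

# ave() applies tie_bonus() separately within each Age group.
df$weight <- df$Age + ave(df$Pval, df$Age, FUN = tie_bonus)

# Inspect the weights, largest first.
print(df[order(-df$weight), c("Name", "Age", "Pval", "weight")])

# Draw the cloud only if the plotting packages are installed.
if (requireNamespace("wordcloud", quietly = TRUE) &&
    requireNamespace("RColorBrewer", quietly = TRUE)) {
  set.seed(1234)
  wordcloud::wordcloud(words = df$Name, freq = df$weight, min.freq = 1,
                       max.words = 10, random.order = FALSE, rot.per = 0.35,
                       colors = RColorBrewer::brewer.pal(8, "Dark2"))
}
```

With this data, Luther gets a weight of about 58.67 and Ben about 58.33, so Luther is drawn slightly larger. The difference is small relative to the ages themselves, so the visual effect may be subtle.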