
Zipf_plot(): How to compare two objects in one graph?


I'm trying to use the Zipf_plot function from the tm package to compare two different document-term matrices, and I'm not an R expert. Is there a way to fit both into this function?

Zipf_plot(x, type = "l", ... )

I know it's possible to get both (or more) of them side by side in one window:

par(mfrow = c(1, 2))

but I'd really appreciate a solution with two or more dtms in one graph.

Thanks in advance! :-)


Solution

  • You could overlay the plots with par(new = TRUE), or adapt the function to your needs, e.g.:

    library(tm)
    data("acq")
    data("crude")
    m1 <- DocumentTermMatrix(acq)
    m2 <- DocumentTermMatrix(crude)
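    # Overlay two plots. Both calls need the same axis limits, otherwise
    # each curve is drawn on its own scale and the axes overprint.
    Zipf_plot(m1, col = "red", xlim = c(0, 7), ylim = c(0, 6))
    par(new = TRUE)
    Zipf_plot(m2, col = "blue", xlim = c(0, 7), ylim = c(0, 6), axes = FALSE, ann = FALSE)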
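
    # Alternative: a variant of tm's Zipf_plot() that takes a list of
    # matrices and draws them on shared axes, one colour per matrix.
    Zipf_plot_multi <- function(xx, type = "l", cols = rainbow(length(xx)), ...) {
        stopifnot(is.list(xx), length(xx) == length(cols))
        # Default axis labels, unless the caller supplied their own.
        dots <- list(...)
        if (is.null(dots$xlab))
            dots$xlab <- "log(rank)"
        if (is.null(dots$ylab))
            dots$ylab <- "log(frequency)"
        for (idx in seq_along(xx)) {
            x <- xx[[idx]]
            # Terms must be in columns, so transpose a term-document matrix.
            if (inherits(x, "TermDocumentMatrix"))
                x <- t(x)
            y <- log(sort(slam::col_sums(x), decreasing = TRUE))
            x <- log(seq_along(y))
            m <- lm(y ~ x)
            if (idx == 1) {
                # The first matrix sets up the plot; pass xlim/ylim via ...
                do.call(plot, c(list(x, y, type = type, col = cols[idx]), dots))
            } else {
                lines(x, y, type = type, col = cols[idx])
            }
            # Dotted line: least-squares fit of log(frequency) on log(rank).
            abline(m, col = cols[idx], lty = "dotted")
            print(coef(m))
        }
    }
    Zipf_plot_multi(list(m1, m2), xlim = c(0, 7), ylim = c(0, 6))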
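
The axis limits above are hard-coded for these two corpora. If you'd rather have them computed from the data, something along these lines might work. It is only a sketch in base R: zipf_coords and zipf_plot_shared are hypothetical helpers (not part of tm), and the frequencies are made up for illustration. With tm you would presumably pass in slam::col_sums(m1) and slam::col_sums(m2) instead.

```r
# Hypothetical helper (not part of tm): log-log Zipf coordinates for one
# vector of term frequencies. Zero counts are dropped before taking logs.
zipf_coords <- function(freq) {
  freq <- sort(freq[freq > 0], decreasing = TRUE)
  data.frame(log_rank = log(seq_along(freq)),
             log_freq = log(as.numeric(freq)))
}

# Hypothetical helper: draw every curve on axes wide enough for all of them.
zipf_plot_shared <- function(freqs, cols = rainbow(length(freqs))) {
  stopifnot(is.list(freqs), length(freqs) == length(cols))
  coords <- lapply(freqs, zipf_coords)
  # Shared limits: the overall range across all curves.
  xlim <- range(unlist(lapply(coords, `[[`, "log_rank")))
  ylim <- range(unlist(lapply(coords, `[[`, "log_freq")))
  # Empty plot first, then one line per frequency vector.
  plot(NULL, type = "n", xlim = xlim, ylim = ylim,
       xlab = "log(rank)", ylab = "log(frequency)")
  for (i in seq_along(coords)) {
    lines(coords[[i]]$log_rank, coords[[i]]$log_freq, col = cols[i])
  }
  if (!is.null(names(freqs)))
    legend("topright", legend = names(freqs), col = cols, lty = 1)
  invisible(list(xlim = xlim, ylim = ylim))
}

# Made-up frequencies, for illustration only.
freqs <- list(a = c(the = 50, of = 20, oil = 8, price = 3, opec = 1),
              b = c(the = 30, and = 25, said = 10, shares = 2))
lims <- zipf_plot_shared(freqs, cols = c("red", "blue"))
```

The helper returns the limits it used invisibly, so you can reuse them in a later Zipf_plot() call if you want matching axes.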
    
