I have conducted a wordfish analysis with quanteda.
Through the function textplot_scale1d() I was able to plot the estimated thetas.
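library(quanteda)
library(quanteda.textmodels)  # textmodel_wordfish()
library(quanteda.textplots)   # textplot_scale1d()
library(magrittr)             # %>%
# run wordfish
tmod_wf <- textmodel_wordfish(example_DFM, dir = c(2, 1))
# plot the estimated thetas (scaled/ordered):
# y axis -> the documents
# x axis -> the estimated thetas
tmod_wf %>%
  textplot_scale1d(margin = "documents")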
Now I would like to plot a graph in which the estimated thetas appear in the y axis and the documents appear in the x axis (not ordered based on their estimated thetas).
plot(tmod_wf$theta, xlab = "Documents", ylab = "Estimated_thetas")
The above line creates the following scatterplot:
[Figure: scatterplot of thetas against documents]
The thetas are on the y axis while the documents are on the x axis (ordered as they appear in the corpus). The scatterplot suits my needs, but it is rather bare. I would like to embellish it: highlight some documents, change the size of the dots, etc.
Is it possible to convert such a plot to ggplot?
I tried the following:
tmod_wf$theta %>%
ggplot() +
geom_point(aes(x = docs, y = theta))
However, the following error appears:
Error in `fortify()`:
! `data` must be a <data.frame>, or an object coercible by `fortify()`, not a double vector.
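ggplot() needs a data frame, but tmod_wf$theta is a plain numeric vector, which is why fortify() fails.
Put the thetas and the document names into a data frame first: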
library(quanteda.textmodels)  # textmodel_wordfish()
library(tibble)
tmod1 <- textmodel_wordfish(quanteda::data_dfm_lbgexample, dir = c(1, 5))
# extract theta and docs into a data frame
df <- tibble(theta = tmod1[["theta"]],
             docs = tmod1[["docs"]])
Then plot with ggplot:
library(ggplot2)
# horizontal bars: theta on the x axis, one bar per document
ggplot(df, aes(x = theta, y = docs, fill = -theta)) +
  geom_col() +
  scale_fill_distiller(palette = "RdYlGn") +
  theme_minimal() +
  theme(legend.position = "none")
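The bar chart above puts theta on the x axis. If you want what the question describes (documents on the x axis in corpus order, thetas on the y axis, some points emphasised), something along these lines might work. This is only a sketch: `highlight_docs`, the colours and the point sizes are my own illustrative choices, and it assumes the document names are unique so they can serve as factor levels.

```r
library(quanteda)             # data_dfm_lbgexample
library(quanteda.textmodels)  # textmodel_wordfish()
library(ggplot2)

tmod <- textmodel_wordfish(data_dfm_lbgexample, dir = c(1, 5))

# Hypothetical choice: the documents to emphasise
highlight_docs <- c("R1", "V1")

theta_df <- data.frame(
  # Factor levels in corpus order, so ggplot does not sort the axis alphabetically
  docs  = factor(tmod$docs, levels = tmod$docs),
  theta = tmod$theta
)
theta_df$highlighted <- theta_df$docs %in% highlight_docs

p <- ggplot(theta_df, aes(x = docs, y = theta)) +
  # Colour and size both depend on whether the document is highlighted
  geom_point(aes(colour = highlighted, size = highlighted)) +
  scale_colour_manual(values = c(`FALSE` = "grey50", `TRUE` = "firebrick")) +
  scale_size_manual(values = c(`FALSE` = 2, `TRUE` = 4)) +
  labs(x = "Documents", y = "Estimated theta") +
  theme_minimal() +
  theme(legend.position = "none")
print(p)
```

Turning `docs` into a factor with explicit levels is what keeps the x axis in corpus order. With many documents the axis labels will probably overlap, so you may want to rotate or hide them via `theme(axis.text.x = ...)`.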