Tags: google-colaboratory, facet, vega, word-cloud, trellis

Vega Wordcloud Faceting


I really like the look of the Vega word clouds: https://vega.github.io/vega/examples/word-cloud/

I'm currently using the spec from that link in Colab as follows:

spec = "insert spec here"

# Option one:
from altair import vega
vega.renderers.enable('colab')
vega.Vega(spec)

# Option two:
import panel as pn
from vega import Vega
pn.extension('vega')
pn.pane.Vega(spec)

What I actually want is faceted word clouds in Vega. I currently load my data as JSON from my GitHub account, which is slightly annoying, but I found no way to reference Python variables in the Vega spec. Does anyone have a hint on how to lay out the Vega word clouds in a grid, one per group in my data? My JSON has this structure: [{"text":text,"group":group}]. Drawing word clouds from it works, but faceting by the group field does not. I know Vega-Lite can do faceting, but it doesn't seem able to draw these beautiful word clouds.

Thanks for any help!


Solution

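  • Here is a working example of a Vega spec that facets your data by the group field.

    For illustration only, the formula transform sets the angle field so that words with a size of 3 or more are placed horizontally, while smaller words get a random angle between -45 and 45 degrees.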

    {
      "$schema": "https://vega.github.io/schema/vega/v5.json",
      "description": "Faceted word clouds, one per value of the group field.",
      "title": "A Wordcloud",
      "width": 400,
      "height": 400,
      "padding": 10,
      "background": "ghostwhite",
    
      "layout": {
        "bounds": "flush",
        "columns": 2,
        "padding": 10
      },
    
      "data": [
        {
          "name": "table",
          "url": "https://raw.githubusercontent.com/nyanxo/vega_facet_wordcloud/main/split.json",
      
          "transform": [
            {
              "type": "formula",
              "as": "angle",
              "expr": "datum.size >= 3 ? 0 : [-45,-30, -15, 0, 15, 30, 45][floor(random() * 7)]"
            }
          ]
        }
      ],
      "scales": [
        {
          "name": "color",
          "type": "ordinal",
          "domain": {"data": "table", "field": "text_split"},
          "range": ["#d5a928", "#652c90", "#939597"]
        }
      ],
    
      "marks": [
        {
          "type": "group",
          "from": {
            "facet": {
              "name": "facet",
              "data": "table",
              "groupby": "group"
            }
          },
    
          "title": {
            "text": {"signal": "parent.group"},
            "frame": "group"
          },
    
          "encode": {
            "update": {
              "width": {"signal": "width"},
              "height": {"signal": "height"}
            }
          },
    
          "marks": [
            {
              "type": "rect",
              "encode": {
                "enter": {
                  "x": {"value": 0},
                  "width": {"signal": "width"},
                  "y": {"value": 0},
                  "height": {"signal": "height"},
                  "fill": {"value": "beige"}
                }
              }
            },
    
            {
              "type": "text",
              "from": {"data": "facet"},
              "encode": {
                "enter": {
                  "text": {"field": "text_split"},
                  "align": {"value": "center"},
                  "baseline": {"value": "alphabetic"},
                  "fill": {"scale": "color", "field": "text_split"}
                },
                "update": {"fillOpacity": {"value": 1}},
                "hover": {"fillOpacity": {"value": 0.5}}
              },
              "transform": [
                {
                  "type": "wordcloud",
                  "size": {"signal": "[width, height]"},
                  "text": {"field": "text_split"},
                  "rotate": {"field": "datum.angle"},
                  "font": "Helvetica Neue, Arial",
                  "fontSize": {"field": "datum.size"},
                  "fontSizeRange": [12, 28],
                  "padding": 2
                }
              ]
            }
          ]
        }
      ]
    }
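
On referencing Python variables: a Vega dataset can take its rows inline through "values" instead of loading them from "url", so one option is to keep the spec as a Python dict and fill in the rows before rendering. Below is a minimal sketch, assuming your rows are a list of dicts with the text_split, size and group fields that the spec above reads. The helper name with_inline_values and the sample rows are made up for illustration, and the spec is trimmed to its "data" block with the formula transform left out for brevity.

```python
import copy
import json


def with_inline_values(spec, records, data_name="table"):
    """Return a copy of `spec` whose dataset `data_name` holds `records` inline."""
    new_spec = copy.deepcopy(spec)  # leave the caller's spec untouched
    for dataset in new_spec.get("data", []):
        if dataset.get("name") == data_name:
            dataset.pop("url", None)  # drop the remote source, if any
            dataset["values"] = list(records)  # Vega reads inline rows from "values"
            return new_spec
    raise KeyError(f"no dataset named {data_name!r} in the spec")


# Trimmed stand-in for the full spec above; only the "data" block matters here.
spec = {
    "$schema": "https://vega.github.io/schema/vega/v5.json",
    "data": [
        {
            "name": "table",
            "url": "https://raw.githubusercontent.com/nyanxo/vega_facet_wordcloud/main/split.json",
            "transform": [],  # the formula transform from the full spec would go here
        }
    ],
}

# Hypothetical rows with the three fields the spec reads.
records = [
    {"text_split": "vega", "size": 4, "group": "A"},
    {"text_split": "facet", "size": 2, "group": "A"},
    {"text_split": "cloud", "size": 3, "group": "B"},
]

inline_spec = with_inline_values(spec, records)
print(json.dumps(inline_spec["data"][0], indent=2))
```

The resulting dict (or json.dumps(inline_spec), if the renderer expects a string) can then be handed to whichever renderer already works for you, for example pn.pane.Vega(inline_spec). Any transforms on the dataset are kept, so the angle formula still applies to the inline rows.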