python sql database-schema sqlglot

Node-level documentation for sqlglot


Using the Python library sqlglot, where can I find documentation that explains:

  1. Which attributes I should expect to find on which expression node types (which arg types do Join, Table, Select, etc. have?)
  2. What overall structure I should expect the AST to have for various kinds of SQL statements? (e.g. that a Select has a "joins" child, which in turn has a list of tables) And what "arg" name do I use to access each of these?

For example, what documentation could I look at to know that code like the snippet below (from here) will find the names of the tables within the joins? How would I know to request "joins" from node.args? What does "this" refer to?

    import sqlglot
    from sqlglot import exp

    node = sqlglot.parse_one(sql)
    for join in node.args["joins"]:
        table = join.find(exp.Table).text("this")

My use case is to parse a bunch of Hive SQL scripts in order to find FROM, INSERT, ADD/DROP TABLE statements/clauses within the scripts, for analyzing which statements interact with which tables. So I am using sqlglot as a general-purpose SQL parser/AST, rather than as a SQL translator.

I have generated a copy of the pdoc documentation locally, but it only tells me which Python API methods are available on the Expression nodes. It does not seem to answer the questions above, unless I am looking in the wrong place.


Solution

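  • Look in the expressions.py file:

    https://github.com/tobymao/sqlglot/blob/main/sqlglot/expressions.py

    Every expression class there defines an arg_types dictionary. Its keys are the arg names you can look up in node.args (for example "this" or "joins"), and each value says whether that arg is required.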