postgresqlindexingpg-catalog

Extract Data Definitions for Indexes from PostgreSQL using `pg_index` table


I'm trying to extract data definitions from a PostgreSQL database for indexes into the application.

The structure in my application, currently, looks something like this:

type IndexColExpr = {
  col_or_expr: string;
};
type IncludeColumn = {
  name: string;
};
type Index = {
  id: number;
  table_id: number;
  table_name: string;
  name: string;
  unique: boolean;
  method: string;
  columns: IndexColExpr[];
  include_columns: IncludeColumn[];
  where: string;
};

I'm using the pg_class and pg_index to get the information that I need.

So far everything is going well until I get to the part where the index is on expressions (not a simple column name). Those expressions are stored in pg_index.indexprs which is of the type pg_node_tree.

The problem is when there are multiple expressions in the index; they will all be encoded inside the indexprs column as a single pg_node_tree value and I need them separated. That means I have to parse pg_node_tree to get individual expressions. Or I can convert pg_node_tree to SQL string using pg_get_expr(), but I still have to parse the SQL DDL script. Since they are in a list format (expr1, expr2) I can't simply cast to text[]. Both pg_node_tree and the DDL script can contain arbitrary expressions, so I can't use a simple split-by-comma to get them.

Is there something in PostgreSQL that will allow me to extract individual expressions without parsing pg_node_tree or the DDL script?

If there isn't, then maybe I shouldn't bother with pg_index and just parse the CREATE INDEX query provided by pg_get_indexdef()?


Solution

  • There are two overloaded variants of pg_get_indexdef():

    1. Passing only the index OID, you get a complete reverse-engineered CREATE INDEX command.

    2. If you also pass the smallint field number of the index expression ( = pg_attribute.attnum) and a boolean "pretty" flag, you get the definition of the given index column. As the manual puts it:

      If column is supplied and is not zero, only the definition of that column is reconstructed.

    To get all index columns of a given index separately and in order (plain column names and expressions alike):

    SELECT pg_get_indexdef(a.attrelid, a.attnum, true)
    FROM   pg_attribute a
    WHERE  a.attrelid = 'my_index'::regclass  -- your index name here
    ORDER  BY a.attnum;
    

    I am going to pg_attribute directly, no need to involve pg_index. If you need the connection:

    pg_index.indexrelid = pg_attribute.attrelid
    

    If there can be ambiguity, schema-qualify the index name:

    'my_schema.my_index'::regclass
    

    This does not include any decorators per index column like DESC NULLS LAST, operator class, collation, ... Those are encoded separately in indoption, indclass, indcollation of the table pg_index.