postgresqlsecondary-indexes

What is the definition of secondary index in postgresql?


From https://www.postgresql.org/docs/9.6/static/indexes-index-only-scans.html:

All indexes in PostgreSQL are secondary indexes, meaning that each index is stored separately from the table's main data area (which is called the table's heap in PostgreSQL terminology).

In postgresql, is a secondary index defined as an index which is stored separately from the table's main data area?

If not, what is its definition, and why does the quote mention the one which is not a definition?

Similarly, what is the definition of a primary index?

Are the concepts in postgresql the same as in Oracle database?

Thanks.


Solution

  • There is some lack of precision in the definitions of primary & secondary indices.

    Using two popular university texts as reference:

    Fundamentals of Database Systems, Elmasri & Navathe defines them as:

    A primary index as an index on an ordered file where the search key is the same as the sort key

    A secondary index provides a secondary means of accessing a data file for which some primary access already exists. The data file records could be ordered, unordered, or hashed.

    Database Systems: The Complete Book, Garcia-Molina et. al defines them as:

    A primary index determines the location of the records of the data file

    The secondary index is distinguished from the primary index in that a secondary index does not determine the placement of records in the data file. Rather, the secondary index tells us the current locations of records; that location may have been decided by a primary index on some other field

    Some properties that hold true for either definition above:

    However, if the placement of records in the data file is not determined by any field, then a primary index cannot be constructed.

    Thus for sorted files, it makes sense to talk about the primary index (which would be the list of fields upon which the sorting is based). I can't find other examples of physical file structures where a primary index can be constructed.

    Postgresql utilizes a heap structure for the physical storage for records. Heaps are not sorted (pun alert: they're sorta sorted). Therefore, even the primary keys are implemented using secondary indices, and as such all indices in Postgresql are secondary.

    Other RDBMS Systems do implement storage formats that support primary indices:


    The language in the Postgres Documentation is imprecise.

    All indexes in PostgreSQL are secondary indexes

    This is true.

    meaning that each index is stored separately from the table's main data area

    This is not why all indices are secondary in Postgresql. Primary indices may also be stored separately from the table's main data area.