postgresqlquery-planner

PostgreSQL n_distinct statistics setting


Are there multiple ways to set n_distinct in PostgreSQL? Both of these seem to be doing the same thing but end up changing a different value within pg_attribute. What is the difference between these two commands?

alter table my_table alter column my_column set (n_distinct = 500);
alter table my_table alter column my_column set statistics 1000;
select
    c.relname,
    a.attname,
    a.attoptions,
    a.attstattarget
from
    pg_class c
    inner join
    pg_attribute a
        on c.oid = a.attrelid
where
    c.relname = 'my_table'
and
    a.attname = 'my_column'
order by
    c.relname,
    a.attname;
Name         |Value           
-------------|----------------
relname      |my_table       
attname      |my_column         
attoptions   |{n_distinct=500}
attstattarget|1000            

Solution

  • Both of these seem to be doing the same thing

    Why would you say that? Both commands are obviously distinct. Both are related to column statistics and query planning. But they do very different things.

    The statistics target ...

    controls the level of detail of statistics accumulated for this column by ANALYZE. See:

    Basics in the manual.

    Setting n_distinct is something completely different. It means hard-coding the number (or ratio) of distinct values to expect for the given column. (But only effective after the next ANALYZE.)

    Related answer on dba.SE with more on n_distinct: