databasepostgresqlperformancequery-optimizationdatabase-indexes

Optimal PostgreSQL single/multicolumn indexes?


I'm trying to determine the best indexes for a table in PostgreSQL. I expect on the order of ~10b rows and ~10TB data.

The table has 5 main columns used for filtering and/or sorting

CREATE TABLE table (
  filter_key_1 AS BYTEA,    -- filtering
  filter_key_2 AS BYTEA,    -- filtering
  filter_key_3 AS BYTEA,    -- filtering
  sort_key_1   AS INTEGER,  -- filtering & sorting
  sort_key_2   AS INTEGER   -- filtering & sorting
)

Queries will be:

SELECT * FROM table WHERE filter_key_1 = $1 ORDER BY sort_key_1, sort_key_2 LIMIT 15;
SELECT * FROM table WHERE filter_key_2 = $1 ORDER BY sort_key_1, sort_key_2 LIMIT 15;
SELECT * FROM table WHERE filter_key_3 = $1 ORDER BY sort_key_1, sort_key_2 LIMIT 15;

SELECT * FROM table WHERE filter_key_1 = $1 AND sort_key_1 <= $2 AND sort_key_2 <= $3 ORDER BY sort_key_1, sort_key_2 LIMIT 15;
SELECT * FROM table WHERE filter_key_2 = $1 AND sort_key_1 <= $2 AND sort_key_2 <= $3 ORDER BY sort_key_1, sort_key_2 LIMIT 15;
SELECT * FROM table WHERE filter_key_3 = $1 AND sort_key_1 <= $2 AND sort_key_2 <= $3 ORDER BY sort_key_1, sort_key_2 LIMIT 15;

What are the ideal indexes for the table? How large will they get with ~10b rows? How much will they limit write throughput?

Edit

What if I want to add additional queries such as below. Would the indexes from above hold-up?

SELECT * FROM table WHERE filter_key_1 = $1 AND filter_key_2 = $2 ORDER BY sort_key_1, sort_key_2 LIMIT 15;
SELECT * FROM table WHERE filter_key_1 = $1 AND filter_key_2 = $2 AND filter_key_3 = $3 ORDER BY sort_key_1, sort_key_2 LIMIT 15;
-- ...

IO requirements

The workload is heavy read, low write.

Read speed is important. Write speed is less important (can live with up-to 3 seconds per insert)


Solution

  • Since you need to run these queries all the time, you will have to optimize them as much as possible. That would mean

    CREATE INDEX ON tab (filter_key_1, sort_key_1, sort_key_2);
    CREATE INDEX ON tab (filter_key_2, sort_key_1, sort_key_2);
    CREATE INDEX ON tab (filter_key_3, sort_key_1, sort_key_2);
    

    Together, these indexes should be substantially larger than your table.