I research declarative partitioning and prepared such code for it:
create table _my_message
(
id uuid not null,
chat_id uuid constraint _my_message_chat_fkey references _chat,
...
CONSTRAINT _my_message_id_chat_id_pk PRIMARY KEY (id, chat_id)
) PARTITION BY HASH (chat_id);
CREATE TABLE _my_message_0 PARTITION OF _my_message FOR VALUES WITH (MODULUS 10, REMAINDER 0);
CREATE TABLE _my_message_1 PARTITION OF _my_message FOR VALUES WITH (MODULUS 10, REMAINDER 1);
CREATE TABLE _my_message_2 PARTITION OF _my_message FOR VALUES WITH (MODULUS 10, REMAINDER 2);
CREATE TABLE _my_message_3 PARTITION OF _my_message FOR VALUES WITH (MODULUS 10, REMAINDER 3);
CREATE TABLE _my_message_4 PARTITION OF _my_message FOR VALUES WITH (MODULUS 10, REMAINDER 4);
CREATE TABLE _my_message_5 PARTITION OF _my_message FOR VALUES WITH (MODULUS 10, REMAINDER 5);
CREATE TABLE _my_message_6 PARTITION OF _my_message FOR VALUES WITH (MODULUS 10, REMAINDER 6);
CREATE TABLE _my_message_7 PARTITION OF _my_message FOR VALUES WITH (MODULUS 10, REMAINDER 7);
CREATE TABLE _my_message_8 PARTITION OF _my_message FOR VALUES WITH (MODULUS 10, REMAINDER 8);
CREATE TABLE _my_message_9 PARTITION OF _my_message FOR VALUES WITH (MODULUS 10, REMAINDER 9);
CREATE INDEX ON _my_message (chat_id);
It is not possible to turn a regular table into a partitioned table or vice versa.
INSERT INTO _my_message as _d
SELECT c.*
FROM _message c
ORDER BY created_at desc;
After executing the queries I checked my partition-tables (_my_message_0
, _my_message_1
,..., _my_message_9
) and now all of them have data considering hash of chat_id
as it expected. But I noticed that the table _my_message
has data too (sum of all data from partition-tables).
But if I execute explain analyse select * from _my_message
I got such query plan:
Append (cost=0.00..8494.87 rows=151658 width=326) (actual time=0.014..38.301 rows=151658 loops=1)
-> Seq Scan on _my_message_0 _my_message_1 (cost=0.00..176.85 rows=2485 width=484) (actual time=0.014..1.258 rows=2485 loops=1)
-> Seq Scan on _my_message_1 _my_message_2 (cost=0.00..182.53 rows=2653 width=468) (actual time=0.004..0.819 rows=2653 loops=1)
-> Seq Scan on _my_message_2 _my_message_3 (cost=0.00..337.35 rows=7335 width=291) (actual time=0.007..1.680 rows=7335 loops=1)
-> Seq Scan on _my_message_3 _my_message_4 (cost=0.00..197.85 rows=3085 width=439) (actual time=0.008..0.655 rows=3085 loops=1)
-> Seq Scan on _my_message_4 _my_message_5 (cost=0.00..185.56 rows=2756 width=462) (actual time=0.007..0.636 rows=2756 loops=1)
-> Seq Scan on _my_message_5 _my_message_6 (cost=0.00..4830.22 rows=98622 width=309) (actual time=0.003..17.484 rows=98622 loops=1)
-> Seq Scan on _my_message_6 _my_message_7 (cost=0.00..1139.36 rows=22536 width=322) (actual time=0.008..3.818 rows=22536 loops=1)
-> Seq Scan on _my_message_7 _my_message_8 (cost=0.00..176.00 rows=2500 width=481) (actual time=0.011..0.669 rows=2500 loops=1)
-> Seq Scan on _my_message_8 _my_message_9 (cost=0.00..321.55 rows=6855 width=299) (actual time=0.003..1.465 rows=6855 loops=1)
-> Seq Scan on _my_message_9 _my_message_10 (cost=0.00..189.31 rows=2831 width=460) (actual time=0.004..0.715 rows=2831 loops=1)
Planning Time: 3.195 ms
Execution Time: 43.303 ms
What does it mean? Is it just data from all partition-tables (_my_message_0
, _my_message_1
,..., _my_message_9
)? Does the partitioned table _my_message
have data itself?
And one more question: are there some options for partitioning without huge migration from source table (message
)? I am afraid that during the migration process a new message may appear in the source table message
, which will be lost and will not be migrated to the partitioned table. I am also afraid that with such a migration I will not have a chance to roll back the release in case there are any critical problems.
The partitioned table itself is always empty. You can see it as a kind of view on the UNION ALL
of all partitions.
Changing from a non-partitioned table to a partitioned one means shoveling a lot of data around. That usually means an extended down time. The only feasible way to avoid that is to use logical replication, where the subscriber has a partitioned table.