postgresql, postgresql-11

Bulk Inserts/Updates Are Slow Using On Conflict with JSONB


I'm doing a bulk insert from a giant CSV file that needs to become both relational columns and a JSONB object in the same table. The problem: each row needs to be either an insert or an update, and if it's an update, the JSON object has to be appended to the existing JSONB column on that row. My current setup issues individual INSERT/UPDATE calls and, of course, it's horribly slow.

Example Import Command I'm Running:

INSERT INTO "trade" ("id_asset", "trade_data", "year", "month")
VALUES ('1925ad09-51e9-4de4-a506-9bccb7361297',
        '{"28":{"open":2.89,"high":2.89,"low":2.89,"close":2.89}}',
        2017, 3)
ON CONFLICT ("year", "month", "id_asset") DO UPDATE
SET "trade_data" = "trade"."trade_data" || '{"28":{"open":2.89,"high":2.89,"low":2.89,"close":2.89}}'
WHERE "trade"."id_asset" = '1925ad09-51e9-4de4-a506-9bccb7361297'
  AND "trade"."year" = 2017
  AND "trade"."month" = 3;
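
For context, the conflict branch can also reference EXCLUDED instead of repeating the JSON literal, and the trailing WHERE is redundant because the conflict target already identifies the row. A minimal sketch of the equivalent statement (assuming "trade_data" is jsonb):

-- Sketch only: same upsert as above, with the conflict branch reusing the
-- incoming row's value via EXCLUDED instead of repeating the literal.
INSERT INTO "trade" ("id_asset", "trade_data", "year", "month")
VALUES ('1925ad09-51e9-4de4-a506-9bccb7361297',
        '{"28":{"open":2.89,"high":2.89,"low":2.89,"close":2.89}}',
        2017, 3)
ON CONFLICT ("year", "month", "id_asset") DO UPDATE
SET "trade_data" = "trade"."trade_data" || EXCLUDED."trade_data";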

I've tried wrapping my script in a BEGIN and COMMIT, but it didn't improve performance at all. I also tried a few session settings, which didn't seem to help either:

\set autocommit off;
set schema 'market';
\set fsync off;
\set full_page_writes off;
SET synchronous_commit TO off;
\i prices.sql
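
For reference, the BEGIN/COMMIT wrapping mentioned above amounted to something like this (a minimal sketch; the statements inside are the same per-row upserts as the example):

-- Sketch only: the per-row upserts from prices.sql grouped into one transaction
-- so they share a single commit instead of committing row by row.
BEGIN;

INSERT INTO "trade" ("id_asset", "trade_data", "year", "month")
VALUES ('1925ad09-51e9-4de4-a506-9bccb7361297',
        '{"28":{"open":2.89,"high":2.89,"low":2.89,"close":2.89}}',
        2017, 3)
ON CONFLICT ("year", "month", "id_asset") DO UPDATE
SET "trade_data" = "trade"."trade_data" || '{"28":{"open":2.89,"high":2.89,"low":2.89,"close":2.89}}';

-- ... many more statements of the same shape ...

COMMIT;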

This whole thing is extremely slow, and I'm not sure how to rewrite the query without my program loading a ton of data into RAM just to emit one large INSERT/UPDATE command for Postgres to read. The related data could be a million rows, or could live in another file altogether, and the JSON has to be built without losing the JSON data that's already in the database.


Solution

  • I simply scp'd my large SQL file onto the Postgres server and re-ran the commands locally inside psql, and the migration now runs dramatically faster.
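
A minimal sketch of those steps ("db-host", "/tmp", and "mydb" below are placeholders, not the actual host, path, or database):

# Sketch only: copy the generated SQL file onto the database host, then run it
# there with psql so each statement avoids a client-server network round trip.
scp prices.sql postgres@db-host:/tmp/prices.sql
psql -U postgres -d mydb -f /tmp/prices.sql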