postgresqljqplpgsqlpostgresql-copy

How do I pass JSON to a Postgres COPY tmp FROM PROGRAM command


I'm trying to use Postgres (PL/pgSQL) to communicate with a shell script that expects JSON data. A simplified example of what I'm trying to do is

CREATE OR REPLACE FUNCTION json_func(IN json_in JSONB, OUT json_out JSONB)
LANGUAGE plpgsql
AS $code_block$
BEGIN
CREATE TEMPORARY TABLE tmp (json_tmp JSONB);
EXECUTE format($bash$ COPY tmp FROM PROGRAM 'echo ''%s'' | jq . ' $bash$, json_in);
SELECT json_tmp FROM tmp INTO json_out;
DROP TABLE tmp;
END;
$code_block$;

When I run

SELECT json_func('{"key1": "val1", "key2": "val2"}'::JSONB);

I get

ERROR:  invalid input syntax for type json
DETAIL:  The input string ended unexpectedly.
CONTEXT:  JSON data, line 1: {
COPY tmp, line 1, column json_tmp: "{"
SQL statement " COPY tmp FROM PROGRAM 'echo ''{"key1": "val1", "key2": "val2"}'' | jq . ' "
PL/pgSQL function json_func(jsonb) line 4 at EXECUTE

I've tried various ways of escaping the single quotes needed to surround the JSON input, but no luck.


Solution

  • You're getting the error because COPY expects one value per line, while jq pretty-prints by default, splitting the single json structure into multiple lines. As already pointed out by @Francisco Puga, you can disable that:

    --compact-output / -c:

    By default, jq pretty-prints JSON output. Using this option will result in more compact output by instead putting each JSON object on a single line.

    Otherwise, this

    '{"key1": "val1", "key2": "val2"}'
    

    Gets turned by jq into this

    {
        "key1": "val1",
        "key2": "val2"
    }
    

    And due to the line-separation thing, copy thinks you're trying to do something this:

    insert into tmp(json_tmp)values 
     ('{')
    ,('"key1": "val1",')
    ,('"key2": "val2"')
    ,('}');
    

    Hence, the error complaining about the single opening curly bracket. It seems to work fine if you enable compaction:

    CREATE OR REPLACE FUNCTION json_func(IN json_in JSONB, OUT json_out JSONB)
    LANGUAGE plpgsql
    AS $code_block$
    BEGIN
    CREATE TEMPORARY TABLE tmp (json_tmp JSONB);
    EXECUTE format($dynsql$ COPY tmp FROM PROGRAM $bash$echo %1$L |jq -c . $bash$ 
                   $dynsql$, json_in);
    SELECT json_tmp FROM tmp INTO json_out;
    DROP TABLE tmp;
    END;
    $code_block$;
    

    If that's an attempt to pretty-print, Postgres offers a built-in jsonb_pretty():
    demo at db<>fiddle

    select jsonb_pretty('{"abc":123}');
    
    jsonb_pretty
    {
        "abc": 123
    }

    If it's for validation, In version 16+ you can check that with 'abc' IS JSON or pg_input_is_valid('abc','jsonb'):

    select 'abc' is json;                          --false
    select pg_input_is_valid('abc','json');        --false
    select '{"abc":123}' is json;                  --true
    select pg_input_is_valid('{"abc":123}','json');--true