jsonpostgresqlpostgresql-16

Stripping out newline characters from input JSON string when using COPY Command in Postgres


I have a set of code that is importing a JSON file provided into a temp table as follows:

    DROP TABLE IF EXISTS tmp;
    CREATE TEMP  table tmp (c JSONB );
    
    sqltxt:='COPY tmp from '''||filepath||''' with (FORMAT TEXT, DELIMITER ''~'')';

Now in order to process this file sucessfully , I have to manually strip out the newline characters in the incoming file.

I would like to do this in my postgres code by using a function know as regexp_replace

I am struggling to do this

This is my attempt at this :

    DROP TABLE IF EXISTS tmp;
    CREATE TEMP  table tmp (c JSONB );

    -- Populate temp table with incoming JSON
    sqltxt:='COPY translate(tmp, E''\n,'', '''') from '''||filepath||''' with (FORMAT TEXT,     DELIMITER ''~'')';
    EXECUTE sqltxt;

this produceds the following code which when run produces the error:


COPY translate(tmp, E'\n,', '') from 'C:\ChrisDev\Readings\14.json' with (FORMAT TEXT, DELIMITER '~')

[42601] ERROR: syntax error at or near "E'\n,'"

How can I run this code to strip out all newline characters fr


Solution

  • I have found a solution for this through running a docker container

    So I had to do the following:

    1. Run up a Powershell command as Administrator.

    2. Change folder to where the JSON files are stored eg : C:\ChrisDev\Readings

    3. Run the following command to mount the folder into the Docker Container: docker run -it -v .:/files/ ubuntu

    4. Now run the command to strip out Newline characters on the file

      sed -i -z -e 's/\r\n/ /g' '/files/926371_20230120.json'

    Now here what im doing is /\r\n/ is replacing all occurances of Carriage Return (\r) and Line Feed (\n) with an empty space.

    Once this is run, then i can run my JSON file through my program with not issues.