In continuation of my last post - "migration oracle to postgresql invalid byte sequence for encoding “UTF8”: 0x00"
I'm trying to insert into a local PostgreSQL table data from remote Oracle table (via oracle_fdw extension). My Oracle table has a column named street and it has valid string values and sometimes the next invalid (in PostgreSQL) string : ' ' (space).
When I try to copy the column value I get the error I mentioned above and in my last post. I understood that I need to change the oracle data before I insert it to PostgreSQL. I must do it on the fly so I tried to search for oracle decode func in PostgreSQL. I found two solution and I used both of them but I got same error:
1.using select with case :
mydb=>select *,(case when v.street=' ' then null END) from customer_prod v;
ERROR: invalid byte sequence for encoding "UTF8": 0x00
CONTEXT: converting column "street" for foreign table scan of
"customer_prod", row 254148
2.using decode function from orafce extension :
mydb=>select decode(street,' ',null) from customer_prod;
ERROR: invalid byte sequence for encoding "UTF8": 0x00
So, I'm still getting the error. How can I solve this issue?
The error occurs when the values are transferred from Oracle to PostgreSQL, so post-processing won't prevent the error.
For the sake of demonstration, let's create an Oracle table that exhibits the problem:
CREATE TABLE nulltest(
id number(5) CONSTRAINT nulltest_pkey PRIMARY KEY,
val varchar2(10 CHAR)
);
INSERT INTO nulltest VALUES (1, 'schön');
INSERT INTO nulltest VALUES (2, 'bö' || CHR(0) || 'se');
INSERT INTO nulltest VALUES (3, 'egal');
COMMIT;
Let's create a foreign table in PostgreSQL for it:
CREATE FOREIGN TABLE nulltest (
id integer OPTIONS (key 'true') NOT NULL,
val varchar(10)
) SERVER oracle
OPTIONS (table 'NULLTEST');
SELECT * FROM nulltest;
ERROR: invalid byte sequence for encoding "UTF8": 0x00
CONTEXT: converting column "val" for foreign table scan of "nulltest", row 2
Now the easiest thing would be to create a foreign table that filters away the zero characters:
CREATE FOREIGN TABLE filter_nulltest (
id integer OPTIONS (key 'true') NOT NULL,
val varchar(10)
) SERVER oracle
OPTIONS (table '(SELECT id, replace(val, CHR(0), NULL) FROM nulltest)');
SELECT * FROM filter_nulltest;
┌────┬───────┐
│ id │ val │
├────┼───────┤
│ 1 │ schön │
│ 2 │ böse │
│ 3 │ egal │
└────┴───────┘
(3 rows)
Another, less efficient, option would be to create a function that catches and reports bad lines to you so that you can fix them on the Oracle side:
CREATE OR REPLACE FUNCTION get_nulltest() RETURNS SETOF nulltest
LANGUAGE plpgsql AS
$$DECLARE
v_id integer;
n nulltest;
BEGIN
FOR v_id IN SELECT id FROM nulltest
LOOP
BEGIN
SELECT nulltest.* INTO n
FROM nulltest
WHERE id = v_id;
RETURN NEXT n;
EXCEPTION
WHEN OTHERS THEN
RAISE NOTICE 'Caught error % for id=%: %', SQLSTATE, v_id, SQLERRM;
END;
END LOOP;
END;$$;
SELECT * FROM get_nulltest();
NOTICE: Caught error 22021 for id=2: invalid byte sequence for encoding "UTF8": 0x00
┌────┬───────┐
│ id │ val │
├────┼───────┤
│ 1 │ schön │
│ 3 │ egal │
└────┴───────┘
(2 rows)