I have the following SQL query that (appears to have some kind of whitespace at the end)is returning data in my XML output and I am trying to remove this upfront in the SQL from being shown in the XML output:
SELECT 'H' KEY ,
REPLACE(REPLACE(LN.VENDOR_PRODUCT_NUM, CHR(8237), ''), CHR(8236), '') vendor_catalog_number
FROM MyTable
SQL results (doesn't appear to show the whitespace at the end):
KEY vendor_catalog_number
H CUR251636
XML output:
<?xml version = '1.0' encoding = 'utf-8'?>
<DATA_DS>
<G_1>
<KEY>H</KEY>
<VENDOR_CATALOG_NUMBER><0x202d>CUR251636<0x202c></VENDOR_CATALOG_NUMBER>
</G_1>
</DATA_DS>
As a result the data is being rendered in my Oracle eText report template as ?CUR251636? (with the question marks around the string.)
I also tried using just for Linefeeds and Carriage returns as below but that still did not work:
REPLACE(REPLACE(LN.VENDOR_PRODUCT_NUM, CHR(13)), CHR(10)) vendor_catalog_number
What other Oracle SQL function can I use to remove this whitespace/unicode characters?
EDIT:
SELECT DUMP(LN.VENDOR_PRODUCT_NUM, 1016) FROM MyTable
Results: Typ=1 Len=15 CharacterSet=AL32UTF8: e2,80,ad,43,55,52,32,35,31,36,33,36,e2,80,ac
SELECT DUMP(LN.VENDOR_PRODUCT_NUM, 1016) FROM MyTable
Results: E280AD435552323531363336E280AC
Those are POP DIRECTIONAL FORMATTING and LEFT-TO-RIGHT OVERRIDE. No idea why they are there; perhaps from some conversion between writing systems on insert?
Anyway... it seems you need the AL16UTF16 version of the codepoints, not the AL32UTF8 versions (not sure why, hopefully someone else can explain it...). Your code with CHR()
doesn't replace anything:
with cte as (
SELECT 'H' KEY,
REPLACE(REPLACE(LN.VENDOR_PRODUCT_NUM, CHR(8237), ''), CHR(8236), '') vendor_catalog_number
FROM MyTable LN
)
select KEY, vendor_catalog_number,
DUMP(vendor_catalog_number, 1016) as dmp,
RAWTOHEX(UTL_RAW.CAST_TO_RAW(vendor_catalog_number)) as hex
from cte
KEY | VENDOR_CATALOG_NUMBER | DMP | HEX |
---|---|---|---|
H | CUR251636 | Typ=1 Len=15 CharacterSet=AL32UTF8: e2,80,ad,43,55,52,32,35,31,36,33,36,e2,80,ac | E280AD435552323531363336E280AC |
If you use the UNISTR()
function (or add the USING NCHAR_CS
option to CHR()
then it does replace those:
with cte as (
SELECT 'H' KEY,
REPLACE(REPLACE(LN.VENDOR_PRODUCT_NUM, UNISTR('\202d'), null), UNISTR('\202c'), null) vendor_catalog_number
FROM MyTable LN
)
select KEY, vendor_catalog_number,
DUMP(vendor_catalog_number, 1016) as dmp,
RAWTOHEX(UTL_RAW.CAST_TO_RAW(vendor_catalog_number)) as hex
from cte
KEY | VENDOR_CATALOG_NUMBER | DMP | HEX |
---|---|---|---|
H | CUR251636 | Typ=1 Len=9 CharacterSet=AL32UTF8: 43,55,52,32,35,31,36,33,36 | 435552323531363336 |
... producing just your desired 9-character string 'CUR251636'
.
So instead of your current query:
SELECT 'H' KEY ,
REPLACE(REPLACE(LN.VENDOR_PRODUCT_NUM, CHR(8237), ''), CHR(8236), '') vendor_catalog_number
FROM MyTable LN
you can do:
SELECT 'H' KEY,
REPLACE(REPLACE(LN.VENDOR_PRODUCT_NUM, UNISTR('\202d'), null), UNISTR('\202c'), null) vendor_catalog_number
FROM MyTable LN