I have a system where I need to compare bytea
values (case insensitive in this case). For this purpose, I use encode
to convert the bytea
values into strings which I then compare. This works as long as there is no unicode character in the array.
select
encode(cid, 'hex'),
cid = '\x3030386132323035'::bytea a,
encode(cid, 'escape') = encode('\x3030386132323035'::bytea, 'escape') b,
encode(cid, 'escape') like encode('\x3030386132323035'::bytea, 'escape') c,
encode(cid, 'escape') ilike encode('\x3030386132323035'::bytea, 'escape') d
from history
encode | a | b | c | d |
---|---|---|---|---|
3030386132323035 | true | true | true | true |
When there are unicode characters in bytea
, however, like
/ilike
do not work
select
encode(cid, 'hex'),
cid = '\x303030d0a2d095d0a1d0a231'::bytea a,
encode(cid, 'escape') = encode('\x303030d0a2d095d0a1d0a231'::bytea, 'escape') b,
encode(cid, 'escape') like encode('\x303030d0a2d095d0a1d0a231'::bytea, 'escape') c,
encode(cid, 'escape') ilike encode('\x303030d0a2d095d0a1d0a231'::bytea, 'escape') d
from history
encode | a | b | c | d |
---|---|---|---|---|
303030d0a2d095d0a1d0a231 | true | true | false | false |
Can anyone explain what is causing this behavior?
That's because of the special meaning of backslashes in LIKE
, ILIKE
and
SIMILAR TO
syntax:
The default escape character is the backslash but a different one can be selected by using the
ESCAPE
clause. To match the escape character itself, write two escape characters.
Which doesn't play well with escape format:
escape
Theescape
format converts zero bytes and bytes with the high bit set into octal escape sequences (\
nnn
), and it doubles backslashes. Other byte values are represented literally.
On a simpler example:
select '\a' like '\a';--yields False
You can fix that by disabling or changing the default escape
character: demo at db<>fiddle
select cid as as_default
, encode(cid, 'escape') as as_escape
, cid = '\x303030d0a2d095d0a1d0a231'::bytea as a
, encode(cid,'escape')
= encode('\x303030d0a2d095d0a1d0a231'::bytea,'escape')
as b
, encode(cid,'escape')
like encode('\x303030d0a2d095d0a1d0a231'::bytea,'escape')
as c
, encode(cid,'escape')
ilike encode('\x303030d0a2d095d0a1d0a231'::bytea,'escape')
as d
, encode(cid,'escape')
like encode('\x303030d0a2d095d0a1d0a231'::bytea,'escape')
escape '' --escaping disabled
as e
, encode(cid,'escape')
ilike encode('\x303030d0a2d095d0a1d0a231'::bytea,'escape')
escape '!' --escape character changed from default `\` to `!`
as f
from history
where cid='\x303030d0a2d095d0a1d0a231'::bytea;
as_default | as_escape | a | b | c | d | e | f |
---|---|---|---|---|---|---|---|
\x303030d0a2d095d0a1d0a231 | 000\320\242\320\225\320\241\320\2421 | T | T | f | f | T | T |
Simpler test:
select '\a' like '\a'; --False
select '\a' like '\a' escape ''; --True
select '\a' like '\a' escape '!';--True