Summary: I am trying to work with UTF8 text in a TDBMemo to store, retrieve and display Greek letters from/to a BLOB subtype 0 field in an Interbase table. The Greek letters display correctly in the DBMemo but when posted to the database they are garbled, and display garbled when retrieved.
Delphi 10.4, Interbase 2020, Windows 10
I have a TDBMemo connected to a BLOB field in an Interbase table using FireDAC:
TFDConnection --> TFDQuery --> TDataSource --> TDBMemo
This is in a simple test program, with no event handlers attached to any component in the chain. The database table field is BLOB subtype 0 (i.e. binary). The data in the field displays correctly as text in the DBMemo if there are no UTF8 (Greek, Cyrillic, etc. - non-Latin) characters in it, and updates and retrieves the text correctly, and works correctly with any Latin alphabet with non-ASCII characters, like French, but all non-Latin UTF8 characters are garbled in the DBMemo and in the table field after posting changes. If I paste Greek letters into the DBMemo they display correctly (so the DBMemo is handling UTF8 correctly) but are garbled after a post to the table. The table field is a binary BLOB, which should store whatever is posted to it without changing the data. That means the text is not being kept as UTF8 somewhere along the chain from the DBMemo to the FDQuery - the text is being converted to an ANSI string.
I have tried (using the ANSI to UTF8 functions) to stuff the DBMemo.Text into the table field directly - didn't work. I have tried streaming the data from DBMemo.Text into the BLOB field (TStringStream --> TBLOBStream) - didn't work. I have tried using a BLOB subtype 1 (text) field with explicit charset UTF8 instead of the binary BLOB - same result. Examining the data in the Delphi debugger immediately before the post shows the DBMemo.Text is correct. Displaying the text BLOB UTF8 field with isql shows it has been garbled in the table.
There may be a problem with the BLOB text field in the table: if I try to update the field directly in IBConsole isql with
update biblio set greekfield = ἡλιοδρόμος' where id=10793
or
update biblio set greekfield = cast('ἡλιοδρόμος' as varchar(2000) character set UTF8) where id=107937
it fails with "Cannot transliterate character between character sets". But maybe that's just IBConsole being silly.
Since Delphi 10 is fully Unicode compliant - and so is Interbase 2020 - I expected Delphi to be able to handle the Greek UTF8 characters, at the very least when using an explicitly UTF text BLOB table field in Interbase. Obviously I am doing something wrong, but after several hours of trying everything I can think of I still can't get it to work.
Help!
David
David,
Have you set the CharacterSet property to UTF8 in your FDConnection? https://docwiki.embarcadero.com/RADStudio/Sydney/en/Connect_to_InterBase_(FireDAC)
If you do the above and have the greekfield column defined as
BLOB SUB_TYPE 1 CHARACTER SET UTF8
it should be able to do what you want.
By way of example, here is a sample with the command-line isql tool in InterBase. If you run the sample without '-names utf8', you will get "transliteration error". But, if you run it with the '-names utf8' character set name, the character set is set at the connection time, resulting in proper conversion of text input/output to the database from the client application.
connect 'blob_utf8.ib';
drop database;
create database 'blob_utf8.ib';
create table t1 (greekfield blob sub_type 1 character set UTF8);
commit;
set echo on;
show table t1;
/* Add some Greek data */
insert into t1 values ('ἡλιοδρόμος');
commit;
/* retrieve all and display */
select * from t1;
/* retrieve with greek text cast to UTF8 in the filter condition */
select * from t1
where cast (greekfield as varchar(200) character set UTF8) IN ('ἡλιοδρόμος');
quit;