I need to send a UTF-8 encoded response that includes special characters (like ő and á) from an Indy 10 HTTP server. The original program was written with Indy 9 and had no problem, but according to Remy Lebeau:
On pre-2009 versions of Delphi, Indy 10 will internally perform a conversion from AnsiString to UTF-16 to Bytes if the specified Ansi and Byte encodings are different. During that conversion, if the Byte encoding is Indy8BitEncoding, UTF-16 codeunits above U+00FF will be converted to '?' characters. In order to send an AnsiString as-is, you have to set the Ansi and Byte encodings to the same TIdTextEncoding object.
But I can't find a way to do this properly. The IOHandler of the HTTP server has no DefStringEncoding property, so I've tried the following conversions with no luck:
AResponseInfo.ContentEncoding:='utf8';
AResponseInfo.ContentType:='text/html';
ss:=TStringStream.Create('ő');
ss.WriteString(' '+AnsiString('ő'));
ss.WriteString(' '+WideString('ő'));
ss.WriteString(' '+AnsiToUtf8('ő'));
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_8Bit, IndyTextEncoding_8Bit)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_Default, IndyTextEncoding_8Bit)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_ASCII, IndyTextEncoding_8Bit)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_UTF16BE, IndyTextEncoding_8Bit)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_UTF16LE, IndyTextEncoding_8Bit)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_UTF7, IndyTextEncoding_8Bit)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_UTF8, IndyTextEncoding_8Bit)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_8Bit, IndyTextEncoding_UTF8)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_Default, IndyTextEncoding_UTF8)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_ASCII, IndyTextEncoding_UTF8)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_UTF16BE, IndyTextEncoding_UTF8)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_UTF16LE, IndyTextEncoding_UTF8)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_UTF7, IndyTextEncoding_UTF8)+' ';
ss.Seek(0, 0);
AResponseInfo.ContentText:=AResponseInfo.ContentText+ReadStringFromStream(ss, -1, IndyTextEncoding_UTF8, IndyTextEncoding_UTF8)+' ';
With this I got the following response:
? ? ?? ? ? ?? ??? ??? o o L' ? ? ? Aµ Aµ A.Â' dz? dz? dz?dz? dz? dz? dz?dz? ⃵⃵âƒ. d" d" e" Aµ Aµ A.Â' o o L'
The closest one is o, but it is missing the accent. The L' seems promising too, since ő shows up as Ĺ‘ when its UTF-8 bytes are displayed as ANSI, but it's not exactly the same either.
How can I solve this?
Update
If I set AResponseInfo.CharSet to UTF-8 and then set ContentText to the desired string in ANSI (without converting it to anything), it works.
But now I'm facing another problem: when my ContentText is already in UTF-8, Indy 10 tries to convert it to UTF-8 again. I can't set the DefStringEncoding because it's not available here, so I can't make Indy 10 skip the conversion. The only workaround is to convert the UTF-8 string back to ANSI and then let Indy convert it to UTF-8 again.
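A minimal sketch of that round-trip workaround, assuming a pre-2009 Delphi where string is AnsiString. Utf8Body is a hypothetical variable holding text that is already UTF-8 encoded, and using Utf8ToAnsi for the back-conversion is an assumption, not necessarily the call used in the original program. Any character that does not exist in the system ANSI code page is lost in the round trip.

```pascal
// Fragment for the body of the server's OnCommandGet handler.
// Utf8Body: UTF8String is hypothetical and already contains UTF-8 bytes.
AResponseInfo.ContentType := 'text/html';
AResponseInfo.CharSet := 'utf-8';
// Utf8ToAnsi (System unit) decodes UTF-8 into the system ANSI code page;
// Indy then encodes ContentText as UTF-8 again when it writes the response.
AResponseInfo.ContentText := Utf8ToAnsi(Utf8Body);
```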
The IOHandler of the HTTP server has no DefStringEncoding property ... I can't set the DefStringEncoding because it's not available here
Yes, it is available. It is a property of the connection's IOHandler, not of the server's IOHandler.
Also, you are looking for DefAnsiEncoding rather than DefStringEncoding. DefAnsiEncoding represents the AnsiString encoding in memory, whereas DefStringEncoding represents the on-the-wire encoding over the socket (which is handled by AResponseInfo.CharSet).
Try this:
AResponseInfo.ContentType := 'text/html';
AResponseInfo.CharSet := 'utf-8';
AResponseInfo.ContentText := UTF8Encode('ő');
//handled by TIdHTTPResponseInfo.WriteContent()
//AContext.Connection.IOHandler.DefStringEncoding := IndyTextEncoding_UTF8;
AContext.Connection.IOHandler.DefAnsiEncoding := IndyTextEncoding_UTF8;
That being said, a simpler solution would be to put your UTF-8 content into a TStream and then use AResponseInfo.ContentStream instead of AResponseInfo.ContentText. The TStream bytes will be transmitted as-is.
AResponseInfo.ContentType := 'text/html';
AResponseInfo.Charset := 'utf-8';
AResponseInfo.ContentStream := TStringStream.Create(UTF8Encode('ő'));
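For context, here is a sketch of how the stream approach might look inside a complete OnCommandGet handler. It assumes a pre-2009 Delphi (string is AnsiString) and a form named TForm1 with a TIdHTTPServer named IdHTTPServer1; both names are hypothetical. The character is built from its code point (U+0151 is ő) so that the result does not depend on the code page of the source file.

```pascal
// Needs Classes, IdContext and IdCustomHTTPServer in the uses clause.
// TForm1 and IdHTTPServer1 are hypothetical names for this sketch.
procedure TForm1.IdHTTPServer1CommandGet(AContext: TIdContext;
  ARequestInfo: TIdHTTPRequestInfo; AResponseInfo: TIdHTTPResponseInfo);
var
  Body: UTF8String;
begin
  // Build the text as UTF-16 first, then encode it to UTF-8 once.
  Body := UTF8Encode(WideString('<p>') + WideChar($0151) + WideString('</p>'));

  AResponseInfo.ContentType := 'text/html';
  AResponseInfo.CharSet := 'utf-8';

  // The stream's bytes are sent as-is, so no further conversion is applied.
  // In pre-2009 Delphi the UTF8String is stored in the stream byte for byte.
  AResponseInfo.ContentStream := TStringStream.Create(Body);
end;
```

As far as I know, the response object frees ContentStream itself by default (its FreeContentStream property), so the handler should not free the stream.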