Given a TJsonString object, how can I get a string representation of the JSON?
Given the following example (valid) JSON:
{
"comment":"The quick bröwn fox\r\n\tjumped \"over\" the \\lazy\/ dog\r\nthen 💩 on a log."
}
In readable text, the comment would look like:
The quick bröwn fox
jumped "over" the \lazy/ dog
then 💩 on a log.
We parse this JSON into an object:
o: TJsonObject;
o := TJSONObject.ParseJSONValue(TEncoding.UTF8.GetBytes(JsonStr), 0, True) as TJsonObject;
We have a JSON object in memory. Now our goal is to convert it back into JSON.
How do I have Delphi XE6 return me the corresponding valid JSON string?
TJsonObject.ToString()
The call:
o.ToString
returns the (invalid) JSON:
{"comment":"The quick bröwn fox↵
jumped \"over\" the \lazy/ dog↵
then 💩 on a log."}
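The JSON is invalid because ToString() does not convert:
CR into \r
LF into \n
\ into \\
/ into \/ (strictly optional, since JSON allows an unescaped /, but it no longer matches the input)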
This makes sense because ToString() is meant to return human-readable text, not valid JSON.
TJsonObject.ToJson()
While ToString() is meant to return human-readable text, ToJSON() is meant to return valid JSON.
The only problem is that it doesn't exist in Delphi XE6.
Moving on!
TJsonObject.ToBytes()
Populate a byte array using the ToBytes() method:
// Allocate a buffer to hold the JSON
buffer: TBytes;
SetLength(buffer, o.EstimatedByteSize);
// Populate the buffer, and size it to its actual length
n: Integer;
n := o.ToBytes(buffer, 0); // fill the byte array buffer
SetLength(buffer, n); // size buffer to final size
// Copy byte array to RawByteString so we can see it
s: RawByteString;
SetLength(s, n); // size the raw byte string
Move(buffer[0], s[1], n); // fill the raw byte string
returns the following JSON:
{"comment":"The quick br\u00F6wn fox\r\n\tjumped \"over\" the \\lazy\/ dog\r\nthen \uD83D\uDCA9 on a log."}
While that is technically valid JSON, it needlessly escapes every code point above #127. That is not what I'm asking for, since strings in JSON are allowed to contain "Unicode characters".
And to fix it, all I'd have to do is write a JSON parser, and then code to convert JSON to a string. Which is the question I'm asking.
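For reference, the two \u escapes that ToBytes() emits for 💩 are the UTF-16 surrogate pair of the single code point U+1F4A9. A minimal sketch of that arithmetic follows; ToSurrogates is a hypothetical helper written for illustration, not part of System.JSON.

```pascal
program SurrogateDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: splits a code point above U+FFFF into its
// UTF-16 high and low surrogate code units.
procedure ToSurrogates(CodePoint: Cardinal; out HiSurr, LoSurr: Word);
begin
  Dec(CodePoint, $10000);                     // leaves a 20-bit value
  HiSurr := $D800 or (CodePoint shr 10);      // top 10 bits
  LoSurr := $DC00 or (CodePoint and $3FF);    // bottom 10 bits
end;

var
  HiSurr, LoSurr: Word;
begin
  ToSurrogates($1F4A9, HiSurr, LoSurr);
  // prints \uD83D\uDCA9
  Writeln('\u' + IntToHex(HiSurr, 4) + '\u' + IntToHex(LoSurr, 4));
end.
```

So the ToBytes() output is correct JSON for that character; it is just the escaped form of something that could have been written literally.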
ToString()
Knowing that TJsonString.ToString() will return a string that is almost valid JSON:
"The quick bröwn fox#$D#$A
#9'jumped \"over\" the lazy dog#$D#$A
then 💩 on a log."
I can see that it's pretty close to being valid JSON. We just have to apply some JSON escaping rules:
CR -> \r
LF -> \n
TAB -> \t
\ -> leave as-is
" -> leave as-is
/ -> leave as-is
But I don't know what happens with other control characters under #32.
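One way to find out would be to probe it empirically. The sketch below assumes the XE6 System.JSON unit; it wraps each character from #0 to #31 in a TJSONString and dumps the UTF-16 code units that ToString() returns. No expected output is listed, because it may differ between Delphi versions.

```pascal
program ProbeControlChars;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.JSON;

var
  code, j: Integer;
  js: TJSONString;
  s, dump: string;
begin
  for code := 0 to 31 do
  begin
    // Build a one-character JSON string holding the control character
    js := TJSONString.Create(string(Char(code)));
    try
      s := js.ToString;
    finally
      js.Free;
    end;

    // Print the result as hex code units so control characters stay visible
    dump := '';
    for j := 1 to Length(s) do
      dump := dump + ' ' + IntToHex(Ord(s[j]), 4);
    Writeln('#', code, ' ->', dump);
  end;
end.
```

Any line whose dump still contains a raw code unit below 0020 marks a character that ToString() leaves unescaped, and which would therefore need a \uXXXX (or short) escape to be valid JSON.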
TJsonString's protected TStringBuilder
The object maintains a protected buffer of the actual characters in the JSON:
TJSONString = class(TJSONValue)
protected
FStrBuffer: TStringBuilder;
And these characters are the actual characters they should be in-memory:
'T', 'h', 'e', ' ', 'q', 'u', 'i', 'c', 'k', ' ', 'b', 'r', 'ö', 'w', 'n', ' ', 'f', 'o', 'x', #$D, #$A,
#9, 'j', 'u', 'm', 'p', 'e', 'd', ' ', '"', 'o', 'v', 'e', 'r', '"', ' ', 't', 'h', 'e', ' ', '\', 'l', 'a', 'z', 'y', '/', ' ', 'd', 'o', 'g', #$D, #$A,
't', 'h', 'e', 'n', ' ', #$D83D, #$DCA9, ' ', 'o', 'n', ' ', 'a', ' ', 'l', 'o', 'g', '.', #0
So, if I can reach inside the TJsonString, I can suck out the correct characters:
type
TJsonStringFriend = class(TJsonString);
var
s: UnicodeString;
s := TJsonStringFriend(AJsonString).FStrBuffer.ToString;
Which returns exactly what a UnicodeString of the string should contain:
The quick bröwn fox'#$D#$A#9'jumped "over" the \lazy/ dog'#$D#$A'then 💩 on a log.
Now, all I have to do is escape the string back into JSON:
var
i: Integer;
ch: WideChar;
for i := 1 to Length(s) do
begin
ch := s[i];
case ch of
'"': Result := Result + '\"';
'\': Result := Result + '\\';
'/': Result := Result + '\/';
#$8: Result := Result + '\b';
#$c: Result := Result + '\f';
#$a: Result := Result + '\n';
#$d: Result := Result + '\r';
#$9: Result := Result + '\t';
else
if (ch < WideChar(32)) then
Result := Result + '\u'+IntToHex(Ord(ch), 4)
else
Result := Result + ch;
end;
end;
But, in order to use that, we have to do everything:
function JsonValueToJSON(JsonValue: TJSONValue; Indentation: string=''): UnicodeString;
var
jsonArray: TJSONArray;
jsonObject: TJSONObject;
pair: TJSONPair;
i: Integer;
s: UnicodeString;
ch: WideChar;
begin
if JsonValue is TJSONArray then
begin
jsonArray := JsonValue as TJSONArray;
Result := '[' + sLineBreak;
for i := 0 to jsonArray.Count-1 do
begin
Result := Result + Indentation + ' ' + JsonValueToJSON(jsonArray.Items[i], Indentation + ' ');
if i < jsonArray.Count-1 then
Result := Result + ',';
Result := Result + sLineBreak;
end;
Result := Result + Indentation + ']';
end
else if JsonValue is TJSONObject then
begin
jsonObject := JsonValue as TJSONObject;
Result := '{' + sLineBreak;
for i := 0 to jsonObject.Count-1 do
begin
pair := jsonObject.Pairs[i];
Result := Result + Indentation+' "'+pair.JsonString.Value+'": '+JsonValueToJSON(pair.JsonValue, Indentation+ ' ');
if i < jsonObject.Count-1 then
Result := Result + ',';
Result := Result + sLineBreak;
end;
Result := Result + Indentation + '}';
end
else if JsonValue is TJsonString then
begin
// Delphi doesn't know how to emit valid JSON; we'll do it ourselves
Result := '"';
//s := (JsonValue as TJsonString).ToString;
s := TJsonStringFriend(JsonValue as TJsonString).FStrBuffer.ToString;
for i := 1 to Length(s) do
begin
ch := s[i];
case ch of
'"': Result := Result + '\"';
'\': Result := Result + '\\';
'/': Result := Result + '\/';
#$8: Result := Result + '\b';
#$c: Result := Result + '\f';
#$a: Result := Result + '\n';
#$d: Result := Result + '\r';
#$9: Result := Result + '\t';
else
if (ch < WideChar(32)) then
Result := Result + '\u'+IntToHex(Ord(ch), 4)
else
Result := Result + ch;
end;
end;
Result := Result+'"';
end
else
begin
// JsonValue is TJSONNumber, TJSONTrue, TJSONFalse, TJSONNull
// I trust those know how to serialize themselves correctly into JSON
Result := JsonValue.ToString;
end;
end;
Except, now I'm in a situation where I hope I got it all correct.
Surely this can't be what's intended? A 6-hour rabbit hole into the internal minutiae of System.JSON, and having to reinvent the wheel.
How to convert TJsonValue into a JSON string?
The solution is to roll your own.
class function TJsonHelper.ToJSON(const AJsonValue: TJsonValue): UnicodeString;
begin
{
We have to do this most obvious thing ourselves, since Delphi gets it wrong.
//WRONG: returns a human-readable string, but invalid JSON (e.g. does not convert CRLF into \r\n)
Result := o.ToString;
//INVALID: not defined in XE6
Result := o.ToJSON;
//WRONG: escapes every character above #127 as \uXXXX (e.g. ö becomes \u00F6)
SetLength(buffer, o.EstimatedByteSize);
n := o.ToBytes(buffer, 0);
SetLength(buffer, n);
}
if AJsonValue = nil then
begin
Result := '';
Exit;
end;
Result := PrettifyJsonValue(AJsonValue, '');
end;
And then the actual guts are private in a PrettifyJsonValue() helper function:
type
// Crack open the JsonString, and feast on the tasty string builder inside.
TJsonStringFriend = class(TJsonString)
end;
function PrettifyJsonValue(JsonValue: TJSONValue; Indentation: string=''): UnicodeString;
var
jsonArray: TJSONArray;
jsonObject: TJSONObject;
pair: TJSONPair;
i: Integer;
s: UnicodeString;
ch: WideChar;
sKey, sValue: UnicodeString;
begin
TConstraints.NotNull(JsonValue);
if JsonValue is TJSONArray then
begin
jsonArray := JsonValue as TJSONArray;
// Workaround a Delphi System.JSON bug where it cannot parse an empty array such as:
// "cities": [ ]
if jsonArray.Count <= 0 then
begin
Result := '[]';
Exit;
end;
Result := '['+ sLineBreak;
for i := 0 to jsonArray.Count-1 do
begin
Result := Result + Indentation + ' ' + PrettifyJsonValue(jsonArray.Items[i], Indentation + ' ');
if i < jsonArray.Count-1 then
Result := Result + ',';
Result := Result + sLineBreak;
end;
Result := Result + Indentation + ']';
end
else if JsonValue is TJSONObject then
begin
jsonObject := JsonValue as TJSONObject;
Result := '{' + sLineBreak;
for i := 0 to jsonObject.Count-1 do
begin
pair := jsonObject.Pairs[i];
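// NOTE: the key is emitted as-is; a key containing ", \ or a control character would need the same escaping as a string value
sKey := pair.JsonString.Value;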
sValue := PrettifyJsonValue(pair.JsonValue, Indentation+' ');
Result := Result + Indentation+' "'+sKey+'": '+sValue;
if i < jsonObject.Count-1 then
Result := Result + ',';
Result := Result + sLineBreak;
end;
Result := Result + Indentation + '}';
end
else if JsonValue.ClassType = TJsonString then //TJsonNumber descends from TJsonString
begin
// Delphi doesn't know how to emit valid JSON; we'll do it ourselves
//SURPRISE: TJsonNumber descends from TJsonString, so an "is TJsonString" test would also match numbers. No, I'm not joking. Not even a little.
Result := '"';
// s := (JsonValue as TJsonString).ToString;
s := TJsonStringFriend(JsonValue as TJsonString).FStrBuffer.ToString;
for i := 1 to Length(s) do
begin
ch := s[i];
case ch of
'"': Result := Result + '\"';
'\': Result := Result + '\\';
'/': Result := Result + '\/';
#$8: Result := Result + '\b';
#$c: Result := Result + '\f';
#$a: Result := Result + '\n';
#$d: Result := Result + '\r';
#$9: Result := Result + '\t';
else
if (ch < WideChar(32)) then
Result := Result + '\u'+IntToHex(Ord(ch), 4)
else
Result := Result + ch;
end;
end;
Result := Result+'"';
end
else
begin
// JsonValue is TJSONNumber, TJSONTrue, TJSONFalse, TJSONNull
// I trust those know how to serialize themselves correctly into JSON
Result := JsonValue.ToString;
end;
end;
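Since the escaping loop is the part most likely to be reused (for example on object keys, which the code above emits unescaped), it could be pulled out into a standalone function. The following is a minimal sketch, not the code from this answer: JsonEscape is a hypothetical name, it builds the result with a TStringBuilder instead of repeated concatenation, and it leaves / unescaped, which JSON permits.

```pascal
program EscapeDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper (not part of System.JSON): returns S as a quoted
// JSON string literal, escaping quotes, backslashes and control characters.
function JsonEscape(const S: string): string;
var
  sb: TStringBuilder;
  ch: Char;
begin
  sb := TStringBuilder.Create(Length(S) + 2);
  try
    sb.Append('"');
    for ch in S do
      case ch of
        '"': sb.Append('\"');
        '\': sb.Append('\\');
        #$08: sb.Append('\b');
        #$0C: sb.Append('\f');
        #$0A: sb.Append('\n');
        #$0D: sb.Append('\r');
        #$09: sb.Append('\t');
      else
        if ch < #32 then
          // remaining control characters have no short form
          sb.Append('\u').Append(IntToHex(Ord(ch), 4))
        else
          // everything else, including surrogate halves, is copied through
          sb.Append(ch);
      end;
    sb.Append('"');
    Result := sb.ToString;
  finally
    sb.Free;
  end;
end;

begin
  // prints "say \"hi\"\r\n\tC:\\temp\u0001"
  Writeln(JsonEscape('say "hi"'#13#10#9'C:\temp'#1));
end.
```

With a helper like this, the key line in the object branch could call it on pair.JsonString.Value, and the string branch could call it on the buffer contents, so the two code paths cannot drift apart.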
Delphi also has a well-known bug in its parser, where it is unable to parse valid JSON.
You need to run your JSON string through a regex search-replace: