I was tracing some character encoding code and found this in System.SysUtils:
function TEncoding.GetBytes(const S: string): TBytes;
var
  Len: Integer;
begin
  Len := GetByteCount(S);
  SetLength(Result, Len);
  GetBytes(S, Low(S), Length(S), Result, 0, Low(S));
end;
What does Low(S) do here?
The overloaded GetBytes() that gets called here is:
function TEncoding.GetBytes(const S: string; CharIndex, CharCount: Integer;
  const Bytes: TBytes; ByteIndex: Integer; const StringBaseIndex: Integer): Integer;
This has a somewhat cryptic comment:
// StringBaseIndex : Low(string) on caller's context
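The built-in function LOW() returns the lowest index of an indexable item. For a STRING, this is 1 by default in the most recent versions, but a few versions back it was 0 by default on the mobile platforms (it is controlled by the {$ZEROBASEDSTRINGS} compiler directive). Using LOW() therefore lets you write code that compiles and behaves correctly on all platforms, regardless of target. That also appears to be the purpose of the StringBaseIndex parameter: it tells the overload which base the caller's CharIndex is counted from.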
For clarity, you should always use LOW() (and HIGH()) when iterating over the valid indices of an indexable item, even though you might know (or think you know) what the LOW and/or HIGH value is. It's just safer to let the compiler determine it for you.
I have seen constructs like this:
FOR I:=0 TO HIGH(Arr) DO ...
and my question is always why they use HIGH() but omit LOW(). It makes no sense to me.
So - my advice is to always do this:
FOR I:=LOW(Arr) TO HIGH(Arr) DO ...
regardless of what you know (or think you know; witness the recent change of LOW(STRING) from 0 to 1 on the mobile platforms).
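To illustrate why a hard-coded 0 can bite, here is a minimal sketch with a static array whose index range does not start at 0. The program name, the array bounds 5..8 and the fill values are made up for this example.

```pascal
program LowHighArrayDemo;

{$APPTYPE CONSOLE}

var
  // Hypothetical static array whose index range starts at 5, not 0.
  Arr: array[5..8] of Integer;
  I, Sum: Integer;
begin
  // Fill every valid slot; Low(Arr) = 5 and High(Arr) = 8 here.
  for I := Low(Arr) to High(Arr) do
    Arr[I] := I * 10;

  // Sum the same slots again, still without hard-coding either bound.
  Sum := 0;
  for I := Low(Arr) to High(Arr) do
    Inc(Sum, Arr[I]);

  // A loop written as "for I := 0 to High(Arr)" would touch Arr[0],
  // which is outside the declared range 5..8.
  Writeln(Low(Arr), '..', High(Arr), ' sum=', Sum);
end.
```

If the declaration is later changed to, say, array[0..3], the loops keep working unchanged, which is the whole point of letting the compiler supply the bounds.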
And when iterating over all characters in a STRING, either use:
FOR C IN Str DO ...
or:
FOR I:=LOW(Str) TO HIGH(Str) DO ...
HIGH(Str) is the same as LENGTH(Str) when strings are 1-based (the default in modern compilers), but when LOW(Str) is 0, HIGH(Str) equals LENGTH(Str)-1.
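The relationship between LOW, HIGH and LENGTH can be checked with a small sketch like the one below. It assumes a compiler that supports the {$ZEROBASEDSTRINGS} directive (Delphi XE3 or later); the program name and the test string are made up for illustration.

```pascal
program LowHighStringDemo;

{$APPTYPE CONSOLE}

// Assumption: the compiler understands this directive (Delphi XE3 or later).
// Switch it to OFF to see the 1-based behaviour instead.
{$ZEROBASEDSTRINGS ON}

var
  S: string;
  I: Integer;
begin
  S := 'abc';

  // With zero-based strings, Low(S) is 0 and High(S) is Length(S) - 1.
  // With one-based strings, Low(S) is 1 and High(S) is Length(S).
  Writeln('Low=', Low(S), ' High=', High(S), ' Length=', Length(S));

  // The loop body is identical under both settings.
  for I := Low(S) to High(S) do
    Writeln(I, ': ', S[I]);
end.
```

Only the indexing base changes with the directive; LENGTH(S) stays the same, so a loop written as 1 to LENGTH(S) would be wrong under one setting while the LOW/HIGH loop is right under both.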
Using LOW() and HIGH() also clearly illustrates your intent, namely that you are iterating over all the valid indices of the item.