I am looking to create a Delphi function that ignores the first character of either string if it starts with a Chr(127). This is used in a sort comparison which is extremely slow with 30,000 items when having to copy the text from the 2nd character before the comparison. This is the original function.
function CompareText(const S1, S2: string): Integer; assembler;
asm
PUSH ESI
PUSH EDI
PUSH EBX
MOV ESI,EAX
MOV EDI,EDX
OR EAX,EAX
JE @@0
MOV EAX,[EAX-4]
@@0: OR EDX,EDX
JE @@1
MOV EDX,[EDX-4]
@@1: MOV ECX,EAX
CMP ECX,EDX
JBE @@2
MOV ECX,EDX
@@2: CMP ECX,ECX
@@3: REPE CMPSB
JE @@6
MOV BL,BYTE PTR [ESI-1]
CMP BL,'a'
JB @@4
CMP BL,'z'
JA @@4
SUB BL,20H
@@4: MOV BH,BYTE PTR [EDI-1]
CMP BH,'a'
JB @@5
CMP BH,'z'
JA @@5
SUB BH,20H
@@5: CMP BL,BH
JE @@3
MOVZX EAX,BL
MOVZX EDX,BH
@@6: SUB EAX,EDX
POP EBX
POP EDI
POP ESI
end;
ignores the first character of either string if it starts with a Chr(127).
I can certainly see how "having to copy the text from the 2nd character before the comparison" can get extremely slow.
You don't need to copy anything. In the presence of Chr(127) you just consider the second position in the string as its true start. In code it means incrementing the addresses ESI/EDI and also decrementing the lengths EAX/EDX.
For a better performance don't use BL and BH, but use full 32-bit registers, even if it means preserving an extra register like EBP.
push esi
push edi
push ebx
push ebp
mov esi, eax ; 1st string
test eax, eax ; NULL pointer ?
jz @@0
mov eax, [esi-4] ; LEN
test eax, eax ; Is length 0 ?
jz @@0
cmp byte ptr [esi], 127 ; Is first char Chr(127) ?
jne @@0
inc esi ; Skip Chr(127)
dec eax ; LEN--
@@0: mov edi, edx ; 2nd string
test edx, edx ; NULL pointer ?
jz @@1
mov edx, [edi-4] ; LEN
test edx, edx ; Is length 0 ?
jz @@1
cmp byte ptr [edi], 127 ; Is first char Chr(127) ?
jne @@1
inc edi ; Skip Chr(127)
dec edx ; LEN--
@@1: mov ecx, eax ; Get shorter length to ECX
cmp eax, edx
jbe @@2
mov ecx, edx
@@2: dec ecx
js @@5 ; `JS` bypasses the loop if ECX was initially zero
movzx ebx, byte ptr [esi]
inc esi
movzx ebp, byte ptr [edi]
inc edi
cmp ebx, ebp
je @@2
cmp ebx, 'A'
jb @@3
cmp ebx, 'Z'
ja @@3
or ebx, 32 ; LCase
@@3: cmp ebp, 'A'
jb @@4
cmp ebp, 'Z'
ja @@4
or ebp, 32 ; LCase
@@4: cmp ebx, ebp
je @@2
mov eax, ebx
mov edx, ebp
@@5: sub eax, edx
pop ebp
pop ebx
pop edi
pop esi
I'm sure the loop and especially its LCase part could be further optimized, but I think the main issue about ignoring a leading Chr(127) got resolved...
Your main loop (@@3
) essentially has two parts: one that does an 'as-is' comparison (cmpsb
) and one that does a 'case-insensitive' comparison. Thing is that the latter would make sense if it were the only part in the loop, but, combined with the first part it should only be used for current bytes that are confirmed to be an alphabetical.
I don't know whether any of your strings will include an ASCII code in the range from 91 to 96, but if that is the case then, currently it is possible for your original code to report wrongly on a comparison between any from [a,z] and any from ]Z,a[ due to the UCase happening, and similarly for my first code to report wrongly on a comparison between any from [A,Z] and any from ]Z,a[ due to the LCase happening.
This is a revised code:
...
@@2: dec ecx
js @@4 ; `JS` bypasses the loop if ECX was initially zero
movzx ebx, byte ptr [esi]
inc esi
movzx ebp, byte ptr [edi]
inc edi
; *** part 1 (compare as-is) ***
cmp ebx, ebp
je @@2
; *** part 2 (compare case-insensitively) ***
or ebx, 32 ; LCase
sub ebx, 'a'
cmp ebx, 25
ja @@3 ; It is not an alphabetical
or ebp, 32 ; LCase
sub ebp, 'a'
cmp ebp, 25
ja @@3 ; It is not an alphabetical
cmp ebx, ebp
je @@2 ; Only their case was different
mov eax, ebx
mov edx, ebp
jmp @@4
@@3: movzx eax, byte ptr [esi-1]
movzx edx, byte ptr [edi-1]
@@4: sub eax, edx
pop ebp
pop ebx
pop edi
pop esi
Incidentally, the new code should run faster now that I have applied the technique that Peter mentioned in a comment.