delphi tstringlist

Searching a sorted TStringList for an entry with a prefix (StartsText)


I have a sorted TStringList containing unique filenames. The list can be any size, possibly hundreds of thousands of entries. I want to check whether any entry starts with a particular string (i.e. whether any of the files are in a given sub-folder). Scanning the list serially with StartsText is easy enough, but it isn't an ideal solution.

Using the TStringList.Find() code as a starting point, I've created a function that I think solves this, but I want to be sure. Don't worry about the following not being a member of the class (FList is the TStringList instance being searched); StartsFilename works the same way as StartsText:

  function ShortcutFind(const S: string): Boolean;
  var
    L, H, I, C: Integer;
  begin
    Result := False;
    L := 0;
    H := FList.Count - 1;
    while L <= H do begin
      I := (L + H) shr 1;

      if TFilenameUtils.StartsFilename(FList[I], S) then begin
        Result := True;
        Exit;
      end;

      C := FList.CompareStrings(FList[I], S);
      if C < 0 then
        L := I + 1
      else begin
        H := I - 1;
        if C = 0 then begin
          Result := True;
          if FList.Duplicates <> dupAccept then L := I;
        end;
      end;
    end;
  end;

Basically, the only real change is that it does the prefix check on each probed entry before moving on to the next entry to compare.

Note that switching from TStringList is not an option.

Would this method work?

Thanks


Solution

  • If TFilenameUtils.StartsFilename is the same as StartsText (and your first paragraph suggests it might be), then you can do the whole function in one statement by using TStringList.Find instead of copying it:

    var
      I: Integer;
    begin
      Assert(not FList.CaseSensitive);
      Result := FList.Find(S, I) or ((I < FList.Count) and StartsText(S, FList[I]));
    end;
    

    That should work because when Find fails, it still tells you the index of where the desired string would have appeared in the list. When you search for your prefix string, its location will be before any other strings that start with that prefix, so if there are any strings with that prefix, they'll appear immediately after the hypothetical location of the prefix itself.
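
    To see the insertion-index behaviour in action, here is a minimal console sketch. HasPrefix and the sample paths are hypothetical names invented for illustration, and the System.* unit names assume Delphi XE2 or later (older versions would use SysUtils, Classes and StrUtils). It also assumes the list's default case-insensitive, locale-aware sorting agrees with StartsText for your filenames.

    ```pascal
    program PrefixFindDemo;

    {$APPTYPE CONSOLE}

    uses
      System.SysUtils, System.Classes, System.StrUtils;

    // Hypothetical helper: True if any entry of a sorted, case-insensitive
    // list starts with Prefix.
    function HasPrefix(List: TStringList; const Prefix: string): Boolean;
    var
      I: Integer;
    begin
      Assert(List.Sorted and not List.CaseSensitive);
      // On a miss, Find leaves I at the index where Prefix would be inserted,
      // so any entry that starts with Prefix must sit at that index.
      Result := List.Find(Prefix, I) or
        ((I < List.Count) and StartsText(Prefix, List[I]));
    end;

    var
      Files: TStringList;
    begin
      Files := TStringList.Create;
      try
        Files.Sorted := True; // entries are kept in order as they are added
        Files.Add('C:\Data\a.txt');
        Files.Add('C:\Docs\Sub\b.txt');
        Files.Add('C:\Zed\c.txt');
        Writeln(HasPrefix(Files, 'C:\Docs\')); // a folder that has an entry
        Writeln(HasPrefix(Files, 'C:\Temp\')); // a folder with no entries
      finally
        Files.Free;
      end;
    end.
    ```

    Only the single entry at the returned index needs checking, so the cost stays at one binary search no matter how many entries share the prefix.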


    If you want to keep your current code, then you can simplify it by removing the conditional that checks C = 0. That condition should never occur unless your StartsFilename function is broken, because a string that equals S also starts with S, so the earlier check would already have exited. But if the function really is broken and C can be zero, then you can at least stop executing the loop at that point, since you've found what you were looking for. Either way, you don't need to check Duplicates, since your function doesn't have the same requirement as Find does to return the index of the found item.
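
    For reference, the simplified loop might look roughly like the sketch below. It is not your exact code: StartsText stands in for TFilenameUtils.StartsFilename, and AnsiCompareText stands in for the protected CompareStrings method, on the assumption that the list uses its default case-insensitive, locale-aware comparison. The unit name PrefixSearch and the List parameter are hypothetical, and Exit(True) needs Delphi 2009 or later.

    ```pascal
    unit PrefixSearch;

    interface

    uses
      System.Classes;

    // True if any entry of the sorted list List starts with S.
    function ShortcutFind(List: TStringList; const S: string): Boolean;

    implementation

    uses
      System.SysUtils, System.StrUtils;

    function ShortcutFind(List: TStringList; const S: string): Boolean;
    var
      L, H, I: Integer;
    begin
      Result := False;
      L := 0;
      H := List.Count - 1;
      while L <= H do
      begin
        I := (L + H) shr 1;
        // A prefix match at the probe ends the search. An exact match is
        // also a prefix match, so no separate C = 0 branch is needed.
        if StartsText(S, List[I]) then
          Exit(True);
        // No match here: halve the range the same way Find does.
        if AnsiCompareText(List[I], S) < 0 then
          L := I + 1
        else
          H := I - 1;
      end;
    end;

    end.
    ```

    The loop can stop at the first matching probe because the function only reports whether a match exists, not where the first one is.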