
Detecting 'text' file type (ANSI vs UTF-8)


I wrote an application (a psychological testing exam) in Delphi 7 which creates a standard text file, i.e. the file is of type ANSI.

Someone has ported the program to run on the Internet, probably using Java, and the resulting text file is of type UTF-8.

The program which reads these results files will have to read both the files created by Delphi and the files created via the Internet.

Whilst I can convert the UTF-8 text to ANSI (using the cunningly named function UTF8ToANSI), how can I tell in advance which kind of file I have?

Seeing as I 'own' the file format, I suppose the easiest way to deal with this would be to place a marker at a known position within the file that identifies where the file came from (Delphi/Internet), but this seems to be cheating.

Thanks in advance.


Solution

  • If the UTF-8 file begins with the UTF-8 byte-order mark (BOM), this is easy:

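    function UTF8FileBOM(const FileName: string): boolean;
    var
      txt: file;
      bytes: array[0..2] of byte;
      amt: integer;
      oldMode: integer;
    begin
      oldMode := FileMode;       // FileMode is a global, so remember its value
      FileMode := fmOpenRead;    // open the file read-only
      try
        AssignFile(txt, FileName);
        Reset(txt, 1);           // untyped file with a record size of 1 byte
        try
          // amt receives the number of bytes actually read
          BlockRead(txt, bytes, 3, amt);
          // the UTF-8 BOM is the byte sequence EF BB BF
          result := (amt=3) and (bytes[0] = $EF) and (bytes[1] = $BB) and (bytes[2] = $BF);
        finally
          CloseFile(txt);
        end;
      finally
        FileMode := oldMode;     // restore the global for other code
      end;
    end;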
    

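    Otherwise, it is much more difficult: without a BOM there is no marker to look for, and you have to inspect the bytes themselves and guess whether they form valid UTF-8.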
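
    For files without a BOM, one common heuristic is to check whether the whole content is well-formed UTF-8. ANSI text containing accented characters rarely happens to be valid UTF-8, so a file that passes is probably UTF-8. The function below is only a rough sketch of that idea: the name LooksLikeUTF8 is made up for illustration, it assumes the complete file content has already been loaded into an AnsiString, and it is a simplified check that does not reject every overlong or surrogate encoding. Pure 7-bit ASCII passes too, which is harmless because such text is identical in both encodings.

```pascal
// Hypothetical helper (not part of the RTL): returns True when S consists
// only of plausibly well-formed UTF-8 byte sequences.
function LooksLikeUTF8(const S: AnsiString): Boolean;
var
  i, len, trail: Integer;
  b: Byte;
begin
  Result := False;
  len := Length(S);
  i := 1;
  while i <= len do
  begin
    b := Ord(S[i]);
    if b < $80 then
      trail := 0                               // single byte (ASCII)
    else if (b >= $C2) and (b <= $DF) then
      trail := 1                               // lead byte of a 2-byte sequence
    else if (b >= $E0) and (b <= $EF) then
      trail := 2                               // lead byte of a 3-byte sequence
    else if (b >= $F0) and (b <= $F4) then
      trail := 3                               // lead byte of a 4-byte sequence
    else
      Exit;                                    // stray continuation or invalid lead byte
    Inc(i);
    // every trailing byte must have the bit pattern 10xxxxxx
    while trail > 0 do
    begin
      if (i > len) or ((Ord(S[i]) and $C0) <> $80) then
        Exit;                                  // truncated or malformed sequence
      Inc(i);
      Dec(trail);
    end;
  end;
  Result := True;
end;
```

    A reader program could load the file into a string (for example with a TFileStream), call this function, and apply UTF8ToANSI only when it returns True. Since you own the format, writing an explicit marker or a BOM remains the more reliable option; this check is just a fallback for files that have neither.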